
Split output of command by columns using Bash?

I want to do this:

  1. run a command
  2. capture the output
  3. select a line
  4. select a column of that line

Just as an example, let's say I want to get the command name from a $PID (please note this is just an example, I'm not suggesting this is the easiest way to get a command name from a process id - my real problem is with another command whose output format I can't control).


If I run ps I get:


  PID TTY          TIME CMD
11383 pts/1    00:00:00 bash
11771 pts/1    00:00:00 ps

Now I do ps | egrep 11383 and get

11383 pts/1    00:00:00 bash

Next step: ps | egrep 11383 | cut -d" " -f 4. Output is:

<absolutely nothing/>

The problem is that cut splits the output on single spaces, and since ps adds some spaces between the 2nd and 3rd columns to keep some semblance of a table, cut picks an empty string. Of course, I could use cut to select the 7th and not the 4th field, but how can I know which one, especially when the output is variable and unknown beforehand?


One easy way is to add a pass of tr to squeeze any repeated field separators out:

$ ps | egrep 11383 | tr -s ' ' | cut -d ' ' -f 4


I think the simplest way is to use awk. Example:

$ echo "11383 pts/1    00:00:00 bash" | awk '{ print $4; }'
bash


Please note that the tr -s ' ' option will not remove any single leading spaces. If your column is right-aligned (as with ps pid)...

$ ps h -o pid,user -C ssh,sshd | tr -s " "
 1543 root
19645 root
19731 root

Then cutting will result in a blank line for some of those fields if it is the first column:

$ <previous command> | cut -d ' ' -f1

19645
19731

Unless you prefix every line with a space first, so that the first column always lands in field 2:

$ <command> | sed -e "s/.*/ &/" | tr -s " "
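
An alternative to padding every line is to strip the leading spaces instead, so the first column stays in field 1. The following is only a sketch: the helper name squeeze_cut is made up for illustration, and the fixed sample text stands in for the ps output above.

```shell
# Sample stands in for the right-aligned ps output shown above.
sample=' 1543 root
19645 root
19731 root'

# Hypothetical helper: drop leading spaces, squeeze repeated spaces,
# then print the field whose number is given as $1.
squeeze_cut() {
    sed -e 's/^ *//' | tr -s ' ' | cut -d ' ' -f "$1"
}

printf '%s\n' "$sample" | squeeze_cut 1
```

With the leading blank removed, all three pids come out of field 1, including 1543.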

Now, for this particular case of pid numbers (not names), there is a command called pgrep:

$ pgrep ssh


Shell functions

However, in general it is actually still possible to use shell functions in a concise manner, because there is a neat thing about the read command:

$ <command> | while read a b; do echo $a; done

The first parameter to read, a, selects the first column, and if there is more, everything else will be put in b. As a result, you never need more variables than the number of your column +1.

So,

while read a b c d; do echo $c; done

will then output the 3rd column. As indicated in my comment...

A piped read will be executed in an environment that does not pass variables to the calling script.

out=$(ps whatever | { read a b c d; echo $c; })

arr=($(ps whatever | { read a b c d; echo $c $b; }))
echo ${arr[1]}     # will output the value of b
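
Another way around the subshell limitation is to avoid the pipe and feed read from a here-document, so that read runs in the current shell and its variables survive. This is a sketch only: printf stands in for the real command, and only the first line of its output is read.

```shell
# printf stands in for the command whose output is being parsed.
# read runs in the current shell, so a, b, c and rest stay set afterwards.
read -r a b c rest <<EOF
$(printf '%s\n' '11383 pts/1    00:00:00 bash')
EOF

echo "$c"      # third column
echo "$rest"   # everything after the third column
```

This form does not need process substitution, so it should also work in shells such as dash.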


The Array Solution

So we then end up with the answer by @frayser, which is to use the shell variable IFS (by default space, tab and newline) to split the string into an array. It only works in a shell with arrays, such as Bash, though; Dash and Ash do not support them. I have had a really hard time splitting a string into components on a Busybox system. It is easy enough to get a single component (e.g. using awk) and then to repeat that for every parameter you need, but then you end up repeatedly calling awk on the same line, or repeatedly using a read block with echo on the same line, which is neither efficient nor pretty. So you end up splitting using ${name%% *} and so on. It makes you yearn for some Python skills, because shell scripting is not much fun anymore when half or more of the features you are accustomed to are gone. But you can assume that even Python would not be installed on such a system, and it wasn't ;-).
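
On a shell without arrays, the positional parameters can serve as the one array that is always available. The sketch below splits a line once and then uses several fields from it; the function name show_pid_and_cmd and the sample line are assumptions made for illustration, not part of the answer above.

```shell
# Hypothetical helper: split a line on whitespace once, then use several fields.
# Inside a function, "set --" only replaces the function's own positional
# parameters, so the caller's arguments are left alone.
show_pid_and_cmd() {
    set -f            # disable globbing so a field like * is not expanded
    set -- $1         # unquoted on purpose: word-split on IFS
    set +f            # note: re-enables globbing unconditionally
    echo "pid=$1 cmd=$4"
}

show_pid_and_cmd '11383 pts/1    00:00:00 bash'
```

This avoids calling awk once per field, at the cost of relying on unquoted expansion, which is why globbing is switched off around the split.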


try

# In Bash a plain pipe is enough ("ps |&" with "read -p" is ksh co-process syntax).
ps |
while read -r first second third fourth etc ; do
   if [[ $first == '11383' ]]
   then
       echo "got: $fourth"
   fi
done


Your command

ps | egrep 11383 | cut -d" " -f 4

misses a tr -s to squeeze spaces, as unwind explains in his answer.

However, you may want to use awk, since it handles all of these actions in a single command:

ps | awk '/11383/ {print $4}'

This prints the 4th column of the lines containing 11383. If you want to match 11383 only when it appears at the beginning of the line, you can say ps | awk '/^11383/ {print $4}'.
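
A regular expression such as /11383/ or /^11383/ also matches a pid like 113830, and the anchored form misses lines where the pid is right-aligned behind leading spaces. Comparing the first field directly avoids both problems. A minimal sketch, where the helper name field_for_key and the sample text are assumptions for illustration:

```shell
# Hypothetical helper: print column $2 of the first line whose first
# field is exactly $1. Reads the command output from stdin.
field_for_key() {
    awk -v key="$1" -v col="$2" '$1 == key { print $col; exit }'
}

# Sample text stands in for real ps output; note the look-alike pid 113830.
sample='  PID TTY          TIME CMD
113830 pts/1    00:00:00 sleep
11383 pts/1    00:00:00 bash
11771 pts/1    00:00:00 ps'

printf '%s\n' "$sample" | field_for_key 11383 4
```

Here the line for pid 113830 is skipped and only bash is printed.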


Using array variables

set $(ps | egrep "^11383 "); echo $4

or

A=( $(ps | egrep "^11383 ") ) ; echo ${A[3]}


Similar to brianegge's awk solution, here is the Perl equivalent:

ps | egrep 11383 | perl -lane 'print $F[3]'

-a enables autosplit mode, which populates the @F array with the column data.
Use -F, (the -F option followed by a comma) if your data is comma-delimited rather than space-delimited.

Field 3 is printed since Perl starts counting from 0 rather than 1.


Getting the correct line (example for line no. 6) is done with head and tail and the correct word (word no. 4) can be captured with awk:

command|head -n 6|tail -n 1|awk '{print $4}'
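
The line selection can also be folded into awk itself, using its built-in record counter NR. A sketch under the same assumptions as above; the helper name nth_field is hypothetical and printf stands in for the real command:

```shell
# Hypothetical helper: print column $2 of line number $1 from stdin.
nth_field() {
    awk -v line="$1" -v col="$2" 'NR == line { print $col; exit }'
}

# Two sample lines of four words each; ask for line 2, word 4.
printf 'a b c d\ne f g h\n' | nth_field 2 4
```

This prints h, and awk stops reading as soon as the requested line has been handled.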


Instead of doing all these greps and stuff, I'd advise you to use ps's ability to change its output format.

ps -o cmd= -p 12345

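You get the command line of the process with the specified pid and nothing else.

The -o and -p options are specified by POSIX and may thus be considered portable. Note, however, that the cmd keyword is a procps (Linux) alias; the keywords POSIX defines are args (full command line) and comm (command name), so ps -o comm= -p 12345 is the more portable spelling.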


Bash's set will parse all output into positional parameters.

For instance, after running set $(free -h), echo $7 will show "Mem:".
