Split output of command by columns using Bash?
I want to do this:
- run a command
- capture the output
- select a line
- select a column of that line
Just as an example, let's say I want to get the command name from a $PID
(please note this is just an example, I'm not suggesting this is the easiest way to get a command name from a process id - my real problem is with another command whose output format I can't control).
If I run ps
I get:
PID TTY TIME CMD
11383 pts/1 00:00:00 bash
11771 pts/1 00:00:00 ps
Now I do ps | egrep 11383
and get
11383 pts/1 00:00:00 bash
Next step: ps | egrep 11383 | cut -d" " -f 4
. Output is:
<absolutely nothing/>
The problem is that cut
cuts the output by single spaces, and as ps
adds some spaces between the 2nd and 3rd columns to keep some resemblance of a table, cut
picks an empty string. Of course, I could use cut
to select the 7th and not the 4th field, but how can I know, specially when the output is variable and unknown on beforehand.
One easy way is to add a pass of tr
to squeeze any repeated field separators out:
$ ps | egrep 11383 | tr -s ' ' | cut -d ' ' -f 4
I think the simplest way is to use awk. Example:
$ echo "11383 pts/1 00:00:00 bash" | awk '{ print $4; }'
bash
Please note that the tr -s ' '
option will not remove any single leading spaces. If your column is right-aligned (as with ps
pid)...
$ ps h -o pid,user -C ssh,sshd | tr -s " "
1543 root
19645 root
19731 root
Then cutting will result in a blank line for some of those fields if it is the first column:
$ <previous command> | cut -d ' ' -f1
19645
19731
Unless you precede it with a space, obviously
$ <command> | sed -e "s/.*/ &/" | tr -s " "
Now, for this particular case of pid numbers (not names), there is a function called pgrep
:
$ pgrep ssh
Shell functions
However, in general it is actually still possible to use shell functions in a concise manner, because there is a neat thing about the read
command:
$ <command> | while read a b; do echo $a; done
The first parameter to read, a
, selects the first column, and if there is more, everything else will be put in b
. As a result, you never need more variables than the number of your column +1.
So,
while read a b c d; do echo $c; done
will then output the 3rd column. As indicated in my comment...
A piped read will be executed in an environment that does not pass variables to the calling script.
out=$(ps whatever | { read a b c d; echo $c; })
arr=($(ps whatever | { read a b c d; echo $c $b; }))
echo ${arr[1]} # will output 'b'`
The Array Solution
So we then end up with the answer by @frayser which is to use the shell variable IFS which defaults to a space, to split the string into an array. It only works in Bash though. Dash and Ash do not support it. I have had a really hard time splitting a string into components in a Busybox thing. It is easy enough to get a single component (e.g. using awk) and then to repeat that for every parameter you need. But then you end up repeatedly calling awk on the same line, or repeatedly using a read block with echo on the same line. Which is not efficient or pretty. So you end up splitting using ${name%% *}
and so on. Makes you yearn for some Python skills because in fact shell scripting is not a lot of fun anymore if half or more of the features you are accustomed to, are gone. But you can assume that even python would not be installed on such a system, and it wasn't ;-).
try
ps |&
while read -p first second third fourth etc ; do
if [[ $first == '11383' ]]
then
echo got: $fourth
fi
done
Your command
ps | egrep 11383 | cut -d" " -f 4
misses a tr -s
to squeeze spaces, as unwind explains in his answer.
However, you maybe want to use awk
, since it handles all of these actions in a single command:
ps | awk '/11383/ {print $4}'
This prints the 4th column in those lines containing 11383
. If you want this to match 11383
if it appears in the beginning of the line, then you can say ps | awk '/^11383/ {print $4}'
.
Using array variables
set $(ps | egrep "^11383 "); echo $4
or
A=( $(ps | egrep "^11383 ") ) ; echo ${A[3]}
Similar to brianegge's awk solution, here is the Perl equivalent:
ps | egrep 11383 | perl -lane 'print $F[3]'
-a
enables autosplit mode, which populates the @F
array with the column data.
Use -F,
if your data is comma-delimited, rather than space-delimited.
Field 3 is printed since Perl starts counting from 0 rather than 1
Getting the correct line (example for line no. 6) is done with head and tail and the correct word (word no. 4) can be captured with awk:
command|head -n 6|tail -n 1|awk '{print $4}'
Instead of doing all these greps and stuff, I'd advise you to use ps capabilities of changing output format.
ps -o cmd= -p 12345
You get the cmmand line of a process with the pid specified and nothing else.
This is POSIX-conformant and may be thus considered portable.
Bash's set
will parse all output into position parameters.
For instance, with set $(free -h)
command, echo $7
will show "Mem:"
精彩评论