Rearrange columns using cut
I have a file in the following format:
Column1 Column2
str1 1
str2 2
str3 3
I want to rearrange the columns. I tried the command below:
cut -f2,1 file.txt
The command doesn't reorder the columns. Any idea why it's not working?
From the cut(1) man page:
Use one, and only one of -b, -c or -f. Each LIST is made up of one range, or many ranges separated by commas. Selected input is written in the same order that it is read, and is written exactly once.
It reaches field 1 first, so that is printed, followed by field 2.
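A quick way to see this for yourself is to run cut with both orderings on a tiny sample. The temporary file and its contents below are made up for illustration, and the sample is assumed to be tab-delimited (cut's default delimiter):

```shell
# Hypothetical two-column, tab-delimited sample.
sample=$(mktemp)
printf 'Column1\tColumn2\nstr1\t1\nstr2\t2\n' > "$sample"

# The order of the field list is ignored: both commands print the
# fields in the order in which they appear in the input.
swapped=$(cut -f2,1 "$sample")
natural=$(cut -f1,2 "$sample")

if [ "$swapped" = "$natural" ]; then
    echo "cut -f2,1 and cut -f1,2 give identical output"
fi
rm -f "$sample"
```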
Use awk instead:
awk '{ print $2 " " $1}' file.txt
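Note that the one-liner above splits on any run of whitespace and joins the fields with a single space. If the file is tab-delimited and the output should stay tab-delimited (an assumption; the question does not say), a sketch that sets both separators explicitly could look like this. The sample file is hypothetical:

```shell
# Hypothetical tab-delimited sample.
sample=$(mktemp)
printf 'Column1\tColumn2\nstr1\t1\nstr2\t2\n' > "$sample"

# FS is the input field separator and OFS the output one; the comma in
# print inserts OFS between the two fields.
reordered=$(awk 'BEGIN { FS = OFS = "\t" } { print $2, $1 }' "$sample")
printf '%s\n' "$reordered"
rm -f "$sample"
```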
You may also combine cut and paste:
paste <(cut -f2 file.txt) <(cut -f1 file.txt)
via comments: It's possible to avoid bashisms and remove one instance of cut by doing:
paste file.txt file.txt | cut -f2,3
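To make the trick easier to follow, here is a sketch that also captures the intermediate result. It assumes a two-column, tab-delimited file; the sample data is invented:

```shell
# Hypothetical two-column, tab-delimited sample.
sample=$(mktemp)
printf 'A\tB\nc\td\n' > "$sample"

# Pasting the file next to itself yields four fields per line: A B A B.
doubled=$(paste "$sample" "$sample")

# Fields 2 and 3 of the doubled line are the original column 2
# followed by the original column 1.
reordered=$(paste "$sample" "$sample" | cut -f2,3)
printf '%s\n' "$reordered"
rm -f "$sample"
```

With more than two columns the field numbers passed to cut would have to be adjusted accordingly.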
Using join:
join -t $'\t' -o 1.2,1.1 file.txt file.txt
Notes:
- -t $'\t': In GNU join the more intuitive -t '\t' without the $ fails (coreutils v8.28 and earlier?); it's probably a bug that a workaround like $ should be necessary. See: unix join separator char.
- Even though there's just one file being worked on, join syntax requires two filenames. Repeating the file name allows join to perform the desired action.
- For systems with low resources, join offers a smaller footprint than some of the tools used in other answers:
wc -c $(realpath `which cut join sed awk perl`) | head -n -1
43224 /usr/bin/cut
47320 /usr/bin/join
109840 /bin/sed
658072 /usr/bin/gawk
2093624 /usr/bin/perl
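If your shell does not understand $'\t', the tab can be produced with printf instead. The following is only a sketch under a few assumptions: the sample file is invented, it is tab-delimited, its first column is unique, and its lines are already sorted on the first column (join expects sorted input):

```shell
# Hypothetical tab-delimited sample, sorted on its first column.
sample=$(mktemp)
printf 'str1\t1\nstr2\t2\nstr3\t3\n' > "$sample"

# Command substitution strips trailing newlines only, so the tab survives.
tab=$(printf '\t')

# Join the file with itself on field 1 and print field 2, then field 1.
reordered=$(join -t "$tab" -o 1.2,1.1 "$sample" "$sample")
printf '%s\n' "$reordered"
rm -f "$sample"
```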
using just the shell,
while read -r col1 col2
do
echo "$col2 $col1"
done <"file"
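A slightly more defensive sketch of the same loop, using printf instead of echo and quoting the variables so that characters such as * are not expanded by the shell. The sample file is made up, and the single-space output separator is just a choice for this example:

```shell
# Hypothetical space-delimited sample.
sample=$(mktemp)
printf 'Column1 Column2\nstr1 1\nstr2 2\n' > "$sample"

reordered=$(
    while read -r col1 col2; do
        # Quoting keeps glob characters and repeated spaces intact.
        printf '%s %s\n' "$col2" "$col1"
    done < "$sample"
)
printf '%s\n' "$reordered"
rm -f "$sample"
```

Keep in mind that with more than two columns, read puts the whole remainder of the line into the last variable (col2 here).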
You can use Perl for that:
perl -ane 'print "$F[1] $F[0]\n"' < file.txt
- -e means execute the code that follows it
- -n means read the input line by line (here STDIN, because of the < redirection) and loop over the lines
- -a means split each line into an array called @F ("F" as in Field). Perl indexes arrays starting from 0, unlike cut, which indexes fields starting from 1.
- You can add -Fpattern (with no space between -F and pattern) to use pattern as the field separator instead of the default whitespace
The advantage of running Perl is that (if you know Perl) you can do much more computation on @F than rearranging columns.
I have just been working on something very similar. I am not an expert, but I thought I would share the commands I used. I had a multi-column CSV from which I needed only a handful of columns (five in the command below), and I then needed to reorder them.
My file was pipe '|' delimited, but that can be swapped out.
LC_ALL=C cut -d$'|' -f1,2,3,8,10 ./file/location.txt | sed -E "s/(.*)\|(.*)\|(.*)\|(.*)\|(.*)/\3\|\5\|\1\|\2\|\4/" > ./newcsv.csv
Admittedly it is really rough and ready but it can be tweaked to suit!
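For comparison, the same selection and reordering can be sketched in a single awk call, picking the fields directly in the output order (3, 10, 1, 2, 8, which is what the cut and sed pipeline above produces). The ten-field sample record is invented for the example:

```shell
# Hypothetical ten-field, pipe-delimited record.
line='a|b|c|d|e|f|g|h|i|j'

# Select fields 3, 10, 1, 2 and 8 directly in the desired output order.
reordered=$(printf '%s\n' "$line" |
    awk 'BEGIN { FS = OFS = "|" } { print $3, $10, $1, $2, $8 }')
printf '%s\n' "$reordered"
```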
As an addition to the answers that suggest duplicating the columns and then running cut: for duplication, paste etc. will work only for files, not for streams. In that case, use sed instead.
cat file.txt | sed s/'.*'/'&\t&'/ | cut -f2,3
This works on both files and streams, which is useful if, instead of just reading a file with cat, you do some processing before rearranging the columns.
By comparison, the following does not work:
cat file.txt | paste - - | cut -f2,3
Here, the double stdin placeholder in paste does not duplicate stdin; each - reads the next line instead.
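A small sketch that shows the difference on made-up input. The second command uses awk to duplicate each line, as an alternative for a sed that may not understand \t in the replacement text (as far as I know that escape is an extension rather than something every sed supports):

```shell
# paste - - reads two consecutive lines from stdin for each output line,
# so it pairs lines up rather than duplicating each one.
paired=$(printf 'a\nb\nc\nd\n' | paste - -)
printf '%s\n' "$paired"

# Duplicating each line with awk works on a stream; fields 2 and 3 of
# the doubled line are the original column 2 and column 1.
reordered=$(printf 'x\ty\n' | awk '{ print $0 "\t" $0 }' | cut -f2,3)
printf '%s\n' "$reordered"
```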
Using sed
Use sed with basic regular expression subexpressions to capture and reorder the column content. This approach is best suited when there are a limited number of columns to reorder, as in this case.
The basic idea is to surround interesting portions of the search pattern with \( and \), which can be played back in the replacement pattern with \# where # represents the sequential position of the subexpression in the search pattern.
For example:
$ echo "foo bar" | sed "s/\(foo\) \(bar\)/\2 \1/"
yields:
bar foo
Text outside a subexpression is scanned but not retained for playback in the replacement string.
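If the backslashes get hard to read, the same swap can be sketched with extended regular expressions, assuming your sed accepts the -E option (both GNU and BSD sed do):

```shell
# -E switches sed to extended regular expressions, where groups are
# written with plain ( and ) instead of \( and \).
swapped=$(echo "foo bar" | sed -E 's/(foo) (bar)/\2 \1/')
echo "$swapped"
```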
Although the question did not discuss fixed-width columns, we will discuss them here, as handling them is a worthy measure of any solution posed. For simplicity, let's assume the file is space-delimited, although the solution can be extended to other delimiters.
Collapsing Spaces
To illustrate the simplest usage, let's assume that multiple spaces can be collapsed into single spaces, and that the second column values are terminated with EOL (and not space padded).
File:
bash-3.2$ cat f
Column1    Column2
str1       1
str2       2
str3       3
bash-3.2$ od -a f
0000000 C o l u m n 1 sp sp sp sp C o l u m
0000020 n 2 nl s t r 1 sp sp sp sp sp sp sp 1 nl
0000040 s t r 2 sp sp sp sp sp sp sp 2 nl s t r
0000060 3 sp sp sp sp sp sp sp 3 nl
0000072
Transform:
bash-3.2$ sed "s/\([^ ]*\)[ ]*\([^ ]*\)[ ]*/\2 \1/" f
Column2 Column1
1 str1
2 str2
3 str3
bash-3.2$ sed "s/\([^ ]*\)[ ]*\([^ ]*\)[ ]*/\2 \1/" f | od -a
0000000 C o l u m n 2 sp C o l u m n 1 nl
0000020 1 sp s t r 1 nl 2 sp s t r 2 nl 3 sp
0000040 s t r 3 nl
0000045
Preserving Column Widths
Let's now extend the method to a file with constant width columns, while allowing columns to be of differing widths.
File:
bash-3.2$ cat f2
Column1    Column2
str1       1
str2       2
str3       3
bash-3.2$ od -a f2
0000000 C o l u m n 1 sp sp sp sp C o l u m
0000020 n 2 nl s t r 1 sp sp sp sp sp sp sp 1 sp
0000040 sp sp sp sp sp nl s t r 2 sp sp sp sp sp sp
0000060 sp 2 sp sp sp sp sp sp nl s t r 3 sp sp sp
0000100 sp sp sp sp 3 sp sp sp sp sp sp nl
0000114
Transform:
bash-3.2$ sed "s/\([^ ]*\)\([ ]*\) \([^ ]*\)\([ ]*\)/\3\4 \1\2/" f2
Column2 Column1
1       str1
2       str2
3       str3
bash-3.2$ sed "s/\([^ ]*\)\([ ]*\) \([^ ]*\)\([ ]*\)/\3\4 \1\2/" f2 | od -a
0000000 C o l u m n 2 sp C o l u m n 1 sp
0000020 sp sp nl 1 sp sp sp sp sp sp sp s t r 1 sp
0000040 sp sp sp sp sp nl 2 sp sp sp sp sp sp sp s t
0000060 r 2 sp sp sp sp sp sp nl 3 sp sp sp sp sp sp
0000100 sp s t r 3 sp sp sp sp sp sp nl
0000114
Lastly, although the question's example does not have strings of unequal length, this sed expression supports this case.
File:
bash-3.2$ cat f3
Column1    Column2
str1       1
string2    2
str3       3
Transform:
bash-3.2$ sed "s/\([^ ]*\)\([ ]*\) \([^ ]*\)\([ ]*\)/\3\4 \1\2/" f3
Column2 Column1
1       str1
2       string2
3       str3
bash-3.2$ sed "s/\([^ ]*\)\([ ]*\) \([^ ]*\)\([ ]*\)/\3\4 \1\2/" f3 | od -a
0000000 C o l u m n 2 sp C o l u m n 1 sp
0000020 sp sp nl 1 sp sp sp sp sp sp sp s t r 1 sp
0000040 sp sp sp sp sp nl 2 sp sp sp sp sp sp sp s t
0000060 r i n g 2 sp sp sp nl 3 sp sp sp sp sp sp
0000100 sp s t r 3 sp sp sp sp sp sp nl
0000114
Comparison to other methods of column reordering under shell
- Surprisingly for a file manipulation tool, awk is not well-suited for cutting from a field to end of record. In sed this can be accomplished using regular expressions, e.g. \(xxx.*$\) where xxx is the expression to match the column.
- Using paste and cut subshells gets tricky when implementing inside shell scripts. Code that works from the command line fails to parse when brought inside a shell script. At least this was my experience (which drove me to this approach).
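As a sketch of the "field to end of record" capture mentioned above, the following moves the first column behind everything that follows it. The four-column, space-delimited sample line is invented for the example:

```shell
# Hypothetical space-delimited record with four columns.
line='one two three four'

# \1 captures the first column, \2 everything from the second column
# to the end of the line; the replacement swaps the two captures.
moved=$(printf '%s\n' "$line" | sed 's/^\([^ ]*\) \(.*\)$/\2 \1/')
printf '%s\n' "$moved"
```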
Expanding on the answer from @Met, also using Perl:
If the input and output are TAB-delimited:
perl -F'\t' -lane 'print join "\t", @F[1, 0]' in_file
If the input and output are whitespace-delimited:
perl -lane 'print join " ", @F[1, 0]' in_file
Here,
- -e tells Perl to look for the code inline, rather than in a separate script file,
- -n reads the input 1 line at a time,
- -l removes the input record separator (\n on *NIX) after reading the line (similar to chomp), and adds the output record separator (\n on *NIX) to each print,
- -a splits the input line on whitespace into array @F,
- -F'\t' in combination with -a splits the input line on TABs, instead of whitespace, into array @F.
@F[1, 0] is the array made up of the 2nd and 1st elements of array @F, in this order. Remember that arrays in Perl are zero-indexed, while fields in cut are 1-indexed. So fields in @F[0, 1] are the same fields as the ones in cut -f1,2.
Note that such notation enables more flexible manipulation of input than in some other answers posted above (which are fine for a simple task). For example:
# reverses the order of fields:
perl -F'\t' -lane 'print join "\t", reverse @F' in_file
# prints last and first fields only:
perl -F'\t' -lane 'print join "\t", @F[-1, 0]' in_file