
Format a file in Unix/Linux?

I have a file containing country, catalog number, year, description and price:

Kenya 563-45 1995 Heron Plover Thrush Gonolek Apalis $6.60
Surinam 632-96 1982 Butterfliers $7.50
Seychelles 831-34 2002 WWF Frogs set of 4 $1.40
Togo 1722-25 2010 Cheetah, Zebra, Antelope $5.70

The file isn't delimited by a tab, a ":" or anything else; there are only spaces between the fields. Can you please tell me how I can format this file (using awk?) and how I can find the total price from it?


With command-line Perl:

$ cat /your/file | perl -e '$sum=0; for(<STDIN>) { $sum += $1 if(/\$([\d\.]+)/); }; print "$sum\n"'
21.2

And with awk (this assumes every line ends with a dollar amount):

$ cat /your/file | awk '{s+=substr($NF,2)} END{ print s}'
21.2

Also, in response to the comment: if you want to reformat the file on the command line:

$ cat /your/file | perl -e 'for(<STDIN>){@a=split /\s+/; $p=pop @a;
  $line=join "|", ($a[0],$a[1],$a[2], (join " ",@a[3..$#a]), $p); print "$line\n"}'

Kenya|563-45|1995|Heron Plover Thrush Gonolek Apalis|$6.60
Surinam|632-96|1982|Butterfliers|$7.50
Seychelles|831-34|2002|WWF Frogs set of 4|$1.40
Togo|1722-25|2010|Cheetah, Zebra, Antelope|$5.70

To do this properly, I would not do it on the command line; I would write a proper program to parse the file.
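
As a rough idea of what such a program might look like, here is a minimal sketch in Python (my choice of language, not something the question asks for). It assumes every line has the layout country, catalog number, four-digit year, free-text description, then a price starting with "$". The names LINE_RE, parse_line and SAMPLE are made up for the illustration.

```python
import re
from decimal import Decimal

# Assumed layout: country, catalog number (digits-digits), 4-digit year,
# free-text description, and a price of the form $N or $N.NN at the end.
LINE_RE = re.compile(
    r"^(?P<country>.+?)\s+"
    r"(?P<catalog>\d+-\d+)\s+"
    r"(?P<year>\d{4})\s+"
    r"(?P<description>.+?)\s+"
    r"\$(?P<price>\d+(?:\.\d+)?)\s*$"
)


def parse_line(line):
    """Split one line into named fields, or raise ValueError if it does not fit."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    record = match.groupdict()
    record["year"] = int(record["year"])
    # Decimal avoids binary floating-point rounding when adding up prices.
    record["price"] = Decimal(record["price"])
    return record


# The sample data from the question, used here in place of a real file.
SAMPLE = """\
Kenya 563-45 1995 Heron Plover Thrush Gonolek Apalis $6.60
Surinam 632-96 1982 Butterfliers $7.50
Seychelles 831-34 2002 WWF Frogs set of 4 $1.40
Togo 1722-25 2010 Cheetah, Zebra, Antelope $5.70
"""

# Skip blank lines, parse the rest.
records = [parse_line(line) for line in SAMPLE.splitlines() if line.strip()]
total = sum(record["price"] for record in records)

for record in records:
    print("|".join(str(record[key]) for key in
                   ("country", "catalog", "year", "description", "price")))
print("total:", total)
```

Unlike the one-liners, this refuses lines that do not match the expected layout instead of silently producing a wrong total, which is the main reason to prefer a program once the file gets large.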


I assumed that the first three columns and the last column have a fixed meaning, while the number of middle columns varies. So this prints the fixed columns first, separated by tabs, and puts the middle columns last, joined by spaces, so that you can open the result in a spreadsheet program:

awk '{ printf("%s\t%s\t%s\t%s\t", $1, $2, $3, $NF); for(i=4; i<NF; i++){ printf("%s ", $i); } printf("\n") }' < yourlist.txt


For conformity, a regexp-fu solution:

$ perl -lne '/^ (.+?) \s+ (\d+-\d+) \s+ (\d{4}) \s+ (.+?) \s+ ( \$ ( \d+ (?:\.\d+)? ) ) \s* $/x and $t+=$6, print join "•",$1,$2,$3,$4,$5 }{ print $t' input_file
Kenya•563-45•1995•Heron Plover Thrush Gonolek Apalis•$6.60
Surinam•632-96•1982•Butterfliers•$7.50
Seychelles•831-34•2002•WWF Frogs set of 4•$1.40
Togo•1722-25•2010•Cheetah, Zebra, Antelope•$5.70
21.2


Expanding upon udslk's answer, awk is certainly your friend here:

#!/usr/bin/awk -f
BEGIN {
    print "country, \"catalog number\", year, description, \"price ($)\""
}

{
    description = $4
    for (f = 5; f < NF; ++f) {
        description = description " " $f
    }
    price = substr($NF, 2)
    total += price

    printf "\"%s\", \"%s\", \"%s\", \"%s\", %0.2f\n", $1, $2, $3, description, price
}

END {
    printf "Total, , , , %0.2f\n", total
}

This spits out a CSV file with headers, which you can import into your favourite spreadsheet. It also adds the total. Switch commas with tabs according to taste.
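
One thing to watch with CSV is that some descriptions contain commas ("Cheetah, Zebra, Antelope"), so the quoting matters. If you would rather let a library handle that, here is a hedged sketch using Python's standard csv module. The to_row helper and the sample list are illustrative names of mine, and it relies on the same assumption as above: the first three fields and the last one are fixed, and everything in between is the description.

```python
import csv
import io


def to_row(line):
    """Turn one space-separated line into [country, catalog, year, description, price]."""
    fields = line.split()
    country, catalog, year = fields[:3]
    price = fields[-1].lstrip("$")          # drop the leading dollar sign
    description = " ".join(fields[3:-1])    # everything between year and price
    return [country, catalog, year, description, price]


# Sample data from the question, used in place of a real input file.
lines = [
    "Kenya 563-45 1995 Heron Plover Thrush Gonolek Apalis $6.60",
    "Surinam 632-96 1982 Butterfliers $7.50",
    "Seychelles 831-34 2002 WWF Frogs set of 4 $1.40",
    "Togo 1722-25 2010 Cheetah, Zebra, Antelope $5.70",
]

# Write to an in-memory buffer; a real script would open an output file instead.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["country", "catalog number", "year", "description", "price ($)"])
for line in lines:
    writer.writerow(to_row(line))

csv_text = buffer.getvalue()
print(csv_text, end="")
```

The writer quotes only the fields that need it, so the Togo description stays in a single column when a spreadsheet reads the file back.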
