Format a file in Unix/Linux?

2023-01-24 16:47

I have a file containing country, catalog number, year, description and price:

Kenya 563-45 1995 Heron Plover Thrush Gonolek Apalis $6.60
Surinam 632-96 1982 Butterfliers $7.50
Seychelles 831-34 2002 WWF Frogs set of 4 $1.40
Togo 1722-25 2010 Cheetah, Zebra, Antelope $5.70

The file isn't delimited by a tab, a ":" or anything else; there are only spaces between the fields. Can you please tell me how I can format this file (using awk?) and how I can find the total price from it?

With command-line Perl:

$ cat /your/file | perl -e '$sum=0; for(<STDIN>) { $sum += $1 if(/\$([\d\.]+)/); }; print "$sum\n"'
21.2

And with awk (this assumes every line ends with a dollar amount):

$ cat /your/file | awk '{s+=substr($NF,2)} END{ print s}'
21.2
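
If some lines might not end with a dollar amount, a slightly more defensive variant could check the last field before adding it. This is only a minimal sketch: the function name `sum_prices`, the decision to skip non-matching lines, and the two-decimal output format are my own assumptions, not part of the one-liner above.

```shell
#!/bin/sh
# sum_prices FILE: add up the last field of every line that looks like $N or $N.NN.
# Non-empty lines whose last field does not match are skipped and counted on stderr.
sum_prices() {
    awk '
        $NF ~ /^\$[0-9]+(\.[0-9]+)?$/ { total += substr($NF, 2); next }
        NF                              { skipped++ }
        END {
            if (skipped) print skipped " line(s) skipped" > "/dev/stderr"
            printf "%.2f\n", total
        }
    ' "$1"
}
```

Called as `sum_prices /your/file` on the four sample lines, this should print 21.20.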

Also, in response to the comment: if you want to reformat the file on the command line:

$ cat /your/file | perl -e 'for(<STDIN>){@a=split /\s+/; $p=pop @a;
  $line=join "|", ($a[0],$a[1],$a[2], (join" ",@a[3..$#a]) ,$p); print "$line\n"}'

Kenya|563-45|1995|Heron Plover Thrush Gonolek Apalis|$6.60
Surinam|632-96|1982|Butterfliers|$7.50
Seychelles|831-34|2002|WWF Frogs set of 4|$1.40
Togo|1722-25|2010|Cheetah, Zebra, Antelope|$5.70

To do this properly, I wouldn't use a command-line one-liner; I'd write a proper program to parse the file.
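
As a rough idea of what such a program might look like while staying with awk, here is a sketch that validates each record before printing it. The function name `parse_stamps` and the validation rules (one-word country, catalog number as digits-digits, four-digit year, price as `$` plus a number) are assumptions inferred from the four sample lines, not a specification of the real file; a country such as "South Africa" would be rejected by it.

```shell
#!/bin/sh
# parse_stamps FILE: print country|catalog|year|description|price for valid lines.
# Assumed layout (inferred from the sample only): one-word country, catalog number,
# 4-digit year, free-text description, and a price such as $6.60 as the last field.
parse_stamps() {
    awk '
        NF == 0 { next }                                  # ignore blank lines
        NF < 5 ||
        $2 !~ /^[0-9]+-[0-9]+$/ ||
        $3 !~ /^[0-9][0-9][0-9][0-9]$/ ||
        $NF !~ /^\$[0-9]+(\.[0-9]+)?$/ {
            print "line " NR ": unexpected format: " $0 > "/dev/stderr"
            bad = 1
            next
        }
        {
            desc = $4
            for (i = 5; i < NF; i++) desc = desc " " $i   # re-join the description words
            print $1 "|" $2 "|" $3 "|" desc "|" $NF
        }
        END { exit bad }                                  # non-zero status if any line was rejected
    ' "$1"
}
```

Rejected lines go to stderr and make the function return a non-zero status, so a calling script can decide whether to continue.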

I assumed that the first three columns and the last column have a fixed meaning, while the number of columns in between varies. So this prints the fixed columns first, separated by tabs, and puts the middle columns (the description) last, joined by spaces, so that you can open the result in a spreadsheet program and edit it:

awk '{ printf("%s\t%s\t%s\t%s\t", $1, $2, $3, $NF); for(i=4; i<NF; i++){ printf("%s ", $i); } printf("\n") }' < yourlist.txt

For completeness, a regexp-fu solution:

$ perl -lne '/^ (.+?) \s+ (\d+-\d+) \s+ (\d{4}) \s+ (.+?) \s+ ( \$ ( \d+ (?:\.\d+)? ) ) \s* $/x and $t+=$6, print join "•",$1,$2,$3,$4,$5 }{ print $t' input_file
Kenya•563-45•1995•Heron Plover Thrush Gonolek Apalis•$6.60
Surinam•632-96•1982•Butterfliers•$7.50
Seychelles•831-34•2002•WWF Frogs set of 4•$1.40
Togo•1722-25•2010•Cheetah, Zebra, Antelope•$5.70
21.2

Expanding upon udslk's answer, awk is certainly your friend here:

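#!/usr/bin/awk -f
# Interpreter path given directly, because "env awk -f" is not split into
# separate arguments on Linux; adjust the path if awk lives elsewhere.
BEGIN {
    print "country,\"catalog number\",year,description,\"price ($)\""
}

{
    # Fields 4 through NF-1 together form the description.
    description = $4
    for (f = 5; f < NF; ++f) {
        description = description " " $f
    }
    price = substr($NF, 2)    # drop the leading "$"
    total += price

    # No space after the commas, so that quoted fields are recognised as quoted.
    printf "\"%s\",\"%s\",\"%s\",\"%s\",%0.2f\n", $1, $2, $3, description, price
}

END {
    printf "Total,,,,%0.2f\n", total
}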

This spits out a CSV file with headers, which you can import into your favourite spreadsheet. It also adds the total. Switch commas with tabs according to taste.
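
To make the separator switchable without editing the script, one option is to pass it in as an awk variable. The sketch below is a variation rather than the script above verbatim: the function name `stamps_to_table` and the `sep` variable are made up for illustration, it assumes the same five-part line layout as the sample data, and it additionally doubles any embedded double quotes in the usual CSV manner.

```shell
#!/bin/sh
# stamps_to_table SEP FILE: header, one row per input line, and a total row,
# with a caller-chosen separator. Text fields are always double-quoted.
stamps_to_table() {
    awk -v sep="$1" '
        # Quote a field, doubling any embedded double quotes.
        function q(s) { gsub(/"/, "\"\"", s); return "\"" s "\"" }
        BEGIN {
            print q("country") sep q("catalog number") sep q("year") sep q("description") sep q("price ($)")
        }
        NF >= 5 {
            description = $4
            for (f = 5; f < NF; ++f) description = description " " $f
            price = substr($NF, 2)    # drop the leading "$"
            total += price
            printf "%s%s%s%s%s%s%s%s%.2f\n", q($1), sep, q($2), sep, q($3), sep, q(description), sep, price
        }
        END { printf "%s%s%s%s%s%.2f\n", q("Total"), sep, sep, sep, sep, total }
    ' "$2"
}
```

For example, `stamps_to_table ',' yourlist.txt` produces comma-separated output, and `stamps_to_table '\t' yourlist.txt` should produce tab-separated output, since awk interprets escape sequences in `-v` assignments.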
