Combine matching lines using sed or awk?

I have a file like the following:

1,  
cake:01351  
12,  
bun:1063  
scone:13581  
biscuit:1931  
14,  
jelly:1385

I need to convert it so that when a number is read at the start of a line it is combined with the line beneath it, but if there is no number at the start the line is left as is. This would be the output that I need:

1,cake:01351  
12,bun:1063  
scone:13581  
biscuit:1931  
14,jelly:1385

I'm having a lot of trouble achieving this with sed; it may not be the best tool for what I think should be quite simple.

Any suggestions greatly appreciated.


A very basic sed implementation:

sed -e '/^[0-9]/{N;s/\n//;}'

This relies on the 'number' lines being the only lines that start with a digit (as you specified).

It

  • matches lines starting with a number, ^[0-9]
  • brings in the next line, N
  • deletes the embedded newline, s/\n//
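
The three steps above can be tried end to end with a small sketch. The `input` variable and the `join_numbered` function name are made up for illustration, and the sample data is assumed to have no trailing whitespace:

```shell
# Sample input from the question (assumed to have no trailing whitespace).
input='1,
cake:01351
12,
bun:1063
scone:13581
biscuit:1931
14,
jelly:1385'

# Same logic as the one-liner, written one sed command per line:
# on a line starting with a digit, append the next line (N),
# then delete the embedded newline (s/\n//).
join_numbered() {
  sed -e '/^[0-9]/{
    N
    s/\n//
  }'
}

result=$(printf '%s\n' "$input" | join_numbered)
printf '%s\n' "$result"
```

On this input the sketch should print the five lines requested in the question, starting with `1,cake:01351`.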


This is a file on my intranet. I can't recall where I found the handy sed one-liner. You might find something if you search for 'sed one-liner'.


Have you ever needed to combine lines of text, but found it too tedious to do by hand?

For example, imagine that we have a text file with hundreds of lines which look like this:

14/04/2003,10:27:47,0
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.011,0.975,0.005
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.011,0.975,0.005
14/04/2003,10:30:51,600
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.011,0.975,0.005
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.010,0.975,0.005
14/04/2003,10:34:02,600
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.011,0.975,0.005
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.010,0.975,0.005

Each date (14/04/2003) is the start of a data record, and it continues on the next four lines.

We would like to input this to Excel as a 'comma separated value' file, and see each record in its own row.

In our example, we need to append any line starting with a G or I to the preceding line, and insert a comma, so as to produce the following:

14/04/2003,10:27:47,0,IdVg,3.000,-1.000,0.050,0.006,GmMax,0.011,0.975,0.005,IdVg,3.000,...  
14/04/2003,10:30:51,600,IdVg,3.000,-1.000,0.050,0.006,GmMax,0.011,0.975,0.005,IdVg,3.000,...
14/04/2003,10:34:02,600,IdVg,3.000,-1.000,0.050,0.006,GmMax,0.011,0.975,0.005,IdVg,3.000,...

This is a classic application of a 'regular expression' and, once again, sed comes to the rescue.

The editing can be done with a single sed command:

sed -e :a -e '$!N;s/\n\([GI]\)/,\1/;ta' -e 'P;D' filename >newfilename

I didn't say it would be obvious, or easy, did I?

This is the kind of command you write down somewhere for the rare occasions when you need it.
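
For anyone who wants to see what the one-liner does, here is a sketch that splits it into one `-e` expression per step and runs it on an abbreviated sample (two records with two continuation lines each, rather than four). The `input` variable and the `join_records` name are illustrative only:

```shell
# Abbreviated sample: two records, each followed by two continuation lines.
input='14/04/2003,10:27:47,0
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.011,0.975,0.005
14/04/2003,10:30:51,600
IdVg,3.000,-1.000,0.050,0.006
GmMax,0.010,0.975,0.005'

# Step by step:
#   :a                    label marking the top of the loop
#   $!N                   unless on the last line, append the next line
#   s/\n\([GI]\)/,\1/     if the appended line starts with G or I, turn the
#                         newline in front of it into a comma
#   ta                    if that substitution happened, loop back to :a
#   P;D                   otherwise print and delete the finished first line,
#                         then restart with whatever is left
join_records() {
  sed -e ':a' \
      -e '$!N' \
      -e 's/\n\([GI]\)/,\1/' \
      -e 'ta' \
      -e 'P;D'
}

result=$(printf '%s\n' "$input" | join_records)
printf '%s\n' "$result"
```

Each date line should come out as a single comma-separated row with its continuation lines appended.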


Try a regular expression, such as:

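sed '/^[0-9]\+,/{N;s/\n//;}'

That matches a line starting with a number (0-9) followed by a comma, appends the next line with N, then replaces the embedded newline with nothing, joining the two lines. Note that \+ is a GNU sed extension; with other seds, use [0-9][0-9]* instead.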


Another awk solution, less cryptic than some other answers:

awk '/^[0-9]/ {n = $0; getline; print n $0; next} 1'
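
One edge case worth knowing about: if a number line happens to be the last line of the file, `getline` fails and `$0` is left unchanged, so the one-liner prints the number line twice on one row. A possible guarded variant, as a sketch only (the `join_numbered_awk` name and the sample input are made up for illustration):

```shell
# Sample input that ends on a number line, to exercise the edge case.
input='1,
cake:01351
scone:13581
14,'

join_numbered_awk() {
  awk '
    /^[0-9]/ {
      held = $0                # remember the number line
      if ((getline) > 0) {
        print held $0          # a next line exists: join the two
      } else {
        print held             # number line was the last line: print it alone
      }
      next
    }
    { print }                  # every other line passes through unchanged
  '
}

result=$(printf '%s\n' "$input" | join_numbered_awk)
printf '%s\n' "$result"
```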


$ awk 'ORS= /^[0-9]+,$/?" ":"\n"' file
1, cake:01351
12, bun:1063
scone:13581
biscuit:1931
14, jelly:1385
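
Note that the output above has a space after each comma, which differs slightly from what the question asked for. The ORS trick needs a non-empty separator, because the assignment doubles as the pattern and an empty string would count as false, so nothing would be printed. If the space is unwanted, one possible alternative is to use printf instead. This is a sketch, with `join_no_space` as an illustrative name, and it assumes the number lines have no trailing whitespace:

```shell
input='1,
cake:01351
12,
bun:1063
scone:13581
biscuit:1931
14,
jelly:1385'

# Print each line followed by nothing if it is a bare "number," line,
# or by a newline otherwise.
join_no_space() {
  awk '{ printf "%s%s", $0, (/^[0-9]+,$/ ? "" : "\n") }'
}

result=$(printf '%s\n' "$input" | join_no_space)
printf '%s\n' "$result"
```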