Script to delete all lines from a given word onward, except the last line

How can I delete all lines below a given word in a file, except the last line? Suppose I have a file which contains:

| 02/04/2010 07:24:20 | 20-24 |         26 |       13 |        2.60 | 
| 02/04/2010 07:24:25 | 25-29 |          6 |        3 |        0.60 | 
+---------------------+-------+------------+----------+-------------+

02-04-2010-07:24 --- ER GW 03

+---------------------+-------+------------+----------+-------------+
| date                | sec   | BOTH_MO_MT | MO_or_MT | TPS_PER_SEC |
+---------------------+-------+------------+----------+-------------+
| 02/04/2010 07:00:00 | 00-04 |         28 |       14 |        2.80 | 
| 02/04/2010 07:00:05 | 05-09 |         27 |       14 |        2.70 | 
...
...
...
...
END OF TPS PER 5 REPORT

I need to delete everything from the line "02-04-2010-07:24 --- ER GW 03" onward, except the final line "END OF TPS PER 5 REPORT", and save the file. This has to be done for around 700 files. All files have the same format and are named by date (datemonthday).


sed -ni '/ER GW/ b end; p; d; :end $p; n; b end' "$file"

"$file" should be the file name (quoted, in case it contains spaces). E.g.:

for file in *.txt ; do
    sed -ni '/ER GW/ b end; p; d; :end $p; n; b end' "$file"
done
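
For readers unfamiliar with sed branching, the same script can be written out one command per line with comments. This is only a sketch: it assumes GNU sed, the function name keep_head_and_last is hypothetical, and -i is left out so that the result goes to standard output instead of overwriting the file.

```shell
# Hypothetical wrapper around the sed script above; reads the named
# files, or standard input when no file is given.
keep_head_and_last() {
    sed -n '
# From the marker line onward, jump to the skipping loop.
/ER GW/ b end
# Before the marker: print the line and start the next cycle.
p
d
:end
# Inside the loop, print only the last line of the input.
$p
# Read the next line; sed stops here once the input is exhausted.
n
b end
' "$@"
}
```

Running keep_head_and_last on one file and comparing the output with the original is a cheap check before using -i on all 700 files.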


The following awk script will do it:

awk '
    /^02-04-2010-07:24 --- ER GW 03$/ {skip=1}
                                      {ln=$0;if (skip!=1){print}}
    END                               {if (skip==1){print ln}}'

as shown in the following transcript:

$ echo '| 02/04/2010 07:24:20 | 20-24 |         26 |       13 |        2.60 |
| 02/04/2010 07:24:25 | 25-29 |          6 |        3 |        0.60 |
+---------------------+-------+------------+----------+-------------+

02-04-2010-07:24 --- ER GW 03

+---------------------+-------+------------+----------+-------------+
| date                | sec   | BOTH_MO_MT | MO_or_MT | TPS_PER_SEC |
+---------------------+-------+------------+----------+-------------+
| 02/04/2010 07:00:00 | 00-04 |         28 |       14 |        2.80 |
| 02/04/2010 07:00:05 | 05-09 |         27 |       14 |        2.70 |
...
...
...
...
END OF TPS PER 5 REPORT' | awk '
    /^02-04-2010-07:24 --- ER GW 03$/ {skip=1}
    {ln=$0;if (skip!=1){print}}
    END {if (skip==1){print ln}}'

which produces:

| 02/04/2010 07:24:20 | 20-24 |         26 |       13 |        2.60 |
| 02/04/2010 07:24:25 | 25-29 |          6 |        3 |        0.60 |
+---------------------+-------+------------+----------+-------------+

END OF TPS PER 5 REPORT

as requested.

Breaking it down:

  • skip is initially 0 (false).
  • if you find a line you want to start skipping from, set skip to 1 (true) - change this pattern where necessary.
  • if skip is false, output the line.
  • regardless of skip, store the last line.
  • at the end, if skip is true, output the last line (the skip check prevents it from being printed twice).

To do this for multiple files, you can use a for loop:

for fspec in *.txt ; do
    awk 'blah blah' <"${fspec}" >"${fspec}.new"
done
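
Since the question asks for each file to be saved, the loop can be completed along the following lines. This is a minimal sketch rather than a tested production script: the function name strip_after_marker, the default marker pattern, and the use of mktemp for the temporary file are all assumptions made for illustration.

```shell
# Hypothetical helper: keep everything before the first line matching the
# marker, plus the last line of the file, then rewrite the file itself.
strip_after_marker() {
    file=$1
    marker=${2:-"--- ER GW 03"}    # assumed default; treated as an awk regex
    tmp=$(mktemp) || return 1
    if awk -v pat="$marker" '
        $0 ~ pat { skip = 1 }            # start skipping at the marker line
        { last = $0; if (!skip) print }  # remember each line; print only before the marker
        END { if (skip) print last }     # the last line was skipped, so print it now
    ' "$file" > "$tmp"; then
        # Copy back with cat so the original file keeps its permissions.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Usage (commented out so nothing is modified by accident):
# for fspec in *.txt ; do strip_after_marker "$fspec" ; done
```

Writing to a temporary file first means the original is only overwritten after awk has finished successfully. Keeping a backup copy of the 700 files before the first run is still advisable.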

The command required for your update in the comment (searching for "--- ER GW 03") is:

awk '
    /--- ER GW 03/ {skip=1}
                   {ln=$0;if (skip!=1){print}}
    END            {if (skip==1){print ln}}'


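This might work for you (GNU sed). From the marker line to the end of each file, it deletes every line except the last. The q command is avoided because it would make sed exit after the first file:

sed -i '/^02-04-2010-07:24 --- ER GW 03/,$ { $!d; }' *.txt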