
How to crop (cut) text files based on starting and ending line numbers in cygwin?

I have a few log files of around 100 MB each. Personally I find it cumbersome to deal with such big files, and I know that the lines interesting to me are only a stretch of 200 to 400 lines or so.

What would be a good way to extract the relevant log lines from these files, i.e. I just want to send a given range of line numbers to another file?

For example, the inputs are:

filename: MyHugeLogFile.log
Starting line number: 38438
Ending line number:   39276

Is there a command that I can run in cygwin to cat out only that range of that file? I know that if I can somehow display that range on stdout, then I can also pipe it to an output file.

Note: I'm adding the Linux tag for more visibility, but I need a solution that works in cygwin. (Usually Linux commands do work in cygwin.)


Sounds like a job for sed:

sed -n '8,12p' yourfile

...will send lines 8 through 12 of yourfile to standard out.

If you want to prepend the line number, you may wish to use cat -n first:

cat -n yourfile | sed -n '8,12p'
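Applied to the numbers from the question, the basic form becomes (extracted.log is just an example output name):

sed -n '38438,39276p' MyHugeLogFile.log > extracted.log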


You can use wc -l to figure out the total number of lines.

You can then combine head and tail to get at the range you want. Say the log is 40,000 lines and you want lines 38438 through 39276: that range is the first 39276 - 38438 + 1 = 839 lines of the last 40000 - 38438 + 1 = 1563 lines. So:

tail -n 1563 MyHugeLogFile.log | head -n 839 | ....

Or there's probably an easier way using sed or awk.
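For completeness, here is a sketch of the awk version, using the question's line numbers (NR is awk's built-in current line number; extracted.log is just an example output name):

awk 'NR >= 38438 && NR <= 39276' MyHugeLogFile.log > extracted.log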


I saw this thread when I was trying to split a file into files of 100,000 lines each. A better solution than sed for that is:

split -l 100000 database.sql database-

It will give files like:

database-aa
database-ab
database-ac
...
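If you later need to reassemble the pieces (assuming nothing else in the directory matches the database-* prefix; database-rejoined.sql is just an example name), the shell glob expands in the same lexicographic order that split used:

cat database-* > database-rejoined.sql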


And if you simply want to cut part of a file - say from line 26 to 142 - and write it to a new file, sed can read the file directly (no cat needed): sed -n '26,142p' file-to-cut.txt >> new-file.txt


How about this:

$ seq 1 100000 | tail -n +10000 | head -n 10
10000
10001
10002
10003
10004
10005
10006
10007
10008
10009

It uses tail to output everything from the 10,000th line onwards and then head to keep only the first 10 lines.

The same (almost) result with sed:

$ seq 1 100000 | sed -n '10000,10010p'
10000
10001
10002
10003
10004
10005
10006
10007
10008
10009
10010

This one has the advantage of allowing you to input the line range directly.
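In general form (a sketch, with M and N as placeholders for your start and end line numbers), the tail/head combination is:

tail -n +M yourfile | head -n $((N - M + 1))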


If you are interested only in the last N lines, you can use the tail command like this:

$ tail -n N yourlogfile.log >> mycroppedfile.txt

This will append the last N lines of your log file to a file called mycroppedfile.txt (creating it if it does not already exist).


This is an old thread, but I was surprised nobody mentioned grep. The -A option allows specifying a number of lines to print after a search match, and the -B option includes lines before a match. The following command would output 10 lines before and 10 lines after each occurrence of "my search string" in the file mylogfile.log:

grep -A 10 -B 10 "my search string" mylogfile.log

If there are multiple matches within a large file, the output can rapidly get unwieldy. Two helpful options are -n, which tells grep to include line numbers, and --color, which highlights the matched text in the output.

If there is more than one file to be searched, grep allows multiple files to be listed, separated by spaces. Wildcards can also be used. Putting it all together:

grep -A 10 -B 10 -n --color "my search string" *.log someOtherFile.txt
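As an aside, grep's -C option prints the same number of context lines before and after a match, so the combined -A 10 -B 10 above can also be written as:

grep -C 10 -n --color "my search string" *.log someOtherFile.txt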
