
How to crop (cut) text files based on starting and ending line numbers in cygwin?

I have a few log files of around 100 MB each. Personally I find it cumbersome to deal with such big files, and I know that the lines interesting to me are only a stretch of 200 to 400 lines or so.

What would be a good way to extract the relevant log lines from these files, i.e. I just want to send a given range of line numbers to another file?

For example, the inputs are:

filename: MyHugeLogFile.log
Starting line number: 38438
Ending line number:   39276

Is there a command that I can run in cygwin to cat out only that range of that file? I know that if I can somehow display that range on stdout, then I can also pipe it to an output file.

Note: I'm adding the Linux tag for more visibility, but I need a solution that works in cygwin. (Usually Linux commands do work in cygwin.)


Sounds like a job for sed:

sed -n '8,12p' yourfile

...will send lines 8 through 12 of yourfile to standard out.

If you want to prepend the line number, you may wish to use cat -n first:

cat -n yourfile | sed -n '8,12p'
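Applied to the numbers from the question, the basic form becomes (extracted.log is just an example output name):

sed -n '38438,39276p' MyHugeLogFile.log > extracted.log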


You can use wc -l to figure out the total number of lines.

You can then combine head and tail to get at the range you want. Say the log is 40,000 lines and you want lines 38438 through 39276: that range is the first 39276 - 38438 + 1 = 839 lines of the last 40000 - 38438 + 1 = 1563 lines. So:

tail -n 1563 MyHugeLogFile.log | head -n 839 | ....

Or there's probably an easier way using sed or awk.
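For completeness, here is a sketch of the awk version, using the question's line numbers (NR is awk's built-in current line number; extracted.log is just an example output name):

awk 'NR >= 38438 && NR <= 39276' MyHugeLogFile.log > extracted.log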


I saw this thread when I was trying to split a file into files of 100,000 lines each. A better solution than sed for that is:

split -l 100000 database.sql database-

It will give files like:

database-aa
database-ab
database-ac
...
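If you later need to reassemble the pieces (assuming nothing else in the directory matches the database-* prefix; database-rejoined.sql is just an example name), the shell glob expands in the same lexicographic order that split used:

cat database-* > database-rejoined.sql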


And if you simply want to cut part of a file - say from line 26 to 142 - and write it to a new file, sed can read the file directly (no cat needed): sed -n '26,142p' file-to-cut.txt >> new-file.txt


How about this:

$ seq 1 100000 | tail -n +10000 | head -n 10
10000
10001
10002
10003
10004
10005
10006
10007
10008
10009

It uses tail to output everything from the 10,000th line onwards and then head to keep only the first 10 lines.

The same (almost) result with sed:

$ seq 1 100000 | sed -n '10000,10010p'
10000
10001
10002
10003
10004
10005
10006
10007
10008
10009
10010

This one has the advantage of allowing you to input the line range directly.
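In general form (a sketch, with M and N as placeholders for your start and end line numbers), the tail/head combination is:

tail -n +M yourfile | head -n $((N - M + 1))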


If you are interested only in the last N lines, you can use the tail command like this:

$ tail -n N yourlogfile.log >> mycroppedfile.txt

This will append the last N lines of your log file to a file called mycroppedfile.txt (creating it if it does not already exist).


This is an old thread, but I was surprised nobody mentioned grep. The -A option allows specifying a number of lines to print after a search match, and the -B option includes lines before a match. The following command would output 10 lines before and 10 lines after each occurrence of "my search string" in the file mylogfile.log:

grep -A 10 -B 10 "my search string" mylogfile.log

If there are multiple matches within a large file, the output can rapidly get unwieldy. Two helpful options are -n, which tells grep to include line numbers, and --color, which highlights the matched text in the output.

If there is more than one file to be searched, grep allows multiple files to be listed, separated by spaces. Wildcards can also be used. Putting it all together:

grep -A 10 -B 10 -n --color "my search string" *.log someOtherFile.txt
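As an aside, grep's -C option prints the same number of context lines before and after a match, so the combined -A 10 -B 10 above can also be written as:

grep -C 10 -n --color "my search string" *.log someOtherFile.txt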
