How to grab a chunk of data from a file?

2023-02-14 11:15 Q&A author:

I want to grab a chunk of data from a file. I know the start line and the end line. I wrote some code, but it's incomplete and I don't know how to take it further.

file = open(filename, 'r')
end_line = '### Leave a comment!'
star_line = 'Kill the master'
for line in file:
    if star_line in line:
        ??

startmarker = "ohai"
endmarker = "meheer?"
marking = False
result = []

with open("somefile") as f:
    for line in f:
        # Switch collecting on at the start marker and off at the end marker.
        if line.startswith(startmarker):
            marking = True
        elif line.startswith(endmarker):
            marking = False

        if marking:
            result.append(line)

# result[0] is the start-marker line itself, so skip it when printing.
if len(result) > 1:
    print("".join(result[1:]))

Explanation: The with block is a nice way to use files -- it makes sure you don't forget to close() it later. The for walks each line and:

starts outputting when it sees a line that starts with 'ohai' (including that line)
stops outputting when it sees a line that starts with 'meheer?' (without outputting that line).

After the loop, result contains the part of the file that is needed, plus that initial marker. Rather than making the loop more complicated to ignore the marker, I just throw it out using a slice: result[1:] returns all elements in result starting at index 1; in other words, it excludes the first element (index 0).
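For the markers in the question, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions: the name grab_chunk and the sample text are made up for illustration, it matches with `in` (as the question does) rather than startswith, and it stops at the first end marker instead of carrying on through the file.

```python
import io

def grab_chunk(lines, start_marker, end_marker):
    """Return the lines after the one containing start_marker,
    up to (not including) the one containing end_marker."""
    marking = False
    result = []
    for line in lines:
        if not marking:
            if start_marker in line:
                marking = True   # start collecting from the next line
        elif end_marker in line:
            break                # stop at the first end marker
        else:
            result.append(line)
    return result

# Hypothetical sample standing in for the real file.
sample = io.StringIO(
    "intro\n"
    "Kill the master\n"
    "first wanted line\n"
    "second wanted line\n"
    "### Leave a comment!\n"
    "footer\n"
)
chunk = grab_chunk(sample, "Kill the master", "### Leave a comment!")
print("".join(chunk), end="")
```

Because the marker line is never appended, there is no need for the result[1:] slice here. Passing an open file object instead of the StringIO should work the same way.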

Update to handle partial-line matches:

startmarker = "ohai"
endmarker = "meheer?"
marking = False
result = []

with open("somefile") as f:
    for line in f:
        if not marking:
            # Look for the start marker anywhere in the line.
            index = line.find(startmarker)
            if index != -1:
                marking = True
                result.append(line[index:])
        else:
            # Look for the last occurrence of the end marker in the line.
            index = line.rfind(endmarker)
            if index != -1:
                marking = False
                result.append(line[:index + len(endmarker)])
            else:
                result.append(line)

print("".join(result))

Yet more explanation: marking still tells us whether we should be outputting whole lines, but I've changed the if statements for the start and end markers as follows:

if we're not (yet) marking, and we see the startmarker, then output the current line starting at the marker. The find method returns the position of the first occurrence of startmarker in this case. The line[index:] notation means 'the content of line starting at position index.'
while marking, just output the current line entirely unless it contains endmarker. Here, we use rfind to find the rightmost occurrence of endmarker, and the line[...] notation means 'the content of line up to position index (the start of the match) plus the marker itself.' Also: stop marking now :)
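To see what find, rfind and the two slices do on a single line, here is a small standalone sketch. The sample line is invented for illustration; only the marker strings come from the answer.

```python
line = "xx ohai middle meheer? and meheer? tail\n"
startmarker = "ohai"
endmarker = "meheer?"

start = line.find(startmarker)   # index of the first occurrence, -1 if absent
end = line.rfind(endmarker)      # index of the last occurrence, -1 if absent

from_start = line[start:]                 # the marker and everything after it
up_to_end = line[:end + len(endmarker)]   # everything up to and including the last marker

print(repr(from_start))
print(repr(up_to_end))
print(line.find("missing"))
```

Note that the loop above only checks for the end marker on lines after the one where the start marker was found, so a line containing both markers would need an extra check.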

If reading the whole file into memory is not a problem, I would use file.readlines() to read all the lines into a list of strings.

Then you can use list_of_lines.index(value) to find the indices of the first and last lines, and select everything between those two indices. Note that index() needs an exact match, so value has to be the whole line, trailing newline included.
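A minimal sketch of that readlines() approach, assuming the marker lines are exactly START and END (both made up here, as is the throwaway sample file):

```python
import os
import tempfile

# Build a small throwaway file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("header\nSTART\nalpha\nbeta\nEND\nfooter\n")

with open(path) as f:
    list_of_lines = f.readlines()

# index() compares whole list elements, so the newline is part of the value.
first = list_of_lines.index("START\n")
# Search for the end marker only after the start marker.
last = list_of_lines.index("END\n", first + 1)

# Everything strictly between the two marker lines.
chunk = list_of_lines[first + 1:last]
print("".join(chunk), end="")
```

index() raises ValueError when the value is not in the list, so a real script would probably want a try/except around the two lookups.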

First, a test file (assuming Bash shell):

for i in {0..100}; do  echo "line $i"; done > test_file.txt

That generates a 101-line file containing the lines line 0\nline 1\n ... line 100\n

This Python script captures the lines from mark1 (inclusive) up to mark2 (exclusive):

#!/usr/bin/env python

mark1 = "line 22"
mark2 = "line 26"
record = False
error = False
buf = []

with open("test_file.txt") as f:
    for line in f:
        # Start recording at mark1 (unless an error was already seen).
        if mark1 == line.rstrip():
            if not error and not record:
                record = True

        # mark2 before mark1 is an error; otherwise it ends the recording.
        if mark2 == line.rstrip():
            if not record:
                error = True
            else:
                record = False

        if record and not error:
            buf.append(line)

if len(buf) > 1 and not error:
    print("".join(buf))
else:
    print("There was an error in there...")

Prints:

line 22
line 23
line 24
line 25

in this case. If mark2 shows up before mark1, or mark1 never shows up, it prints an error instead.

If the size of the file between the marks is excessive, you may need some additional logic. You can also use a regex for each line instead of an exact match if that fits your use case.
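One way to combine both suggestions is a generator that matches each line against a regex and yields lines as it goes, so the chunk never has to sit in memory all at once. This is a sketch rather than a drop-in replacement: the name lines_between is hypothetical, the sample data mimics test_file.txt in memory, and unlike the script above it does not report an error when the marks are missing or out of order.

```python
import io
import re

def lines_between(lines, start_pattern, end_pattern):
    """Yield lines from the first match of start_pattern up to,
    but not including, the next match of end_pattern."""
    start_re = re.compile(start_pattern)
    end_re = re.compile(end_pattern)
    recording = False
    for line in lines:
        if not recording:
            if start_re.search(line):
                recording = True
                yield line       # include the start line itself
        elif end_re.search(line):
            return               # stop reading at the end mark
        else:
            yield line

# In-memory stand-in for test_file.txt (line 0 ... line 100).
sample = io.StringIO("".join("line %d\n" % i for i in range(101)))
chunk = list(lines_between(sample, r"^line 22$", r"^line 26$"))
print("".join(chunk), end="")
```

For a large file you would loop over lines_between(f, ...) directly instead of building a list, and the patterns can be loosened (for example r"^line 2[26]") where an exact match is too strict.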
