Help removing items from a text file using python

2023-01-10 02:35

After implementing some of the solutions in my previous question, I've come up with the following solution:

reader = open('C://text.txt') 
writer = open('C://nona.txt', 'w')
counter = 1    
names, nums = [], []    
row = reader.read().split(' ')
x = len(row) // 2
for (a, b) in [(c, d) for c, d in zip(row[:x], row[x:]) if d!='na']:
    print(counter)
    counter +=1
    names.append(a)
    nums.append(b)

writer.write(' '.join(names))
writer.write(' ')
writer.write(' '.join(nums))

This program works quite well for a smaller sample data set. However, it freezes up when I use the full data set and causes Python to crash. Any suggestions on how I can overcome this?

What you should do is break your file up into two separate files. Your logic should do something like this:

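1. Open the data file.
2. Open the name file.
3. Read the next item.
4. Is it a name? Go to 5. Otherwise go to 6.
5. Write the name to the name file, then go to 3.
6. It is a number or "na": close the name file, open the number file, and write the item to it.
7. Read the next item.
8. Is it a number or "na"? Write it to the number file and go to 7. Otherwise (end of data), close the files.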
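The steps above could be sketched roughly as follows. This is only one way to do it, and it rests on assumptions that are not in the question: items are separated by whitespace, every value is either "na" or a non-negative integer, and no name looks like a value. The helper names (iter_tokens, is_value, split_data_file) are made up for illustration. The file is read in chunks, so it is never held in memory as a whole.

```python
def iter_tokens(stream, chunk_size=64 * 1024):
    """Yield whitespace-separated tokens, reading the stream in chunks."""
    tail = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        parts = buf.split()
        # If the chunk ended mid-token, hold that piece back for the next chunk.
        tail = "" if buf[-1].isspace() else parts.pop()
        for token in parts:
            yield token
    if tail:
        yield tail


def is_value(token):
    # Assumption: values are "na" or non-negative integers; names never are.
    return token == "na" or token.isdigit()


def split_data_file(data_path, names_path, values_path):
    """Write names and values to separate files, one item per line."""
    with open(data_path) as data, \
         open(names_path, "w") as names, \
         open(values_path, "w") as values:
        out = names
        for token in iter_tokens(data):
            # The first value marks the end of the names; switch files once.
            if out is names and is_value(token):
                out = values
            out.write(token + "\n")
```

Calling split_data_file('C://text.txt', 'names.txt', 'numbers.txt') would then produce two line-oriented files that can be read side by side.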

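Once you have your file split into two pieces (one item per line), you can iterate over them together:

names = open('names.txt')
numbers = open('numbers.txt')
output = open('output.txt', 'w')  # output file name is an assumption

# zip() is lazy in Python 3; in Python 2 use itertools.izip instead
for name, number in zip(names, numbers):
    # each line keeps its trailing newline, so strip before comparing
    name, number = name.strip(), number.strip()
    if number != 'na':
        output.write(name + " " + number + "\n")

output.close()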

Alternatively, you could write the kept names and numbers to two different files and then join them together, if that's what you need.

Your file is organized in an unfortunate manner for Pythonic processing.

Note that when you call reader.read(), you are reading the entire file into memory. Let's say this takes up X bytes.

Calling split will effectively add at least another X bytes of memory usage, as it creates a new string object for each separate item in the file, plus a list to hold them all.

Then you call row[:x] and row[x:], which adds yet more memory: the slice operator makes a shallow copy, so the strings themselves are shared, but each slice is a new list of references.

Then you call zip, and make a list comprehension, etc, etc. Strings and tuples are immutable data, which means you are always creating them from scratch.
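To make the accounting above concrete, here is a small sketch that uses the standard tracemalloc module to report how much memory is still allocated after each step. The function name trace_copies and the generated sample data are invented for illustration; the exact numbers depend on the interpreter and platform, so treat them as indicative only.

```python
import tracemalloc


def trace_copies(n_pairs=50000):
    """Return bytes still allocated after read, split, and slicing."""
    tracemalloc.start()

    # Stand-in for reader.read(): n_pairs names followed by n_pairs values.
    names = ["name%d" % i for i in range(n_pairs)]
    values = ["na" if i % 2 else "1" for i in range(n_pairs)]
    text = " ".join(names + values)
    del names, values
    after_read = tracemalloc.get_traced_memory()[0]

    # split() builds a list plus one new string object per item.
    row = text.split(" ")
    after_split = tracemalloc.get_traced_memory()[0]

    # Each slice is a new list of references to the same strings.
    x = len(row) // 2
    first, second = row[:x], row[x:]
    after_slices = tracemalloc.get_traced_memory()[0]

    tracemalloc.stop()
    return after_read, after_split, after_slices


if __name__ == "__main__":
    for label, size in zip(("read", "split", "slices"), trace_copies()):
        print(label, size)
```

Each figure should be larger than the one before it, which is the pattern described above: every high-level step keeps the earlier data alive and allocates more on top of it.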

I would approach this problem at a lower level. Open one file descriptor and point it to the beginning of the file. Open another and have it seek to the beginning of the (na/0/1/2) values (you will know where this is by counting the spaces). Now, read one name and one value at a time, and if the value is not "na" you can write the name to an output file. If you need to write the values to the output file also, hold them in memory and write them all at once when you are done.

Unfortunately this will be more difficult to code than just using the high-level functions that Python provides (you will need to write code that operates at the character level), but as you have seen there is a price to pay for those high-level functions.
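A minimal sketch of the two-descriptor idea might look like the following. It assumes the items are separated by exactly one space (as in the question's split(' ')), that the file has an equal number of names and values, and that the kept values fit in memory. All function names are hypothetical. The file is opened in binary mode so that seeking to an arbitrary byte offset is well defined.

```python
def count_spaces(path, chunk_size=64 * 1024):
    """Count the spaces in the file without loading it all at once."""
    total = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            total += chunk.count(b" ")
    return total


def offset_after_nth_space(path, n, chunk_size=64 * 1024):
    """Return the byte offset just past the n-th space (n >= 1)."""
    seen = 0   # spaces seen in earlier chunks
    base = 0   # byte offset of the current chunk
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            in_chunk = chunk.count(b" ")
            if seen + in_chunk >= n:
                pos = -1
                for _ in range(n - seen):
                    pos = chunk.index(b" ", pos + 1)
                return base + pos + 1
            seen += in_chunk
            base += len(chunk)
    raise ValueError("file has fewer than %d spaces" % n)


def read_token(f):
    """Read one item, stopping at the next space or at end of file."""
    out = bytearray()
    while True:
        ch = f.read(1)  # served from Python's internal buffer
        if not ch or ch == b" ":
            break
        out += ch
    return bytes(out).strip()


def filter_na(data_path, out_path):
    """Write the names whose value is not 'na', then the kept values."""
    half = (count_spaces(data_path) + 1) // 2
    if half == 0:
        open(out_path, "wb").close()
        return
    start = offset_after_nth_space(data_path, half)
    kept_values = []
    with open(data_path, "rb") as names, \
         open(data_path, "rb") as values, \
         open(out_path, "wb") as out:
        values.seek(start)  # second handle starts at the first value
        for _ in range(half):
            name = read_token(names)
            value = read_token(values)
            if value != b"na":
                if kept_values:
                    out.write(b" ")
                out.write(name)
                kept_values.append(value)
        if kept_values:
            out.write(b" " + b" ".join(kept_values))
```

Reading one byte at a time is slow compared with split, but memory use stays roughly constant apart from the list of kept values.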
