Read a very big file with Python

What is the best solution to process each line of a text file whose size is about 500 MB?

The approach I had in mind:

def files(mon_fichier):
    while True:
        data = mon_fichier.read(1024)  # read the next block of at most 1024 bytes
        if not data:                   # an empty string signals end of file
            break
        yield data

fichier = open('tonfichier.txt', 'r')
for bloc in files(fichier):
    print bloc

Thank you in advance


with open('myfile.txt') as inf:
    for line in inf:
        # do something
        pass
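
Iterating over the file object like this reads lazily, one line at a time (Python buffers the underlying reads), so memory use stays roughly constant no matter how large the file is.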


Just using the standard file operations should work, as long as you keep away from readlines() (which loads the whole file into memory at once) and instead use readline() or iterate over the file object directly.
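
A minimal sketch of such a readline() loop, reusing the placeholder filename from the answer above:

f = open('myfile.txt', 'r')
try:
    while True:
        line = f.readline()   # reads one line; returns '' at end of file
        if not line:
            break
        # process the line here
finally:
    f.close()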


The answer depends on what you want to do with the data... I recommend reading block by block and processing each block right after reading it, like:

fs = open(source, 'r')
while True:
    txt = fs.readline(1000)   # read at most 1000 bytes, but stop at a newline
    if txt == "":             # an empty string signals end of file
        break
    # <your treatment here>
fs.close()


As far as I understand the process, reading a file goes through a buffer.

Under these conditions, mon_fichier.read(1024) doesn't fetch 1024 bytes directly from the file but from the buffer; only once the buffer is exhausted is it refilled by a real read of, say, 4096, 8192, or 16384 bytes. I don't know the exact size (I think it's a power of 2, but I'm not even sure of that).
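
As a quick check, here is a sketch that prints the block size Python's buffered I/O actually uses (the io module exposes this constant since Python 2.6):

import io

# The block size used for real reads from disk when I/O is buffered;
# on most builds this is 8192 bytes, i.e. a power of 2 as guessed above.
print(io.DEFAULT_BUFFER_SIZE)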

So, if you really want to process blocks of bytes, I think philnext's code is preferable, but readline(1000) must be replaced with read(1000) if you want to fetch exactly 1000 bytes: readline(1000) returns at most one line, even if that line is only 4 characters long.
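
A small sketch of the difference, assuming a hypothetical demo.txt containing the two lines 'abcd' and 'efgh':

f = open('demo.txt', 'r')
print(f.readline(1000))   # -> 'abcd\n' : stops at the first newline, well short of 1000 bytes
f.close()

f = open('demo.txt', 'r')
print(f.read(1000))       # -> 'abcd\nefgh\n' : reads up to 1000 bytes, crossing line boundaries
f.close()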

Processing a file by blocks may be what you really want to do, but it seems uncommon to me. It is more common to process a file line by line, and in that case Hugh Bothwell's code is the right way to do it.
