How do I limit memory use of CHCSVParser?

Posted 2023-01-28 16:37

I'm trying to import a 30 MB CSV file into Core Data using the CHCSVParser from https://github.com/davedelong/CHCSVParser

It works and was quite easy to set up, but it eats up a lot of memory as it parses the file. The excessive memory usage seems to come from the end of -nextCharacter, in particular the call to -substringWithRange:, which returns a new autoreleased string for every character:

//return nil to indicate EOF or error
if ([currentChunk length] == 0) { return nil; }

NSRange charRange = [currentChunk rangeOfComposedCharacterSequenceAtIndex:chunkIndex];
NSString * nextChar = [currentChunk substringWithRange:charRange];
chunkIndex = charRange.location + charRange.length;
return nextChar;

I was able to add an autorelease pool to that function and -drain it every 1,000,000 characters, but then the throughput goes way down.

Does anyone have any other ideas? Dave DeLong perhaps? :-)

OK, so I checked things out and you're right, there is pretty blatant memory buildup.

I tried putting in a pool every time it began a new CSV line and then draining it when the line was done, but that proved to be ineffective with some other memory management situations.

What I ended up doing was putting a pool in the -runParseLoop method. The pool is alloc'd right before the while loop and drained right after. There's an unsigned short counter that gets incremented in the loop, and within the loop, I -drain and re-alloc the pool if the counter ever hits 0.

Essentially:

NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
unsigned short counter = 0;
while (error == nil && 
       (currentCharacter = [self nextCharacter]) && 
       currentCharacter != nil) {
    //process the current character
    counter++;
    if (counter == 0) { //this happens every 65,536 (2**16) iterations when the unsigned short overflows
        //retain the characters that need to out-live this pool
        [pool drain];
        pool = [[NSAutoreleasePool alloc] init];
        //autorelease the characters
    }
}

[pool drain];

That's a fun exploitation of overflow, eh? :)
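
To make the wraparound concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). The helper name count_drains is hypothetical and not part of CHCSVParser. It only counts how often the counter == 0 branch would fire for a given number of loop iterations, and it assumes unsigned short is 16 bits wide, as it is on Apple platforms.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not part of CHCSVParser): returns how many times
   the "counter == 0" branch would fire over `iterations` loop passes. */
unsigned long count_drains(unsigned long iterations) {
    /* Assumption: unsigned short is 16 bits wide. */
    assert(USHRT_MAX == 65535);

    unsigned short counter = 0;
    unsigned long drains = 0;

    for (unsigned long i = 0; i < iterations; i++) {
        counter++;          /* wraps from 65535 back to 0; well defined for unsigned types */
        if (counter == 0) {
            drains++;       /* the parse loop would drain and recreate the pool here */
        }
    }
    return drains;
}
```

With this sketch, count_drains(65535) returns 0 and count_drains(65536) returns 1, so the pool is recycled once per 65,536 characters. For a file of roughly 190 MB that would be on the order of a few thousand drains in total, assuming about one character per byte.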

I tested this against a 190MB CSV file, and memory usage stayed at reasonable levels (a couple of megabytes of active memory).

These changes have been pushed to the master branch on the GitHub page. Try them, and let me know how they work for you. If you're still having memory or performance issues, come back and we can try something else.
