Code throws std::bad_alloc: not enough memory, or can it be a bug?

I am parsing using a pretty large grammar (1.1 GB, it's data-oriented parsing). The parser I use (bitpar) is said to be optimized for highly ambiguous grammars. I'm getting this error:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  St9bad_alloc
dotest.sh: line 11: 16686 Aborted                 bitpar -p -b 1 -s top -u unknownwordsm -w pos.dfsa /tmp/gsyntax.pcfg /tmp/gsyntax.lex arbobanko.test arbobanko.results

Is there hope? Does it mean that it has run out of memory? It uses about 15 GB before it crashes. The machine I'm using has 32 GB of RAM, plus swap as well. It crashes before outputting a single parse tree; I think it crashes after reading the grammar, during an attempt to construct a chart parse for the first sentence.

The parser is an efficient CYK chart parser using bit vector representations; I presume it is already pretty memory efficient. If it really requires too much memory I could sample from the grammar rules, but this will decrease parse accuracy of course.

I think the problem is probably that I have a very large number of non-terminals. I should probably look for a different parser (any suggestions?).

UPDATE: for posterity's sake, I found the problem a long time ago. The grammar was way too big due to a bug, so the parser couldn't handle it with the available memory. With the correct grammar (which is an order of magnitude smaller) it works fine.


It is possible that memory has become fragmented. That means your program can fail to allocate 1 KB even though 17 GB of memory is free, if those 17 GB are fragmented into roughly 34 million free chunks of 512 bytes each.
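To make the fragmentation argument concrete, here is a toy model only, not how any real allocator works: it treats the heap as a list of non-adjacent free chunks and checks whether a request fits into any single one. The names ToyHeap and make_fragmented_heap are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Toy model (hypothetical): each entry is the size in bytes of one free
// chunk. Chunks are assumed to be non-adjacent, so they cannot be merged.
struct ToyHeap {
    std::vector<std::size_t> free_chunks;

    // Sum of all free bytes, regardless of how they are split up.
    std::size_t total_free() const {
        return std::accumulate(free_chunks.begin(), free_chunks.end(),
                               std::size_t{0});
    }

    // A request succeeds only if one single chunk is large enough.
    bool can_allocate(std::size_t n) const {
        return std::any_of(free_chunks.begin(), free_chunks.end(),
                           [n](std::size_t c) { return c >= n; });
    }
};

// Builds a heap whose free space consists of `chunk_count` equal chunks.
ToyHeap make_fragmented_heap(std::size_t chunk_count, std::size_t chunk_size) {
    assert(chunk_size > 0);
    return ToyHeap{std::vector<std::size_t>(chunk_count, chunk_size)};
}
```

With 1000 chunks of 512 bytes, total_free() reports 512000 bytes, yet can_allocate(1024) is false because no single chunk is big enough.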

There's of course also the possibility that your program miscalculates a memory allocation. A common bug is trying to allocate -1 bytes of memory. Because allocation sizes are unsigned, that is interpreted as size_t(-1), which is far more than 32 GB. But nothing you've described points in that direction.
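A minimal sketch of that failure mode, assuming a hypothetical helper find_length that signals an error by returning -1 (nothing here is taken from bitpar). Converting -1 to size_t yields the largest representable size, and asking the global operator new for that many bytes throws std::bad_alloc.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: returns a length, or -1 to signal an error.
std::ptrdiff_t find_length(bool ok) { return ok ? 16 : -1; }

// The size a careless caller would request: the signed value is converted
// to size_t without checking for the -1 error code first.
std::size_t requested_size(std::ptrdiff_t len) {
    return static_cast<std::size_t>(len);
}

// Tries to allocate `n` bytes with the global operator new.
// Returns true on success and false if std::bad_alloc was thrown.
bool try_allocate(std::size_t n) {
    try {
        void* p = ::operator new(n);
        assert(p != nullptr);  // the throwing operator new never returns null
        ::operator delete(p);
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}
```

Calling try_allocate(requested_size(find_length(false))) requests size_t(-1) bytes and fails, while try_allocate(16) succeeds. Checking the return value for -1 before the conversion is the usual fix.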

To solve this problem, you will need someone who does speak C++. If it's indeed memory fragmentation, a good C++ programmer can tailor the memory allocation strategy to match your specific needs. Some strategies include keeping same-sized objects together, and replacing string by shims.
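As an illustration of "keeping same-sized objects together", here is a rough sketch of a fixed-size pool. FixedPool is a made-up name, and this is one of many possible designs, not something bitpar is known to use. All slots live in one contiguous block and are recycled through a free list, so freeing and reallocating objects of this type cannot fragment the general heap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size pool: hands out raw storage for objects of type T
// from a single contiguous block allocated up front.
template <typename T>
class FixedPool {
    // One slot is raw, suitably aligned storage for a single T.
    struct Slot { alignas(T) unsigned char bytes[sizeof(T)]; };

public:
    explicit FixedPool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        // Push in reverse so that slots are handed out in address order.
        for (std::size_t i = capacity; i > 0; --i) free_.push_back(&slots_[i - 1]);
    }

    // Copying would leave the free list pointing into the other pool.
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Returns storage for one T, or nullptr if the pool is exhausted.
    void* allocate() {
        if (free_.empty()) return nullptr;
        Slot* s = free_.back();
        free_.pop_back();
        return s;
    }

    // Returns a slot previously obtained from allocate() to the pool.
    void deallocate(void* p) {
        assert(p != nullptr);
        free_.push_back(static_cast<Slot*>(p));
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<Slot> slots_;   // contiguous backing storage
    std::vector<Slot*> free_;   // slots currently not in use
};
```

A caller would construct objects in the returned storage with placement new and call the destructor explicitly before handing the slot back with deallocate().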


If your application is built for a 32-bit memory model, each process gets 4 GB of virtual address space, of which typically only 2 GB is available to user space (the exact split depends on the operating system).
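One quick way to check which memory model a binary was built for is to look at the pointer width at compile time. This is a small sketch; pointer_bits and is_32bit_build are names chosen for this example.

```cpp
#include <climits>
#include <cstddef>

// Width of a data pointer in bits for the current build.
constexpr std::size_t pointer_bits() { return sizeof(void*) * CHAR_BIT; }

// True if this build uses 32-bit pointers, which caps the virtual
// address space of the process at 4 GB.
constexpr bool is_32bit_build() { return pointer_bits() == 32; }
```

A process that reaches about 15 GB before failing, as described in the question, cannot be a 32-bit build, so is_32bit_build() would be expected to return false there.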

I suspect your parser might be trying to allocate more than the available virtual memory. I am not sure whether the parser provides a mechanism for custom memory allocation. If so, you can try using memory-mapped files for allocation and bring data into memory only when it is needed.
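A minimal sketch of the memory-mapped file idea, assuming a POSIX system (open, fstat, mmap). MappedFile, map_file_readonly and unmap_file are names invented for this example, and whether bitpar could be made to use something like this is unknown. The kernel loads pages from the file only when they are first touched. The mapping still occupies virtual address space, so on a 32-bit build a large file would have to be mapped in smaller windows.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical handle for a read-only mapping; data is nullptr on failure.
struct MappedFile {
    void* data = nullptr;
    std::size_t size = 0;
};

// Maps the whole file at `path` read-only. Returns an empty MappedFile if
// the file cannot be opened, is empty, or cannot be mapped.
MappedFile map_file_readonly(const char* path) {
    MappedFile result;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return result;

    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        std::size_t len = static_cast<std::size_t>(st.st_size);
        void* p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            result.data = p;
            result.size = len;
        }
    }
    close(fd);  // the mapping stays valid after the descriptor is closed
    return result;
}

// Releases a mapping created by map_file_readonly and resets the handle.
void unmap_file(MappedFile& m) {
    if (m.data == nullptr) return;
    int rc = munmap(m.data, m.size);
    assert(rc == 0);
    (void)rc;  // silence unused-variable warnings when NDEBUG is defined
    m.data = nullptr;
    m.size = 0;
}
```

The caller reads through the data pointer as if the file were an in-memory array and calls unmap_file when done.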
