Really strange grep 2.5.1 bug when reading long lines through a cat pipe
Recently a peer and I discovered an interesting bug in GNU grep 2.5.1: standard input containing lines longer than 200,000,000 characters causes grep to fail, even if the pattern is not in one of the long lines. If, however, grep opens the file itself (grep pattern big_file), it works fine. The bug appears to be fixed in 2.5.3.
cat big_file | grep pattern # this dies with an exit code 0 after encountering a long line
grep pattern big_file # works fine!
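A rough way to compare the two invocations side by side is sketched below. The LINE_LEN variable, the temporary file, and the default length of 1,000,000 are assumptions for illustration; the length would have to be raised above 200,000,000 on an affected grep to approach the reported behavior. On a grep without the bug, both counts should agree.

```shell
#!/bin/sh
# Hypothetical reproduction sketch; LINE_LEN and the temp file are assumptions.
LINE_LEN="${LINE_LEN:-1000000}"   # raise above 200000000 to approach the reported threshold
big_file="$(mktemp)"

# One very long line without the pattern, followed by a short line that contains it.
head -c "$LINE_LEN" /dev/zero | tr '\0' 'x' > "$big_file"
printf '\npattern here\n' >> "$big_file"

# Count matching lines when grep reads a pipe and when it opens the file itself.
piped_count="$(cat "$big_file" | grep -c pattern)"
direct_count="$(grep -c pattern "$big_file")"

echo "piped:  $piped_count"
echo "direct: $direct_count"
rm -f "$big_file"
```

If the piped count comes out lower than the direct count, the installed grep is dropping input after the long line.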
Does anyone know why this happens? Is the line limitation the true cause?
There is or was a memory exhaustion problem I ran into when reading very long lines, but on most systems allocating 200MB is unlikely to fail.
http://savannah.gnu.org/bugs/?9886
I believe it uses memory-mapped files when reading directly, and obviously that's not an option when reading from a pipe, so perhaps that's the difference.
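To make the pipe versus file distinction concrete, here is a small sketch that reports what kind of object a command sees on standard input. It assumes the system provides /dev/stdin, and the stdin_kind helper is a made-up name. It says nothing about which code path grep 2.5.1 actually takes; it only shows that a program can tell the two cases apart.

```shell
#!/bin/sh
# Sketch: classify standard input (assumes /dev/stdin is available).
stdin_kind() {
    if [ -p /dev/stdin ]; then
        echo "pipe"
    elif [ -f /dev/stdin ]; then
        echo "regular file"
    else
        echo "other"
    fi
}

sample="$(mktemp)"
printf 'some text\n' > "$sample"

# The same data, delivered through a pipe and through a redirect.
from_pipe="$(cat "$sample" | stdin_kind)"
from_redirect="$(stdin_kind < "$sample")"

echo "cat file | cmd  -> $from_pipe"
echo "cmd < file      -> $from_redirect"
rm -f "$sample"
```

A redirect (grep pattern < big_file) hands grep a regular file on stdin, so it may be worth trying alongside the two forms from the question.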
Also, how complex is your pattern? There's a known limitation of grep where the {n,m} repetition operator with large counts can cause huge amounts of memory to be allocated.
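For reference, this is what an {n,m} interval looks like in an extended regular expression. The counts here are deliberately tiny and the input is made up; it is the same construct with very large counts that is reported to be memory hungry.

```shell
#!/bin/sh
# Interval expression with small, harmless counts: lines of exactly 2 or 3 "a" characters.
small_count="$(printf 'aaa\nb\n' | grep -c -E '^a{2,3}$')"
echo "matching lines: $small_count"
```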
I was looking through the commits, but I couldn't find anything. You can have a go at it.
http://git.savannah.gnu.org/cgit/grep.git/log/?ofs=200
That links to the page with 2.5.1. Go up and back from there to try to find it.