Large text file dictionary of random words for benchmarking purposes?

Can anyone point me to a very large dictionary of random words that could be used to test some high-performance string data structures? The ones I'm finding are in the ~2MB range, but I'd like something larger if possible. I'm guessing there has to be a large standard string dataset somewhere that could be used. Thanks!


http://norvig.com/big.txt

The above link was mentioned in Norvig's spell checker article - http://norvig.com/spell-correct.html
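
If a file like this turns out to be running text rather than one word per line, the words have to be pulled out before they can be fed to a string data structure. Here is a minimal Python sketch, assuming the file has already been saved locally. The names `extract_words` and `load_words`, and the choice to keep only lowercase a-z runs, are illustrative assumptions rather than anything taken from the linked article.

```python
import re
from pathlib import Path


def extract_words(text):
    """Return the distinct lowercase a-z words found in text, sorted."""
    # Lowercase first, then keep runs of ASCII letters only (an assumption;
    # adjust the pattern if digits or apostrophes should count as word parts).
    return sorted(set(re.findall(r"[a-z]+", text.lower())))


def load_words(path):
    """Read a local text file and return its distinct words."""
    # errors="replace" avoids failing on stray bytes in a mixed-encoding corpus.
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return extract_words(text)
```

Calling `load_words("big.txt")` on a saved copy would give a deduplicated, sorted word list. Drop the `set(...)` if the benchmark should see repeated keys in their original frequency.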


I'd recommend looking through the material available from TREC (the Text REtrieval Conference). It has some good datasets that might meet your requirements.
