
Fastest way to split a text file at spaces in JavaScript

I'm looking at doing some text processing in the browser and am trying to get a rough idea of whether I will be CPU bound or I/O bound. To test the CPU side of the equation, I am measuring how quickly I can split a piece of text (~8.9MB; it's Project Gutenberg's Sherlock Holmes repeated a number of times over) in JavaScript once it is in memory. At the moment I'm simply doing:

pieces = theText.split(" ");

and executing it 100 times and taking the average. On a 2011 MacBook Pro i5, the average split takes 92.81ms in Firefox and 237.27ms in Chrome. So 1000/92.81ms * 8.9MB = 95.8MB/s on the CPU (in Firefox), which is probably a little faster than the hard disk I/O, but not by much.
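For reference, the measurement described above can be sketched roughly as follows. This is a minimal sketch, not the exact harness used: `averageMs`, `throughputMBps`, and `sampleText` are hypothetical names, and the sample text is a small stand-in rather than the actual ~8.9MB Gutenberg file.

```javascript
// Hypothetical helper: average wall-clock time of fn over `runs` calls, in ms.
function averageMs(fn, runs) {
  // performance.now() exists in browsers and recent Node; fall back to Date.now().
  const now = typeof performance !== "undefined"
    ? () => performance.now()
    : () => Date.now();
  let total = 0;
  for (let i = 0; i < runs; i++) {
    const start = now();
    fn();
    total += now() - start;
  }
  return total / runs;
}

// Same arithmetic as in the question: (1000 / ms per run) * size in MB.
function throughputMBps(sizeMB, msPerRun) {
  return (1000 / msPerRun) * sizeMB;
}

// Stand-in text (assumption): a short sentence repeated, not the real file.
const sampleText = "the quick brown fox jumps over the lazy dog ".repeat(20000);

let pieces;
const avg = averageMs(() => { pieces = sampleText.split(" "); }, 100);
console.log(avg.toFixed(2) + " ms per split, " + pieces.length + " pieces");
```

Keeping the result in `pieces` outside the timed function makes it less likely that an engine optimizes the split away, though a micro-benchmark like this is still only a rough indicator.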

So my question is really three parts:

  • Are there JavaScript alternatives to split() that tend to be faster for simple text processing (e.g. splitting at spaces, newlines, etc.)?
  • Are the lackluster CPU results I'm seeing here likely due to fundamental string matching/algorithmic constraints, or is the JavaScript execution just slow?
  • If you think JavaScript is likely the limiting factor, can you demonstrate substantially better performance on a comparable machine/comparable text in any other programming language?
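To make the first question concrete, one candidate alternative is a hand-written loop over indexOf(). The sketch below uses a hypothetical function name, `splitAtSpaces`, and makes no claim about being faster than the built-in split(); whether it is would have to be measured in each browser.

```javascript
// Hypothetical alternative to text.split(" "): scan with indexOf and slice.
// Produces the same array as split(" "), including empty strings between
// consecutive spaces and after a trailing space.
function splitAtSpaces(text) {
  const pieces = [];
  let start = 0;
  let idx = text.indexOf(" ", start);
  while (idx !== -1) {
    pieces.push(text.slice(start, idx));
    start = idx + 1;
    idx = text.indexOf(" ", start);
  }
  // Remainder after the last space (or the whole string if there were none).
  pieces.push(text.slice(start));
  return pieces;
}

console.log(splitAtSpaces("to be or not to be").length);
```

A variant that only records the start and end offsets of each piece, without slicing, would avoid allocating one string per word, which may matter more than the scanning itself.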

Edit: I also suspect this could be sped up with WebWorkers, though for now I am primarily interested in single-threaded approaches.


As far as I know, calling split() and then iterating over the result with a for loop is the fastest way to do simple text processing in JavaScript. It is faster than using a regex; see this jsPerf comparison: http://jsperf.com/query-str-parsing-regex-vs-split/2
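A minimal sketch of the two approaches being compared, assuming a plain space separator (the variable names are illustrative only, and the linked jsPerf test itself parses query strings rather than prose):

```javascript
const text = "to be or not to be";

// String separator: the approach recommended above.
const viaString = text.split(" ");

// Regex separator: same result here, but goes through the regex engine.
const viaRegex = text.split(/ /);

// Plain for loop over the pieces, e.g. to total the word lengths.
let totalLength = 0;
for (let i = 0; i < viaString.length; i++) {
  totalLength += viaString[i].length;
}

console.log(viaString.length, viaRegex.length, totalLength);
```

Both calls return the same array for this input, so any timing difference comes from how the separator is matched rather than from the output.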
