
What is the fastest method to calculate a substring?

I have a huge "binary" string, like: 1110 0010 1000 1111 0000 1100 1010 0111....

Its length is a multiple of 4 and may reach 500,000.

I have also a corresponding array: {14, 2, 8, 15, 0, 12, 10, 7, ...}

(every number in the array corresponds to 4 bits in the string)

Given this string, this array, and a number N, I need to calculate the substring string.substr(4*N, 4), i.e.:

for N=0 the result should be 1110

for N=1 the result should be 0010

I need to perform this task many, many times, and my question is: what would be the fastest method to calculate this substring?

One method is to calculate the substring directly: string.substr(4*N, 4). I'm afraid this is not efficient for such huge strings.

Another method is to use array[N].toString(2) and then pad the result with leading zeros if needed. I'm not sure how fast this is.

Maybe you have other ideas?


Where does the string come from? Why not represent it as hex rather than binary, so that each four-binary-digit section is stored as a single character? (You could pack it twice as densely if you wanted, or, now that I think of it, four times, since JavaScript strings are made of 16-bit code units.) Then finding a single group would be a single call to charAt(), and you'd just have to expand it to binary form via a lookup table.
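A minimal sketch of the hex idea, assuming the binary string can be converted once up front. The names NIBBLES, toHex and groupFromHex are made up for illustration and are not from the question:

```javascript
// Lookup table: index 0..15 -> 4-character binary string ("0000".."1111").
var NIBBLES = [];
for (var i = 0; i < 16; i++) {
  NIBBLES.push(("000" + i.toString(2)).slice(-4));
}

// Hypothetical helper: convert the binary string to hex once, up front.
// Assumes binary.length is a multiple of 4, as stated in the question.
function toHex(binary) {
  var hex = "";
  for (var j = 0; j < binary.length; j += 4) {
    hex += parseInt(binary.slice(j, j + 4), 2).toString(16);
  }
  return hex;
}

// Hypothetical helper: return the N-th 4-bit group as a binary string.
function groupFromHex(hex, n) {
  return NIBBLES[parseInt(hex.charAt(n), 16)];
}

var hex = toHex("11100010100011110000110010100111");
var first = groupFromHex(hex, 0);  // group for N=0
var second = groupFromHex(hex, 1); // group for N=1
```

The conversion costs one pass over the string, so it only pays off if the lookups are repeated many times, which is the situation the question describes.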

Edit: oh well, duh, you already have an array. In that case don't do the substring work at all; it's crazy. Just grab the array element and translate it through a lookup array into the 4-binary-digit string.
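Roughly, the lookup-array approach could look like this. BINARY and group are illustrative names, and the sample array is the one from the question:

```javascript
// Precomputed table: value 0..15 -> its 4-digit binary string.
var BINARY = [
  "0000", "0001", "0010", "0011",
  "0100", "0101", "0110", "0111",
  "1000", "1001", "1010", "1011",
  "1100", "1101", "1110", "1111"
];

// Hypothetical helper: two array reads, no string slicing or padding.
function group(array, n) {
  return BINARY[array[n]];
}

var array = [14, 2, 8, 15, 0, 12, 10, 7];
var g0 = group(array, 0); // group for N=0
var g1 = group(array, 1); // group for N=1
```

Each call is just two indexed reads, so the cost does not depend on the length of the original string.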


You could consider representing your huge string as a Rope data structure. A rope is basically a binary tree whose leaves are arrays of characters. A node in the tree has a left child and a right child, the left child being the first part of the string, while the right child the final part.

By using a rope, substring operations become logarithmic in complexity, rather than linear, as they are for regular strings.
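For illustration only, here is a very small rope sketch, assuming a read-only rope built once from the full string. The function names and the leafSize parameter are made up for this example, and a production rope would also handle rebalancing and edits:

```javascript
// Leaf node: holds a short piece of the text.
function leaf(text) {
  return { length: text.length, text: text, left: null, right: null };
}

// Internal node: the concatenation of two ropes.
function concat(left, right) {
  return { length: left.length + right.length, text: null, left: left, right: right };
}

// Build a balanced rope by splitting the string in half until pieces fit in a leaf.
// Assumes leafSize >= 1.
function buildRope(str, leafSize) {
  if (str.length <= leafSize) return leaf(str);
  var mid = Math.floor(str.length / 2);
  return concat(buildRope(str.slice(0, mid), leafSize),
                buildRope(str.slice(mid), leafSize));
}

// Return the characters in [start, start + len), walking down from the root.
function ropeSubstr(node, start, len) {
  if (len <= 0 || start >= node.length) return "";
  if (node.text !== null) return node.text.slice(start, start + len);
  var leftLen = node.left.length;
  if (start + len <= leftLen) return ropeSubstr(node.left, start, len);
  if (start >= leftLen) return ropeSubstr(node.right, start - leftLen, len);
  // The requested range straddles both children.
  return ropeSubstr(node.left, start, leftLen - start) +
         ropeSubstr(node.right, 0, start + len - leftLen);
}

var source = "11100010100011110000110010100111";
var rope = buildRope(source, 5);
var n1 = ropeSubstr(rope, 4 * 1, 4); // group for N=1
```

Ropes mainly pay off when the text is edited or concatenated often; for fixed 4-character reads like these, the tree walk is extra work on every lookup.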


If you want it padded, you could do this:

var elem = array[N];
// Extract bits 3..0 one at a time; the leading "" forces string concatenation.
var str = "" + ((elem>>3)&1) + ((elem>>2)&1) + ((elem>>1)&1) + (elem&1);


The array already has exactly what you need, does it not? You only need to print it in binary format. Fortunately, sprintf implementations for JavaScript are available.
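If pulling in a sprintf library is not desirable, a built-in alternative (not sprintf itself) is Number.prototype.toString(2) combined with String.prototype.padStart, assuming an ES2017 or later environment. The name toBinary4 is made up for this sketch:

```javascript
// Hypothetical helper: format a value in 0..15 as a zero-padded 4-digit binary string.
function toBinary4(value) {
  return value.toString(2).padStart(4, "0");
}

var array = [14, 2, 8, 15, 0, 12, 10, 7];
var padded = toBinary4(array[1]); // group for N=1
```

In older environments without padStart, ("000" + value.toString(2)).slice(-4) gives the same result for values in 0..15.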
