开发者

In VB.NET, what data structure to use so that, given two characters, return an integer for optimum performance?

I am finishing up a program that does a large number of calculations and am trying to optimize the innermost loop.

The calculation I am currently looking at iterates over a large number of pairs of words and makes a table of the counts of corresponding pairs of characters. For example, one pair of words might be:

voice
louse

and the character pairs would then be (v,l), (o,o), (i,u), (c,s), and (e,e), and these pairs would all then have a count of 1. If the combination (v,l) is ever encountered again in another word, it would increment that count to two.

What data structure should I use for highest performance? Given the two characters, I need to retrieve the count for that pair. Currently I am using a nested hash table whose declaration looks like:

Dim data As New Dictionary(of String, Dictionary(of String, Integer))

Using this data structure, the program must hash two strings for every integer it retrieves. For every ch开发者_高级运维aracter pair, it must first check to see if the pair is in the hash table, and if not add it, requiring two more hashes. I have also considered a one level hash table with the key being the two characters concatenated together, so key = "vl" and value = 1, but I have read that string concatenation is relatively slow in VB.

So then, my questions are:

How speedy are the Dictionaries in VB? Would four hashes be quicker than one hash and a string concatenation (two level vs one level hash table)?

Can you think of a better structure to store this kind of data that allows fast additions and retrieval?


One option is to use a Dictionary(Of Integer, Integer). You can convert from any .NET Unicode character to an unsigned 16 bit integer, as they're UTF-16 code units.

You can then combine two unsigned 16 bit integers into a 32 bit integer very easily. Alternatively, you can just convert each code unit into a 32 bit unsigned integer to start with, and shift and combine in the same way :)

In C# I'd just use:

int combination = (((int) char1) << 16) | ((int) char2);

EDIT: According to jeroenh's comment, the VB equivalent is:

Dim combination As Integer = (AscW(char1) << 16) Or AscW(char2)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜