Use of 64-bit types?

I am writing some hash functions for a compiler, and I use the __int64 datatype frequently. The compiler is intended to be supported (and so far is) on several operating systems. I know that most major C++ compilers for my target systems accept __int64, so that's not the problem. I am using hash functions to make large character strings smaller and quicker to compare, and they work wonders on 64-bit-capable OSes; but would the performance decrease on 32-bit OSes be large enough to cancel out the benefits? I could use 32-bit integers, but that would greatly lessen the effectiveness of the hash functions.

Edit: It is custom code and very simple. The first hash function generates a unique 64-bit int from 12 alphanumeric (including underscore) characters. Then a class handles inputs over 12 characters by creating address-linked lists of 64-bit hashes, and it overloads the comparison operators. The overloaded compares are short-circuited and walk down the address-linked list. I've run tests on my machine comparing randomly generated large hashes (100 - 300 characters) against themselves (the worst-case scenario), and this proved to be faster than string compares. In order to better simulate the overhead of generating hashes, I've also run compare tests of pre-generated large hashes against themselves. This is all running with code optimization turned off. With ~1 billion hash compares vs. ~1 billion string compares, the hash compares took around 16% of the time. This was all in a 64-bit environment, though. I don't have a 32-bit machine to run tests with.
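
For readers who want to picture the first step, here is a minimal sketch of packing up to 12 characters into one 64-bit value. It is not the asker's code, which the post does not show. It assumes letters are treated case-insensitively, because 63 distinct symbols over 12 positions would need about 72 bits, whereas 38 symbols (digits, letters, underscore, plus a padding value) need just under 63. The names symbol_index and pack12 are hypothetical, and the mapping assumes an ASCII-compatible character set.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: map a character to 1..37. The value 0 is reserved
// for "no character" so that shorter strings stay distinct from longer ones.
// Assumption: upper and lower case letters are treated as the same symbol.
inline unsigned symbol_index(char c) {
    if (c >= '0' && c <= '9') return static_cast<unsigned>(1 + (c - '0'));   // 1..10
    if (c >= 'a' && c <= 'z') return static_cast<unsigned>(11 + (c - 'a'));  // 11..36
    if (c >= 'A' && c <= 'Z') return static_cast<unsigned>(11 + (c - 'A'));  // 11..36
    if (c == '_') return 37u;
    throw std::invalid_argument("unsupported character");
}

// Hypothetical helper: treat the string as a 12-digit number in base 38.
// 38^12 is below 2^63, so the result always fits in 64 bits without overflow.
inline std::uint64_t pack12(const std::string& s) {
    if (s.size() > 12) throw std::length_error("at most 12 characters");
    std::uint64_t h = 0;
    for (std::size_t i = 0; i < 12; ++i) {
        unsigned digit = i < s.size() ? symbol_index(s[i]) : 0u;  // pad with 0
        h = h * 38u + digit;
    }
    return h;
}
```

Under these assumptions two identifiers compare equal exactly when their packed values are equal (ignoring case), so a single 64-bit comparison replaces up to 12 character comparisons.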


64-bit integers aren't substantially slower on a 32-bit x86 architecture. They're not as fast as 32-bit ints, obviously, but they aren't notably slower. It's not at all reckless to use a 64-bit int for hashes on either x86 or x64. The additional overhead will likely be minimal compared to, say, a couple of unneeded dynamic allocations or a poorly chosen algorithm.


I don't think that comparing four 32-bit variables will be faster than comparing two 64-bit variables, since I would expect the compiler to generate the fastest code: if your processor doesn't support 64-bit operations, the compiler will generate code that performs the comparison in two steps, just as you would by hand.
This of course depends on your compiler.
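
To make the "two steps" concrete, here is a rough sketch of what an equality test on two 64-bit values amounts to when only 32-bit operations are available. The function name equal_via_halves is made up for illustration; in real code you would simply write a == b and let the compiler do this.

```cpp
#include <cstdint>

// Illustration only: compare two 64-bit values as two 32-bit halves,
// roughly what a compiler emits for a == b on a 32-bit target.
inline bool equal_via_halves(std::uint64_t a, std::uint64_t b) {
    const std::uint32_t a_lo = static_cast<std::uint32_t>(a);
    const std::uint32_t a_hi = static_cast<std::uint32_t>(a >> 32);
    const std::uint32_t b_lo = static_cast<std::uint32_t>(b);
    const std::uint32_t b_hi = static_cast<std::uint32_t>(b >> 32);
    // Short-circuits: the high halves are only checked if the low halves match.
    return a_lo == b_lo && a_hi == b_hi;
}
```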


Anyway, there are other tools that will make your comparisons even faster but are not available everywhere, for example vector (SIMD) operations such as those provided by the SSE extensions, which can compare 16 bytes at once (wider extensions such as AVX2 handle 32, i.e. 8*4 bytes).

If you need to optimize your code as much as possible, I'd suggest adding some preprocessor directives so that these optimizations are enabled only when the system supports them.
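
As a hedged sketch of that idea, the following compares 16 bytes with SSE2 when the compiler reports support and falls back to memcmp otherwise. The macro HAVE_SSE2_COMPARE and the function equal16 are names invented for this example. The detection relies on __SSE2__ (GCC/Clang) and _M_X64 / _M_IX86_FP (MSVC), and other toolchains may need a different check.

```cpp
#include <cstring>

// Enable the SSE2 path only when the compiler says the target supports it.
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2_COMPARE 1
#endif

// Returns true if the 16 bytes at a and b are identical.
inline bool equal16(const void* a, const void* b) {
#ifdef HAVE_SSE2_COMPARE
    // Unaligned loads, so the caller does not need 16-byte alignment.
    const __m128i va = _mm_loadu_si128(static_cast<const __m128i*>(a));
    const __m128i vb = _mm_loadu_si128(static_cast<const __m128i*>(b));
    // cmpeq sets each byte to 0xFF where equal; movemask gathers one bit per byte.
    return _mm_movemask_epi8(_mm_cmpeq_epi8(va, vb)) == 0xFFFF;
#else
    // Portable fallback for targets without SSE2.
    return std::memcmp(a, b, 16) == 0;
#endif
}
```

Note that this is a compile-time check. It says what the build targets, not what the machine running the binary supports.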


Are you sure it would greatly lessen the effectiveness of the hash function? Have you run tests? Certainly 64 bits is a better hash than 32 bits if (i) the number of items hashed is significantly more than 2^16 and (ii) computing the 64-bit hash is cheap. Which of (i) or (ii) (or both) is true in your case? If performance is important, you might want to use different hash functions depending on the underlying operating system. Otherwise, I would say: write a 32-bit version, and a 64-bit version; try them both out on a 64-bit system, and a 32-bit system; and you'll see whether it's worth busting a gut over.
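
If you want to try the "write both and measure" approach, a minimal pair to start from could look like this. It uses FNV-1a purely as a stand-in, since the asker's own hash is not shown. The function names are mine, and picking the width from sizeof(std::size_t) is just one possible way to choose per platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// 32-bit FNV-1a (stand-in hash, not the asker's function).
inline std::uint32_t fnv1a32(const std::string& s) {
    std::uint32_t h = 2166136261u;            // FNV offset basis (32-bit)
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;                       // FNV prime (32-bit)
    }
    return h;
}

// 64-bit FNV-1a.
inline std::uint64_t fnv1a64(const std::string& s) {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis (64-bit)
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ull;                  // FNV prime (64-bit)
    }
    return h;
}

// Hypothetical selector: use the 64-bit hash on 64-bit builds, else the 32-bit one.
inline std::size_t native_hash(const std::string& s) {
    return sizeof(std::size_t) >= 8 ? static_cast<std::size_t>(fnv1a64(s))
                                    : static_cast<std::size_t>(fnv1a32(s));
}
```

Timing fnv1a32 and fnv1a64 over the same set of identifiers on each target should show whether the wider hash costs enough on 32-bit builds to matter.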


All the hash functions that I've used return the value as an array of bytes (uchar) to avoid your problem.
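
A small sketch of that convention, assuming an 8-byte hash stored most significant byte first. HashBytes and to_bytes are illustrative names, not from any particular library.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical type: a 64-bit hash held as 8 raw bytes.
using HashBytes = std::array<unsigned char, 8>;

// Split a 64-bit value into bytes, most significant byte first,
// so the layout is the same regardless of the machine's endianness.
inline HashBytes to_bytes(std::uint64_t h) {
    HashBytes out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = static_cast<unsigned char>((h >> (56 - 8 * i)) & 0xFF);
    }
    return out;
}
```

Two HashBytes values can be compared with ==, which std::array defines element-wise, so the calling code never depends on a 64-bit integer type.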
