C# fast CRC32 calculation
I've profiled my application with ANTS and found that more than 10% of the time is spent in CRC32 calculations. (The CRC32 calculation is done in plain C#.)
I did some googling and learned about the following intrinsics in Visual Studio 2008:
_mm_crc32_u8
_mm_crc32_u16
_mm_crc32_u32
_mm_crc32_u64
( http://msdn.microsoft.com/en-us/library/bb514036.aspx )
Can anyone tell me or show me how to use these to replace my homebrew CRC32?
CRC32 calculations have been getting faster over the years, partly because of implementation optimizations and partly because new processor instructions have become available. Hence this new answer to an almost decade-old question!
Stephan Brumme's CRC32 page has an overview of optimizations with the last one dated 2016. FastCRC by Yuri Babich is a 2019 C# implementation of the fast C++ CRC32 algorithm "Slicing-by-16" by Stephan Brumme & Bulat Ziganshin. He claims his version is just a little bit slower (about 10%) than the native CLI C++ fast CRC32 implementation. This algorithm is the older CRC-32-IEEE.
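For reference, here is a minimal sketch of the classic one-table, byte-at-a-time CRC-32-IEEE that the slicing variants build on. This is not code from FastCRC or from Stephan Brumme's page; the class name Crc32Ieee is made up for illustration. Slicing-by-N keeps the same idea but uses N tables so it can consume N bytes per step.

```csharp
using System;
using System.Text;

public static class Crc32Ieee
{
    // Reflected form of the CRC-32-IEEE polynomial 0x04C11DB7.
    private const uint Polynomial = 0xEDB88320u;

    // 256-entry lookup table, built once: entry i is the CRC of the single byte i.
    private static readonly uint[] Table = BuildTable();

    private static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint entry = i;
            for (int bit = 0; bit < 8; bit++)
                entry = (entry & 1) != 0 ? (entry >> 1) ^ Polynomial : entry >> 1;
            table[i] = entry;
        }
        return table;
    }

    public static uint Compute(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;                 // standard initial value
        foreach (byte b in data)
            crc = Table[(crc ^ b) & 0xFF] ^ (crc >> 8);
        return crc ^ 0xFFFFFFFFu;               // standard final XOR
    }
}
```

A quick sanity check is the standard check value: the ASCII bytes of "123456789" should give 0xCBF43926.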
If you have the ability to choose another variant, go for CRC-32C (Castagnoli). This is available in the Crc32C.NET package.
The polynomial in CRC-32C was shown to have better error detection properties, which is the reason for its adoption in newer standards (iSCSI, SCTP, ext4). Aside from higher reliability, CRC-32C now has the advantage of dedicated instruction on newer Intel processors. That's why it is being chosen for high-performance applications, for example Snappy compression algorithm.
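Note that the _mm_crc32_* intrinsics from the question wrap exactly this dedicated instruction, so they compute CRC-32C and not the classic CRC-32-IEEE. Assuming you can target .NET Core 3.0 or later, the same instruction is reachable from plain C# through System.Runtime.Intrinsics.X86.Sse42, without a C++/CLI wrapper. A minimal sketch follows; the class name Crc32C and the bitwise fallback are my own illustration, not code from Crc32C.NET.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.Intrinsics.X86;
using System.Text;

public static class Crc32C
{
    // Reflected form of the Castagnoli polynomial 0x1EDC6F41.
    private const uint Polynomial = 0x82F63B78u;

    public static uint Compute(ReadOnlySpan<byte> data)
    {
        uint crc = 0xFFFFFFFFu;
        int i = 0;

        if (Sse42.IsSupported)
        {
            if (Sse42.X64.IsSupported)
            {
                // Counterpart of _mm_crc32_u64: eight bytes per instruction.
                ulong crc64 = crc;
                for (; i + 8 <= data.Length; i += 8)
                    crc64 = Sse42.X64.Crc32(
                        crc64, BinaryPrimitives.ReadUInt64LittleEndian(data.Slice(i, 8)));
                crc = (uint)crc64;
            }
            // Counterpart of _mm_crc32_u8 for the remaining bytes.
            for (; i < data.Length; i++)
                crc = Sse42.Crc32(crc, data[i]);
        }
        else
        {
            // Plain bitwise fallback for CPUs or runtimes without SSE4.2.
            for (; i < data.Length; i++)
            {
                crc ^= data[i];
                for (int bit = 0; bit < 8; bit++)
                    crc = (crc & 1) != 0 ? (crc >> 1) ^ Polynomial : crc >> 1;
            }
        }
        return crc ^ 0xFFFFFFFFu;
    }
}
```

The standard CRC-32C check value for the ASCII bytes of "123456789" is 0xE3069283, which both paths should reproduce. If your data format requires CRC-32-IEEE, this instruction will not give you compatible values.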
Crc32.NET is a .NET safe implementation of the above Crc32C.NET by Robert Važan, but for the Crc32 algorithm.
This library contains optimizations for managed code, so it really is faster than other Crc32 implementations. If you need exactly Crc32, this library is the best choice. This implementation was found to be the fastest of several variants. It also works well on both x64 and x86, so there seems to be no point in maintaining two different implementations.
I have no idea which of the two .NET implementations above is the fastest for the classic CRC-32-IEEE algorithm. The performance comparison table does not reference the first implementation.
The answer from Anonymous Coward points to crcutil, a high-performance CRC reference implementation of a novel Multiword CRC algorithm invented by Andrew Kadatch and Bob Jenkins in early 2007. The new algorithm is heavily tuned towards modern Intel and AMD processors and is substantially faster than almost all other software CRC algorithms. Their 2010 paper "Everything we know about CRC but afraid to forget" is listed in the downloads. This paper shows some tricks that can be used to avoid reprocessing certain data ranges:
- Incremental CRC computation
- Changing initial CRC value
- Concatenation of CRCs
- In-place modification of CRC-ed message
- Storing CRC value after the message
So try to be smart about what needs calculating once the amount of data becomes large enough or when the environment is limited.
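As an illustration of the first item in that list, as I read it: the CRC state can be carried from one chunk to the next, so data that arrives in pieces (or data you append to) never has to be re-read from the start. The sketch below is a plain bitwise CRC-32-IEEE, not crcutil's algorithm, and the names IncrementalCrc32, Seed, Update and Finish are hypothetical.

```csharp
using System;
using System.Text;

public static class IncrementalCrc32
{
    // Reflected CRC-32-IEEE polynomial.
    private const uint Polynomial = 0xEDB88320u;

    // Initial running state before any data has been fed in.
    public const uint Seed = 0xFFFFFFFFu;

    // Feeds one chunk into a running (not yet finalized) CRC state.
    public static uint Update(uint state, ReadOnlySpan<byte> chunk)
    {
        foreach (byte b in chunk)
        {
            state ^= b;
            for (int bit = 0; bit < 8; bit++)
                state = (state & 1) != 0 ? (state >> 1) ^ Polynomial : state >> 1;
        }
        return state;
    }

    // Applies the final XOR to turn the running state into the CRC value.
    public static uint Finish(uint state) => state ^ 0xFFFFFFFFu;

    // Convenience helper: CRC of a sequence of chunks, fed in order.
    public static uint Compute(params byte[][] chunks)
    {
        uint state = Seed;
        foreach (byte[] chunk in chunks)
            state = Update(state, chunk);
        return Finish(state);
    }
}
```

Feeding "12345" and then "6789" through Update gives the same result as a single pass over "123456789". To extend a stored CRC later, keep the running state (or undo the final XOR on the stored value) and continue calling Update with the new bytes. The other tricks in the list, such as concatenating two independently computed CRCs, need extra polynomial arithmetic that is not shown here.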
A C# wrapper around crcutil might currently be the best solution for decent-sized data.
http://code.google.com/p/crcutil/
Crcutil library provides efficient implementation of CRC algorithms. It includes reference implementation of a novel Multiword CRC algorithm invented by Andrew Kadatch and Bob Jenkins in early 2007. The new algorithm is heavily tuned towards modern Intel and AMD processors and is substantially faster than almost all other software CRC algorithms.
Hardware-assisted CRC32C: 0.13 (Nehalem) CPU cycles per byte. 64-bit and smaller CRCs: 1.0 (Nehalem) - 1.2 (Core) CPU cycles per byte. 128-bit CRCs: 1.7 CPU cycles per byte.
Haswell's AVX2 may bring some instructions that further improve performance. If so, it would be cool if they were included in this library.
I'm not sure that you have to use those intrinsics to replace your homebrew code. I found a good implementation for calculating CRC-32 in C# here.
You can use P/Invoke (and pure C#), or create a C++/CLI project and write a wrapper around these functions.
Did you see the example on MSDN? To compute the CRC of a string you just need to loop through it.
Well, they're intrinsic functions. That means you have only one option: create a C++/CLI wrapper.
 