Hashing large files on a background thread

I have a Windows Forms application that hashes files asynchronously using a BackgroundWorker. I've implemented cancellation by checking for CancellationPending between each file being hashed. The hashing itself is essentially this:

var sha1 = new SHA1CryptoServiceProvider();
byte[] hash = sha1.ComputeHash(
    new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite));

The only problem is that for particularly large files (hundreds of megabytes, or gigabytes) a cancellation request has no effect until the hash of the current file has finished computing.

What would be the best way to modify this so that cancellation could be checked while the file is being hashed - for example every N milliseconds or every N bytes?


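You can create your own cancellable stream and pass it to the hashing function. Once cancellation is pending, Read returns 0, which ComputeHash treats as the end of the stream, so the hash it returns covers only part of the file; check CancellationPending after ComputeHash returns and discard the result if it is set. Something along these lines: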

using System;
using System.ComponentModel;
using System.IO;

// A FileStream that reports end of stream once the worker has a cancellation pending.
class CancellableFileStream : FileStream {

  readonly BackgroundWorker backgroundWorker;

  public CancellableFileStream(BackgroundWorker backgroundWorker, String path, FileMode mode, FileAccess access, FileShare share)
    : base(path, mode, access, share) {
    this.backgroundWorker = backgroundWorker;
  }

  public override Int32 Read(Byte[] array, Int32 offset, Int32 count) {
    // Returning 0 signals end of stream, so the consumer stops reading early.
    if (this.backgroundWorker.CancellationPending)
      return 0;
    return base.Read(array, offset, count);
  }

}


Use TransformBlock and TransformFinalBlock instead of ComputeHash, pumping data from your stream to the hash algorithm manually - then insert a cancellation check in your loop.


SHA1 is chunk-friendly. Read the file in chunks, call TransformBlock() on each chunk, then call TransformFinalBlock() when the file ends.
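
The chunked approach from the two answers above could look roughly like the sketch below. The names CancellableHasher, ComputeSha1, isCancelled and the 1 MiB default buffer size are assumptions made for illustration and are not from the original posts; it uses SHA1.Create() rather than SHA1CryptoServiceProvider, and returns null to signal cancellation.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class CancellableHasher
{
    // Hypothetical helper: returns the SHA-1 hash of the stream,
    // or null if isCancelled() returned true before the end was reached.
    public static byte[] ComputeSha1(Stream input, Func<bool> isCancelled, int bufferSize = 1 << 20)
    {
        using (var sha1 = SHA1.Create())
        {
            var buffer = new byte[bufferSize];
            while (true)
            {
                // Cancellation is checked once per chunk, i.e. roughly every bufferSize bytes.
                if (isCancelled())
                    return null;

                int read = input.Read(buffer, 0, buffer.Length);
                if (read == 0)
                    break; // end of stream

                // Feed the chunk into the hash; no output buffer is needed for hashing.
                sha1.TransformBlock(buffer, 0, read, null, 0);
            }

            // Finalize with an empty block, then read the completed hash.
            sha1.TransformFinalBlock(buffer, 0, 0);
            return sha1.Hash;
        }
    }
}
```

In a DoWork handler this might be called as CancellableHasher.ComputeSha1(stream, () => worker.CancellationPending), setting e.Cancel = true when the result is null. A smaller buffer makes cancellation more responsive at the cost of more Read calls.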
