Calculating and caching folder sizes

I have an idea for a C# program that works basically like Windows Explorer. The aim is to display all files and folders and to show specific information for each of them. One of the features I'm planning is to show folder sizes, which is something Explorer cannot do.

My idea for the algorithm is to sum the sizes of all files in the given folder. However, I'm afraid of performance issues. For example, to display the sizes of all folders on C:, I have to consider every file on the whole drive. This will probably take a while, so the calculation can't be done each time the user switches to a different folder or back.

So I'd like to cache some of the sizes. However, when files change, are added or removed, the cache data becomes outdated. But I do not want to monitor all file changes while the program is not running.

Is there any way I can find out whether the cache is up to date, e.g. by retrieving some sort of checksum that doesn't require calculating all sizes again? Is there another memory- and CPU-efficient way to find out whether file sizes have changed since the last calculation? Or is there some other possibility altogether?


Windows Explorer does make the folder size (number of files, size on disk, etc.) available in the Properties dialog of any disk or folder.

As for writing a program, you can certainly use a recursive DirectoryInfo.EnumerateFiles() to get all the files within a disk or folder.
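A minimal sketch of that recursive walk, in case it helps. The class and method names (`FolderSize`, `Compute`) are made up for illustration, and skipping unreadable directories and reparse points is my own assumption about what you would want, not something the question specifies. It sums logical file lengths, which is not the same as the "size on disk" that Explorer reports.

```csharp
using System;
using System.IO;

public static class FolderSize
{
    // Hypothetical helper: sums the lengths of all files under 'path',
    // descending into subdirectories.
    public static long Compute(string path)
    {
        long total = 0;
        var dir = new DirectoryInfo(path);
        try
        {
            foreach (FileInfo file in dir.EnumerateFiles())
                total += file.Length;

            foreach (DirectoryInfo sub in dir.EnumerateDirectories())
            {
                // Assumption: skip junctions/symlinks so the walk cannot loop
                // or count the same data twice.
                if ((sub.Attributes & FileAttributes.ReparsePoint) != 0)
                    continue;
                total += Compute(sub.FullName);
            }
        }
        catch (UnauthorizedAccessException) { /* no permission: skip this folder */ }
        catch (IOException) { /* folder vanished or is unreadable: skip it */ }
        return total;
    }
}
```

Calling `FolderSize.Compute(@"C:\SomeFolder")` on a large tree will still take as long as the file system needs to list every entry, which is exactly the cost the question is worried about.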

As for monitoring, you can use the FileSystemWatcher class to monitor changes to any disk/folder.
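Roughly, wiring up a watcher could look like the sketch below. `FolderWatch.Start` and the `onChange` callback are hypothetical names; the `FileSystemWatcher` members used are the standard ones. Keep in mind that this only reports changes while your program is running, and that the events arrive on a thread-pool thread, not the UI thread.

```csharp
using System;
using System.IO;

public static class FolderWatch
{
    // Hypothetical helper: calls 'onChange' with the affected path whenever
    // something under 'path' is created, deleted, renamed, or changes size.
    public static FileSystemWatcher Start(string path, Action<string> onChange)
    {
        var watcher = new FileSystemWatcher(path)
        {
            IncludeSubdirectories = true,
            NotifyFilter = NotifyFilters.FileName
                         | NotifyFilters.DirectoryName
                         | NotifyFilters.Size
        };
        watcher.Created += (sender, e) => onChange(e.FullPath);
        watcher.Deleted += (sender, e) => onChange(e.FullPath);
        watcher.Renamed += (sender, e) => onChange(e.FullPath);
        watcher.Changed += (sender, e) => onChange(e.FullPath);
        watcher.EnableRaisingEvents = true; // nothing is reported until this is set
        return watcher;                     // caller must keep it alive and dispose it
    }
}
```

A typical use would be to mark the cached size of the affected folder (and its ancestors) as dirty inside `onChange`, then recompute lazily.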

To keep the cache up to date is going to be difficult because:

  1. Depending on the file system the partition is formatted with (FAT, FAT32, NTFS, etc.), you are limited to what that file system supports.
  2. To find new files (created date > cache date), you still have to enumerate all the files and filter the list down to the new ones.
  3. Modified files (modified date > cache date) have the same issue.

Unless you use something VERY specific to the file system, beyond what C# provides, the cache has to be refreshed on every application launch, and that refresh will be very expensive.
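To make points 2 and 3 concrete, here is a sketch of a timestamp-based staleness check. `CachedSize`, `CacheCheck`, and `IsStale` are names I made up, and the extra file-count comparison is my own addition (timestamps alone cannot reveal a deleted file). Note that the check has to touch every file, so it costs about as much as recomputing the size.

```csharp
using System;
using System.IO;

// Hypothetical cache record for one folder.
public sealed class CachedSize
{
    public long Bytes { get; set; }
    public int FileCount { get; set; }
    public DateTime ComputedUtc { get; set; }
}

public static class CacheCheck
{
    // Returns true if any file was created or modified after the cache was
    // built, or if the number of files differs from the cached count.
    // Assumption: every subdirectory is readable; otherwise the
    // AllDirectories enumeration throws.
    public static bool IsStale(string path, CachedSize cached)
    {
        int count = 0;
        var dir = new DirectoryInfo(path);
        foreach (FileInfo file in dir.EnumerateFiles("*", SearchOption.AllDirectories))
        {
            count++;
            if (file.CreationTimeUtc > cached.ComputedUtc ||
                file.LastWriteTimeUtc > cached.ComputedUtc)
                return true;
        }
        return count != cached.FileCount;
    }
}
```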


Windows Explorer is a pretty crafty program. It is filled with tricks that are designed to hide the fact that any file system is punishingly slow to iterate. The kind of tricks that I know about:

  • fake it. Show the folder hierarchy as a treeview and use the [+] glyph to indicate that a folder has files or directories inside of it, even when it doesn't. You can see this for yourself: create an empty directory and restart your machine. Note the [+] glyph, click it, and notice that, when forced to iterate the sub-directory, Explorer smoothly changes the [+] glyph to a 'nothing there' glyph.

  • delay it. Harder to see; you need a subdirectory with a lot of files. Explorer starts a background thread that iterates the content of the folder. Once it has figured it out, it smoothly changes the status bar text.

  • tell me what happened. Explorer uses ReadDirectoryChangesW() heavily, which .NET wraps in the FileSystemWatcher class. The key point is that it gets a notification that something changed in the subdirectory the user is looking at. No polling is required; that would have horrible perf. Go back to bullet two.
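The "delay it" trick could be approximated in C# along these lines. This is only a sketch of the general idea, not how Explorer itself is implemented; `BackgroundSize.ComputeAsync` is a hypothetical name, and the code assumes every subdirectory is readable.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundSize
{
    // Hypothetical helper: sums file lengths on a thread-pool thread so the
    // UI stays responsive. Cancel the token when the user navigates away.
    public static Task<long> ComputeAsync(string path, CancellationToken token)
    {
        return Task.Run(() =>
        {
            long total = 0;
            var dir = new DirectoryInfo(path);
            foreach (FileInfo file in dir.EnumerateFiles("*", SearchOption.AllDirectories))
            {
                token.ThrowIfCancellationRequested();
                total += file.Length;
            }
            return total;
        }, token);
    }
}
```

The UI would show a placeholder first, `await` the task, and then fill in the size. When a change notification arrives for that folder, cancel any running computation and start a new one.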
