
Why are file handles such an expensive resource?

In holy wars about whether garbage collection is a good thing, people often point out that it doesn't handle things like freeing file handles. Putting this logic in a finalizer is considered a bad thing because the resource then gets freed non-deterministically. However, it seems like an easy solution would be for the OS to just make sure lots and lots of file handles are available so that they are a cheap and plentiful resource and you can afford to waste a few at any given time. Why is this not done in practice?


In practice it cannot be done, because the OS would have to allocate a lot more memory overhead to keep track of which handles are in use by each process. The example C code below sketches a simple OS process structure of the sort that might be stored in a circular queue...

struct ProcessRecord {
  int ProcessId;
  CPURegs cpuRegs;
  TaskPointer **children;
  int *baseMemAddress;
  int sizeOfStack;
  int sizeOfHeap;
  int *baseHeapAddress;
  int granularity;
  int time;
  enum State { Running, Runnable, Zombie /* ... */ } state;  /* current scheduling state */
  /* ...few more fields here... */
  long *fileHandles;      /* per-process table of open file handles */
  long fileHandlesCount;  /* number of entries currently in use */
} proc;

Imagine that fileHandles is a pointer to an array of integers, where each integer holds the location (perhaps in an encoded format) of the offset into the OS's table of where the files are stored on disk.

Now imagine how much memory that would eat up. It could slow down the whole kernel, and maybe even destabilize it: the 'multi-tasking' behaviour of the system could fall over as a result of having to keep track of how many file handles are in use, and of having to provide a mechanism to dynamically grow and shrink that array of handles. That in turn could slow down the user program itself if the OS were dishing out file handles on a user program's demand basis.
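To make the dynamic-growth point concrete, here is a minimal, purely illustrative C sketch of what growing such a per-process handle table might look like (the grow_handle_table helper and the initial capacity of 16 are assumptions for the example, not how any real kernel does it):

#include <stdlib.h>

/* Illustrative only: grow a process's file-handle table on demand.
   Every growth costs more memory and a copy of the old table. */
static int grow_handle_table(long **handles, long *capacity)
{
    long newCapacity = (*capacity == 0) ? 16 : *capacity * 2;   /* assumed growth policy */
    long *newTable = realloc(*handles, (size_t)newCapacity * sizeof(long));
    if (newTable == NULL)
        return -1;               /* no memory left for more bookkeeping */
    *handles = newTable;
    *capacity = newCapacity;
    return 0;
}

Multiply that bookkeeping by every process on the system and the overhead described above starts to add up.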

I hope this helps you to understand why it is neither practical nor done.

Hope this makes sense. Best regards, Tom.


Closing a file also flushes the writes to disk -- well, from the point of view of your application anyway. After closing a file, the application can crash; as long as the system itself doesn't crash, the changes will not be lost. So it's not a good idea to let the GC close files at its leisure, even if it may be technically possible nowadays.
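In C terms (a minimal sketch; example.txt is just an illustrative filename), this is why closing explicitly matters: the stdio buffer only reliably reaches the kernel once the file is flushed or closed:

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("example.txt", "w");   /* illustrative filename */
    if (f == NULL)
        return 1;
    fputs("important data\n", f);          /* may still sit in the stdio buffer */
    fclose(f);                             /* hands the data to the kernel; after this
                                              point a crash of the application alone
                                              no longer loses it */
    return 0;
}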

Also, to tell the truth, old habits die hard. File handles used to be expensive and are still probably considered as such for historical reasons.


It's not just the number of file handles; it's that when they're opened in certain modes, they can prevent other callers from being able to access the same file.
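For example, on Windows a handle opened with no share mode locks everyone else out of the file until it is closed (a minimal sketch; data.txt is just an illustrative name):

#include <stdio.h>
#include <windows.h>

int main(void)
{
    /* dwShareMode = 0: no other process may open data.txt while this handle is open. */
    HANDLE h = CreateFileA("data.txt", GENERIC_READ, 0,
                           NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        printf("open failed: %lu\n", GetLastError());
        return 1;
    }
    /* Until CloseHandle runs, other openers get ERROR_SHARING_VIOLATION. */
    CloseHandle(h);
    return 0;
}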


I'm sure more comprehensive answers will ensue, but based on my limited experience and understanding of the underlying operation of Windows, file handles (the structures used to represent them to the OS) are kernel objects, and as such they require a certain type of memory to be available - not to mention processing on the kernel's part to maintain consistency and coherence when multiple processes require access to the same resources (i.e. files).


I don't think they're necessarily expensive - if your application only holds a few unnecessary ones open, it won't kill the system, just as no one will notice if you leak only a few strings in C++, unless they're looking pretty carefully. Where it becomes a problem is:

  • if you leak hundreds or thousands of them (see the sketch after this list)
  • if having the file open prevents other operations from occurring on that file (other applications might not be able to open or delete the file)
  • it's a sign of sloppiness - if your program can't keep track of what it owns and is using or has stopped using, what other problems will the program have? Sometimes a small leak turns into a big leak when something small changes or a user does something a little differently than before.
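A quick way to see the first point in action (a sketch assuming a Unix-like system where /dev/null exists) is to leak handles on purpose until the OS refuses to hand out more:

#include <errno.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    int count = 0;
    /* Deliberately leak: never call fclose(). */
    while (fopen("/dev/null", "r") != NULL)
        count++;
    /* On a typical Linux system this stops at the per-process limit
       (often around 1024) with EMFILE, "Too many open files". */
    printf("opened %d handles before failing: %s\n", count, strerror(errno));
    return 0;
}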


In the Linux paradigm sockets are file descriptors. There are definite advantages to freeing up TCP ports as soon as possible.
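As an illustration (a minimal sketch assuming a POSIX system; port 8080 is an arbitrary choice), closing the descriptor promptly, together with SO_REUSEADDR, is what lets a server's port be rebound quickly instead of lingering until some finalizer eventually runs:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return 1;

    /* Allow the address to be rebound quickly after close(). */
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8080);              /* arbitrary example port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        close(fd);
        return 1;
    }

    /* ...use the socket... */

    close(fd);  /* deterministically frees the descriptor and releases the port */
    return 0;
}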
