What's so special about 4kb for a buffer length?

2023-03-17 12:52 问答作者：

When performing file IO in .NET, it seems that 95% of the examples that I see use a 4096 byte buffer. What's so special about 4kb for a buffer length? Or is it just a convention like us开发者_开发问答ing i for the index in a for loop?

That is because 4K is the default cluster size for for disks upto 16TB. So when picking a buffer size it makes sense to allocate the buffer in multiples of the cluster size.

A cluster is the smallest unit of allocation for a file, so if a file contains only 1 byte it will consume 4K of physical disk space. And a file of 5K will result in a 8K allocation.

Update: Added a code sample for getting the cluster size of a drive

using System;
using System.Runtime.InteropServices;

class Program
{
  [DllImport("kernel32", SetLastError=true)]
  [return: MarshalAs(UnmanagedType.Bool)]
  static extern bool GetDiskFreeSpace(
    string rootPathName,
    out int sectorsPerCluster,
    out int bytesPerSector,
    out int numberOfFreeClusters,
    out int totalNumberOfClusters);

  static void Main(string[] args)
  {
    int sectorsPerCluster;
    int bytesPerSector;
    int numberOfFreeClusters;
    int totalNumberOfClusters;

    if (GetDiskFreeSpace("C:\\", 
          out sectorsPerCluster, 
          out bytesPerSector, 
          out numberOfFreeClusters, 
          out totalNumberOfClusters))
    {        
      Console.WriteLine("Cluster size = {0} bytes", 
        sectorsPerCluster * bytesPerSector);
    }
    else
    {
      Console.WriteLine("GetDiskFreeSpace Failed: {0:x}", 
        Marshal.GetLastWin32Error());
    }

    Console.ReadKey();
  }
}

A few factors:

More often than not, 4K is the cluster size on a disk drive
4K is the most common page size on Windows, so the OS can memory map files in 4K chunks
A 4K page can often be transferred from drive to OS to User Process without being copied
Windows caches files in RAM using 4K buffers.

Most importantly over the years a lot of people have used 4K as their buffer lengths due to the above, therefore a lot of IO and OS code is optimised for 4K buffers!

My guess would be that it is related to the OS file block size --- Windows on .NET.

My guess … my answer is right, and others are not - not going deep enough in history. And knowing that it is an old question, its much more important to mention, that there where times, when performance was not a Question of programming style only.

The binary size (4096, 8192 or sometime 1024) comes from times, when you could see the the connections of the CPU to peripheral chips. Sorry for sounding old, but this is essential to answer your question. The buffer in your program had to get shifted out to a peripheral device, and therefore it need address lines (needed today there are other ideas) and this address lines are binary bounded. And the chip getting the information needed (and needs) memory to keep it. This memory was an is (!) determined by binary addresses ... - you won't find a 23gb chip. and 1k, 2k, 4k or (finaly) 8k was a good value (in old days).

How ever to shift out a buffer of 8k needed (some how) the same time as shift out one Byte. Thats why we have buffers!

That hard disks have this (cluster) size is not the reason for the buffer size - the oposite is true - the organisation of hard discs follows the above system.

继续阅读：.net buffer io

What's so special about 4kb for a buffer length?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？