
What is the relation between file pointer width and maximum file size?

I'm just curious about the maximum file size limits of some popular file systems on Linux; I have seen that some go up to the TB scale.

My question is: if the file pointer is 32 bits wide, as on most Linux systems we meet today, doesn't that mean the maximum offset we can address is 2^32 - 1 bytes? Then how can we store a file larger than 4 GB?

Furthermore, even if we can store such a file, how can we seek to a position beyond the 2^32 range?


To use files larger than 4 GB, you need "large file support" (LFS) on Linux. One of the changes LFS introduced is that file offsets are 64-bit numbers. This is independent of whether Linux itself is running in 32- or 64-bit mode (e.g. x86 vs. x86-64). See e.g. http://www.suse.de/~aj/linux_lfs.html

LFS was introduced mostly in glibc 2.2 and kernel 2.4.0 (roughly in 2000-2001), so any recent Linux distribution will have it.

To use it on Linux, you can either call the special *64 functions (e.g. lseek64 instead of lseek), or add #define _FILE_OFFSET_BITS 64 before including the system headers, after which the regular functions use 64-bit offsets.
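
For example, here is a minimal sketch of the _FILE_OFFSET_BITS route (the path "big.dat" and the 5 GiB offset are only illustrative, and error handling is kept short):

    /* Sketch: use the regular open()/lseek()/write() API with 64-bit offsets.
     * Either pass -D_FILE_OFFSET_BITS=64 to the compiler or define it before
     * any system header, as done here. */
    #define _FILE_OFFSET_BITS 64

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("big.dat", O_CREAT | O_RDWR, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* Seek 5 GiB into the file -- beyond what a 32-bit offset can address.
         * With _FILE_OFFSET_BITS=64, off_t is 64 bits even on 32-bit Linux. */
        off_t target = (off_t)5 * 1024 * 1024 * 1024;
        if (lseek(fd, target, SEEK_SET) == (off_t)-1) {
            perror("lseek");
            return 1;
        }

        /* Writing one byte here produces a sparse file just over 5 GiB long,
         * provided the underlying file system supports files that large. */
        if (write(fd, "x", 1) != 1) {
            perror("write");
            return 1;
        }

        close(fd);
        return 0;
    }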


In Linux, at least, it's trivial to write programs to work with larger files explicitly (i.e., not just using a streaming approach as suggested by kohlehydrat).

See this page, for instance. The trick usually comes down to having a magic #define before including some of the system headers, which "turns on" large file support. This typically doubles the size of the file offset type to 64 bits, which is quite a lot.
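
If you want to check that the define actually took effect, a tiny compile-time assertion works (this is just a sanity check, not part of any LFS API):

    /* The define must appear before the first system header to take effect. */
    #define _FILE_OFFSET_BITS 64

    #include <sys/types.h>

    /* With large file support enabled, off_t is 8 bytes even on 32-bit x86. */
    _Static_assert(sizeof(off_t) == 8, "large file support is not enabled");

    int main(void) { return 0; }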


There is no relation whatsoever. The FILE * pointer from C stdio is an opaque handle that has no relation to the size of the on-disk file, and the object it points to can be much bigger than the pointer itself. The function fseek(), which repositions where we read from and write to, already takes a long, and fgetpos() and fsetpos() use an opaque fpos_t.

What can make working with large files difficult is the off_t type used as an offset in various system calls. Fortunately, people realized this would be an issue and came up with "Large File Support" (LFS), an altered ABI with a wider offset type off_t. (Typically this is done by introducing a new API and #define-ing the old names to invoke it.)
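
As an illustration of the stdio side, here is a sketch that assumes LFS is enabled and that "big.dat" is an existing file larger than 4 GiB; fseeko()/ftello() are the off_t-based counterparts of fseek()/ftell():

    /* Sketch: positioning within a large file through stdio.
     * With _FILE_OFFSET_BITS=64, fseeko()/ftello() use a 64-bit off_t, and
     * fgetpos()/fsetpos() go through the opaque fpos_t in any case. */
    #define _FILE_OFFSET_BITS 64

    #include <stdio.h>
    #include <sys/types.h>

    int main(void)
    {
        FILE *fp = fopen("big.dat", "rb");
        if (!fp) {
            perror("fopen");
            return 1;
        }

        /* fseek() takes a long (32 bits on 32-bit Linux); fseeko() takes off_t. */
        off_t where = (off_t)6 * 1024 * 1024 * 1024;   /* 6 GiB */
        if (fseeko(fp, where, SEEK_SET) != 0) {
            perror("fseeko");
            return 1;
        }

        /* Save the position in an opaque fpos_t and restore it later on. */
        fpos_t pos;
        fgetpos(fp, &pos);
        /* ... read some data ... */
        fsetpos(fp, &pos);

        printf("current offset: %lld\n", (long long)ftello(fp));
        fclose(fp);
        return 0;
    }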


You can use lseek64 to handle big files. Ext4 can handle 16 TiB files.
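
A rough sketch of that explicit *64 route: on glibc, defining _LARGEFILE64_SOURCE exposes open64(), lseek64() and the off64_t type, independent of the default off_t width (the path and the 10 GiB offset are only examples):

    /* Sketch: the explicit *64 interface instead of the _FILE_OFFSET_BITS switch. */
    #define _LARGEFILE64_SOURCE

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open64("big.dat", O_CREAT | O_RDWR, 0644);
        if (fd < 0) {
            perror("open64");
            return 1;
        }

        /* off64_t is always 64 bits, whatever the width of plain off_t. */
        off64_t target = (off64_t)10 * 1024 * 1024 * 1024;   /* 10 GiB */
        if (lseek64(fd, target, SEEK_SET) == (off64_t)-1) {
            perror("lseek64");
            return 1;
        }

        if (write(fd, "x", 1) != 1) {
            perror("write");
            return 1;
        }

        close(fd);
        return 0;
    }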


Just call read(int fd, void *buf, size_t count); repeatedly.

(So there's no need for a 'pointer' into the file.)

From the filesystem design point of view, you basically have an index tree (the inode), which points to the pieces of data (blocks) that form the actual file. Using this model, you could theoretically have files of unbounded size.
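
A minimal sketch of that streaming approach, assuming a file "big.dat" that may be far larger than 4 GiB (the define is still needed on a 32-bit build so that open() will accept such a file at all):

    /* Sketch: stream a file of arbitrary size with repeated read() calls;
     * no explicit file "pointer" arithmetic is needed. */
    #define _FILE_OFFSET_BITS 64

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("big.dat", O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        char buf[65536];
        unsigned long long total = 0;
        ssize_t n;

        /* read() returns 0 at end of file, so the loop handles any file size. */
        while ((n = read(fd, buf, sizeof buf)) > 0)
            total += (unsigned long long)n;

        if (n < 0)
            perror("read");

        printf("read %llu bytes\n", total);
        close(fd);
        return 0;
    }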


A traditional UNIX without large file support has a practical limit on file size determined by the number of bytes a signed 32-bit file offset can index: 2^31 - 1 bytes, i.e. about 2.1 GB (2 GiB).

One workaround is to close the first file just before it reaches 0x7fffffff bytes in length and open an additional new file, as sketched below.
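
A rough sketch of that workaround; the piece names "out.000", "out.001", ... and the exact threshold are only illustrative:

    /* Sketch: when the current output file approaches the signed 32-bit limit,
     * close it and continue writing into the next piece. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    #define PIECE_LIMIT 0x7fff0000L   /* stay just under 0x7fffffff */

    static int piece_fd = -1;
    static long piece_bytes;
    static int piece_no;

    /* Write a buffer, rolling over to a new file when the limit nears.
     * Assumes each buffer is much smaller than PIECE_LIMIT. */
    static int split_write(const void *buf, size_t len)
    {
        if (piece_fd < 0 || piece_bytes + (long)len > PIECE_LIMIT) {
            char name[32];
            if (piece_fd >= 0)
                close(piece_fd);
            snprintf(name, sizeof name, "out.%03d", piece_no++);
            piece_fd = open(name, O_CREAT | O_WRONLY | O_TRUNC, 0644);
            if (piece_fd < 0)
                return -1;
            piece_bytes = 0;
        }
        ssize_t n = write(piece_fd, buf, len);
        if (n > 0)
            piece_bytes += n;
        return n == (ssize_t)len ? 0 : -1;
    }

    int main(void)
    {
        /* Demo: write a few 64 KiB chunks; in real use the data would come
         * from whatever is producing the large output. */
        static char chunk[65536];
        for (int i = 0; i < 16; i++)
            if (split_write(chunk, sizeof chunk) != 0) {
                perror("split_write");
                return 1;
            }
        return 0;
    }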

Some limits of the ext2 file system come from its on-disk format and from the operating system's kernel. Most of these factors are fixed once, when the file system is created; they depend on the block size and on the ratio of blocks to inodes. On Linux the block size is limited by the architecture's page size.

There are also some userspace programs that can't handle files larger than 2 GB.

The maximum file size is limited to min( ((b/4)^3 + (b/4)^2 + b/4 + 12) * b, 2^32 * 512 ) bytes, where b is the block size, because i_block (an array of EXT2_N_BLOCKS block pointers) limits how many blocks an inode can address, and i_blocks (a 32-bit integer) counts the file's 512-byte units.
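
To see what that works out to, here is a small stand-alone calculation for the common block sizes (plain arithmetic, not ext2 code; the 2^32 * 512 term reflects i_blocks counting 512-byte units):

    /* Sketch: evaluate min( ((b/4)^3 + (b/4)^2 + b/4 + 12) * b, 2^32 * 512 )
     * for the usual ext2 block sizes. */
    #include <stdio.h>

    int main(void)
    {
        const unsigned long long block_sizes[] = { 1024, 2048, 4096 };

        for (int i = 0; i < 3; i++) {
            unsigned long long b = block_sizes[i];
            unsigned long long p = b / 4;  /* 32-bit block pointers per block */
            unsigned long long addressable = (p * p * p + p * p + p + 12) * b;
            unsigned long long counted = (1ULL << 32) * 512;  /* i_blocks limit */
            unsigned long long limit = addressable < counted ? addressable : counted;

            printf("block size %4llu B: max file size %llu bytes (%.1f GiB)\n",
                   b, limit, (double)limit / (1ULL << 30));
        }
        return 0;
    }

For 1 KiB blocks this gives roughly 16 GiB, for 2 KiB roughly 256 GiB, and for 4 KiB the familiar 2 TiB ext2 limit.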
