开发者

python: complexity of os.path.exists with a ext4 filesystem?

Does开发者_运维知识库 anyone know what the complexity of the os.path.exists function is in python with a ext4 filesystem?


Underlying directory structure used by Ext4 (and Ext3) is exactly the same as in Ext2. Ext3 adds journaling, Ext4 improves that journaling. Journaling is irrelevant to your question.

Originally Ext2 used to store that as list, but that of course was inefficient for large directories. So it's has been changed to tweaked version of B-tree called HTree. Unlike standard B-tree, HTree has constant depth and uses hash-map per node, thus it's lookup complexity is O(1).

Ext2's scheme, which we dubbed "HTree", uses 32-bit hashes for keys, where each hash key references a range of entries stored in a leaf block. Since internal nodes are only 8 bytes, HTrees have a very high fanout factor (over 500 blocks can be referenced using a 4K index block), two levels of index nodes are sufficient to support over 16 million 52-character filenames. To further simplify the implementation, HTrees are constant depth (either one or two levels). The combination of the high fanout factor and the use of a hash of the filename, plus a filesystem-specific secret to serve as the search key for the HTree, avoids the need for the implementation to do balancing operations.

See: http://ext2.sourceforge.net/2005-ols/paper-html/node3.html


Chances are good that the complexity is O(n) with n being the depth in the filesystem (e.g. / would have n=1, /something n=2, ...)

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜