Is there an effort to develop a build-oriented file system with automatic change detection of files?

2023-01-30 07:22 Q&A author:

I recently started to use Git. One of the interesting features I discovered was the use of hashes to quickly detect changes.

On the other hand, I see that build tools (like make, ant, javac, etc.) try to detect changes in source files by checking each file's timestamp.

The problems in this approach are:

If you work on more than one machine, you have to make sure all clocks are in sync; otherwise a new file may be considered unchanged because the other machine's clock gave it a timestamp in the past relative to the build machine.
On a big project, you have to scan every file's timestamp in order to detect a change.
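To make the timestamp approach concrete, here is a minimal sketch of the kind of check a make-style tool performs. The function name `is_stale` is made up for illustration, and real build tools handle more cases than this.

```python
import os

def is_stale(target, sources):
    """Return True if target is missing or older than any of its sources.

    This only compares modification times, so it inherits the clock
    problem: a source edited on a machine whose clock runs behind can
    look older than the target and be skipped.
    """
    try:
        target_mtime = os.stat(target).st_mtime_ns
    except FileNotFoundError:
        return True  # nothing built yet
    return any(os.stat(src).st_mtime_ns > target_mtime for src in sources)
```

Note that the content of the files is never read; the decision rests entirely on the recorded times.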

I wonder if someone has already taken the Git approach in order to deal with these issues:

Each file has a unique hash, depending on its content, not timestamp.
Each directory also has its hash, depending on the files in the directory and their hashes.
Even a simple change deep inside the source tree causes the root directory to have a different hash, due to the above rules.
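The three rules above can be sketched roughly as follows. This is an illustration only: it uses SHA-256 and a made-up entry layout, not Git's actual object format, and the names `file_hash` and `dir_hash` are hypothetical. It also assumes the tree contains no symlink loops.

```python
import hashlib
import os

def file_hash(path):
    """Hash a file by its content only; timestamps play no part."""
    h = hashlib.sha256(b"file\0")  # type tag keeps files and directories distinct
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def dir_hash(path):
    """Hash a directory from the names and hashes of its entries."""
    h = hashlib.sha256(b"dir\0")
    for name in sorted(os.listdir(path)):  # sorted so the result is deterministic
        child = os.path.join(path, name)
        child_hash = dir_hash(child) if os.path.isdir(child) else file_hash(child)
        h.update(os.fsencode(name) + b"\0" + child_hash.encode("ascii") + b"\n")
    return h.hexdigest()
```

Because each directory hash includes the hashes of its children, editing one file changes the hash of every directory on the path up to the root, while rewriting a file with identical content changes nothing.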

Such a mechanism could help make build tools much faster, because detecting a change in the source tree becomes a simple hash comparison. If the hash of the source tree's root directory has changed, it means that a change occurred deeper in the source tree, so continue to scan the tree recursively for changes, exactly as Git does to detect changes.
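Here is a rough sketch of that recursive comparison, assuming the tool keeps a hash tree from the previous build. The `snapshot` and `changed_paths` names and the dictionary layout are invented for this example; a file system that stored the hashes itself would make the snapshot step unnecessary.

```python
import hashlib
import os

def snapshot(path):
    """Build a hash tree: files hash their content, directories their entries."""
    if os.path.isdir(path):
        children = {name: snapshot(os.path.join(path, name))
                    for name in sorted(os.listdir(path))}
        h = hashlib.sha256(b"dir\0")
        for name, node in children.items():
            h.update(os.fsencode(name) + b"\0" + node["hash"].encode("ascii") + b"\n")
        return {"hash": h.hexdigest(), "children": children}
    with open(path, "rb") as f:
        digest = hashlib.sha256(b"file\0" + f.read()).hexdigest()
    return {"hash": digest, "children": None}

def changed_paths(old, new, prefix=""):
    """Yield changed paths, skipping every subtree whose hash is unchanged."""
    if old is not None and new is not None and old["hash"] == new["hash"]:
        return  # identical subtree: no need to look inside
    old_kids = (old or {}).get("children") or {}
    new_kids = (new or {}).get("children") or {}
    if not old_kids and not new_kids:
        yield prefix  # a changed, added, or removed file (or empty directory)
        return
    for name in sorted(old_kids.keys() | new_kids.keys()):
        yield from changed_paths(old_kids.get(name), new_kids.get(name),
                                 os.path.join(prefix, name))
```

If the two root hashes match, `changed_paths` returns immediately without visiting anything, which is the speed-up the question is after. In this sketch, though, building the new snapshot still reads every file, so the saving only materializes if something else maintains the hashes.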

It doesn't necessarily mean that this source tree has to be managed by Git. My idea is that the file system would automatically provide each file's hash as one of its attributes / metadata, so the build tool could rely on this instead of on the timestamp. In addition, a directory's hash would automatically reflect the state of the files in it.

I already read a little bit about Sun's ZFS, but I am not sure it's a complete solution to make builds faster.

What do you think about this idea? Is there already such a file system? Is there already such a build tool?

I'll argue that what you're trying to solve is actually a non-issue:

The clock skew problem can be mostly avoided by using NTP.

Certainly it'd be nice to have clock skew issues eliminated entirely, but we can probably agree that throwing a fairly complex content-tracking system at the problem is overkill.

Regarding performance, scanning the entire tree tends to not be a problem in practice. stat is ridiculously fast (so long as you're not on Windows) -- ls -lR > /dev/null over the entire Linux kernel tree (38k files) takes 350 ms on my system.
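If you want to measure this on your own tree rather than take my number for it, something like the following sketch does roughly what the ls -lR run does, minus the output formatting. The function name `stat_tree` is made up, and the timing will vary with the machine and with whether the metadata is already cached.

```python
import os
import time

def stat_tree(root):
    """stat every file under root; return (file_count, elapsed_seconds)."""
    start = time.perf_counter()
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            os.lstat(os.path.join(dirpath, name))  # metadata only, no content read
            count += 1
    return count, time.perf_counter() - start
```

Calling `stat_tree(".")` from the top of a source tree gives the file count and the time spent, which you can then compare against your total build time.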

In fact, if stat'ing all your files is a problem, then your version control system will become slow, and that will be a much bigger problem than your build performance. Every git status or git diff, for instance, stats all files in your working copy to check their mtimes, so you'd better hope that's fast.

So if you're looking to speed up make, don't look at the file system; it's most likely insignificant compared to whatever is actually eating up your build time.

Hope that eases your mind!
