
Versioning file system with Amazon S3 as backend

I'm trying to make the following work on my Debian computers and one OS X laptop.

What I would like to have is some kind of versioning file system that uses Amazon S3 as a backend.

What I was thinking is to use s3fs (via FUSE) to mount the bucket, then build a filesystem on top of Git that makes a new commit every time I write a file (I would like a complete version history going back x days). The mounted folder should then show the latest version of the files. One problem I don't know how to solve (due to a lack of experience, I assume) is that I would like to synchronise the files with a local folder. Of course, I could just download all the files, but that is not bandwidth-friendly.

Another problem is that the current version of s3fs does not seem to work with MacFUSE.

Further, although it will probably not happen, I would like to prevent the files from becoming corrupt if two computers write to the same file at the same time. If I have understood correctly, Git implements some kind of file locking itself and does not depend on the operating system's file locking.

What could be an outline to make this work? The files I would like to store this way are just .tex files and vector images.

I know that existing solutions are available (like Dropbox), but I don't really like that they are closed source.


First, let me say that I would not recommend blindly running Git on S3. Git produces a lot of small files during its operation; S3 is expensive (and slow) when dealing with a large number of very small objects. As you surmise, S3 also has no locking mechanism; eventual consistency makes this impossible. And finally, Git depends on fast random access to its object database; S3 cannot provide this, so you'll need a local mirror of the entire repository in any case.

Instead, I would recommend that you extend the existing Git HTTP backend to push to S3. Instead of pushing loose files, this would push a single pack file. This would leverage what S3 is good at: a bulk load of large objects. You'd still have no locking, but since you decide manually when to push, you can find some other way to coordinate things easily enough.
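To illustrate the "one large object instead of many small ones" idea, here is a minimal sketch. It does not extend the Git HTTP backend; it swaps in `git bundle`, which packs the whole repository into a single file, and then hands that file to an uploader. The helper names (`bundle_command`, `push_bundle`, `s3_uploader`) are made up for this example, and the S3 upload assumes the third-party `boto3` package is installed and credentials are already configured.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def bundle_command(repo: Path, bundle: Path) -> List[str]:
    """Build the `git bundle create` command that packs all refs into one file."""
    return ["git", "-C", str(repo), "bundle", "create", str(bundle), "--all"]


def push_bundle(
    repo: Path,
    bundle: Path,
    upload: Callable[[Path], None],
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Create a single-file bundle of the repository, then upload it.

    `upload` is any callable that takes the bundle path; `run` is injectable
    so the function can be exercised without invoking git.
    """
    run(bundle_command(repo, bundle), check=True)
    upload(bundle)


def s3_uploader(bucket: str, key: str) -> Callable[[Path], None]:
    """Return an uploader that stores the bundle as one S3 object.

    Assumption: boto3 is installed and AWS credentials are configured.
    """
    def upload(path: Path) -> None:
        import boto3  # third-party, imported lazily so the sketch loads without it

        boto3.client("s3").upload_file(str(path), bucket, key)

    return upload
```

A call might look like `push_bundle(Path("papers"), Path("/tmp/papers.bundle"), s3_uploader("my-bucket", "papers.bundle"))`, where the repository path, bucket, and key are placeholders. On another machine, the downloaded bundle can be used as a source for `git clone` or `git fetch`. There is still no locking here, so the manual coordination mentioned above is still needed.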
