
Revision control script with Python

I was reading this SO question: https://stackoverflow.com/questions/38388/organization-wide-backup-strategy

I am interested in implementing the system it describes:

Revision control is handled by a Python script that scans a file system and uploads changed files to a central server. The file system on that server makes extensive use of Unix-style symbolic links - i.e. only one copy of a given file is ever stored, subsequent copies are simply symlinked to. This allows you to have a full file system created for each day's backup but to only use a fraction of the actual disk space it uses (you just need enough space for any files changed since the last backup and to store all those symlinks). This is the general principle that things like the Mac's Time Machine system uses. Users needing to restore an old file can simply browse that file system.

Can anyone give me some guidelines, tutorials, or a ready-to-use script like that in Python? What I would most like to see is how to keep a virtual file system for every day without using too much space.

I am a newbie in Python.


rsync + hard links.

Use rsync to maintain a master backup directory. Use cp -al to take daily/weekly/whatever snapshots.

Using hard links means duplicated files will not take up additional space, i.e. each snapshot only uses extra space for the files that have changed. In addition, each snapshot looks like a complete copy of the backup, and symlinks that get backed up are preserved.
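To see why a hard-linked copy costs no extra space, here is a minimal Python sketch. It assumes a file system that supports hard links; the function name hard_link_demo and the file names are made up for illustration. Both names end up pointing at the same inode, so the data is stored only once.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file and a hard link to it, then compare their inodes.

    Returns (same_inode, link_count).
    """
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "file1")
        linked = os.path.join(tmp, "file1.snapshot")

        with open(original, "w") as handle:
            handle.write("hello\n")

        # os.link creates a second directory entry for the same data;
        # no file content is copied.
        os.link(original, linked)

        first = os.stat(original)
        second = os.stat(linked)
        return first.st_ino == second.st_ino, first.st_nlink


if __name__ == "__main__":
    same_inode, link_count = hard_link_demo()
    print("same inode:", same_inode, "link count:", link_count)
```

The data is only freed once the last name referring to it is deleted, which is why old snapshots can be removed in any order.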

I have a python wrapper for this that manages the snapshots - keeping a predefined number of daily, weekly, monthly and yearly snapshots, but you can keep this as elaborate or simple as you want.
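That wrapper is not shown here, so the following is only a rough sketch of one piece such a wrapper might contain: deciding which snapshots to delete. It assumes snapshot directories are named backup.YYYYMMDD-HHMMSS (the pattern produced by the date format in the sample backup in this answer) and applies a single "keep the newest N" rule rather than separate daily, weekly, monthly and yearly tiers. The name snapshots_to_prune and the keep parameter are hypothetical.

```python
from datetime import datetime

PREFIX = "backup."          # assumed snapshot name prefix
STAMP = "%Y%m%d-%H%M%S"     # assumed timestamp format


def snapshots_to_prune(names, keep=7):
    """Return the snapshot names that fall outside the newest `keep`."""
    if keep < 0:
        raise ValueError("keep must not be negative")

    dated = []
    for name in names:
        if not name.startswith(PREFIX):
            continue  # not a snapshot, e.g. the master directory itself
        try:
            when = datetime.strptime(name[len(PREFIX):], STAMP)
        except ValueError:
            continue  # suffix is not a timestamp; leave the entry alone
        dated.append((when, name))

    # Newest first, then everything after the first `keep` entries is old.
    dated.sort(reverse=True)
    return [name for _, name in dated[keep:]]
```

The function only returns names; actually deleting the directories (for example with shutil.rmtree) is left to the caller, so the selection can be checked before anything is removed.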

Great source of inspiration here: http://www.mikerubel.org/computers/rsync_snapshots/

Update:

sample backup:

# generate some test data
mkdir /tmp/backup
mkdir documents
date > documents/file1
date > documents/file2

# do the backup
rsync -av --delete documents /tmp/backup/
cp -al /tmp/backup /tmp/backup.$(date +%Y%m%d-%H%M%S)

Now you should have a master backup with the current state (/tmp/backup) and a dated backup (/tmp/backup.YYYYMMDD-HHMMSS). For the next backup, just run the rsync and cp again:

# modify the test data
date >> documents/file1

# do the backup
rsync -av --delete documents /tmp/backup/
cp -al /tmp/backup /tmp/backup.$(date +%Y%m%d-%H%M%S)

Note that rsync will only update files that have changed, so in terms of backup time this is optimised. As you're using hard links for files that are unchanged, it is very efficient for storage too.
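Since the question asked for Python, the same two steps could be driven from a script roughly like this. It is a sketch, not the wrapper mentioned above: sync_master and take_snapshot are made-up names, sync_master assumes the rsync binary is installed, and take_snapshot uses shutil.copytree with os.link as the copy function as a stand-in for cp -al.

```python
import os
import shutil
import subprocess
from datetime import datetime


def sync_master(source, master):
    """Step 1: bring the master copy up to date (needs rsync on PATH)."""
    subprocess.run(
        ["rsync", "-a", "--delete", source, os.path.join(master, "")],
        check=True,
    )


def take_snapshot(master, when=None):
    """Step 2: create master.YYYYMMDD-HHMMSS with hard links, not copies."""
    master = os.path.normpath(master)
    when = when or datetime.now()
    target = f"{master}.{when:%Y%m%d-%H%M%S}"

    # copy_function=os.link hard-links every regular file instead of
    # copying it; symlinks=True recreates symlinks as symlinks.
    shutil.copytree(master, target, symlinks=True, copy_function=os.link)
    return target


if __name__ == "__main__":
    sync_master("documents", "/tmp/backup")
    print("snapshot created at", take_snapshot("/tmp/backup"))
```

This relies on rsync's default behaviour of writing a changed file to a new file and renaming it into place, which leaves the older snapshots' hard links pointing at the old content. Avoid rsync's --inplace option here, since it would modify the shared data and change every snapshot.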


Just use Mercurial - a great source control system written in Python. It's also one of the most popular SCMs these days.
