
Backup of folder + database - Python

I feel like this is quite delicate,

I have various folders with projects that I would like to back up into a zip/tar file, but I would like to avoid backing up files such as .pyc files and temporary files.

I also have a Postgres DB I need to back up.


Any tips for running this operation as a python script?

Also, is there any way to stop the backup from hogging resources while it runs?


Help would be very much appreciated.


If you're on Linux (or any other form of Unix, such as Mac OS X), a simple way to reduce a process's priority -- and therefore, indirectly, its consumption of CPU if other processes want some -- is the nice command. In Python (same OSs), os.nice lets your program "make itself nicer" (that is, lower its own priority).
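As a rough sketch of the os.nice route, a backup script could lower its own priority before it starts any heavy work. The helper name lower_priority and the default increment of 10 are assumptions for illustration, and os.nice is only available on Unix-like systems:

```python
import os


def lower_priority(increment=10):
    """Raise this process's niceness by `increment` and return the new value.

    Hypothetical helper for illustration. A higher niceness means a lower
    scheduling priority. An unprivileged process can raise its niceness
    but cannot lower it again afterwards.
    """
    return os.nice(increment)


if __name__ == "__main__":
    print("niceness is now", lower_priority())
```

Calling this once at the top of the script is enough; child processes started afterwards inherit the niceness.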

For backing up a PostgreSQL DB, I recommend PostgreSQL's own tools; for zipping up a folder except the pyc files (and temporary files -- however it is you identify those), Python is quite suitable. For example:

>>> import os, zipfile
>>> os.chdir('/tmp/az')
>>> f = open('/tmp/a.zip', 'wb')
>>> z = zipfile.ZipFile(f, 'w')
>>> for root, dirs, files in os.walk('.'):
...   for fn in files:
...     if fn.endswith('.pyc'): continue
...     fp = os.path.join(root, fn)
...     z.write(fp)
... 
>>> z.close()
>>> f.close()

This zips all files in said subtree except those ending in .pyc (without compression -- if you want compression, add a third argument, zipfile.ZIP_DEFLATED, to the zipfile.ZipFile call). Could hardly be easier.
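The same idea can be wrapped in a reusable function with compression turned on. This is only a sketch: the name zip_tree and the patterns in EXCLUDE_PATTERNS are assumptions, so substitute whatever identifies temporary files on your system.

```python
import fnmatch
import os
import zipfile

# Assumed exclusion patterns; adjust to however you identify temporary files.
EXCLUDE_PATTERNS = ("*.pyc", "*.tmp", "*~")


def zip_tree(src_dir, zip_path, exclude=EXCLUDE_PATTERNS):
    """Zip everything under src_dir into zip_path, skipping excluded names.

    Returns the list of relative paths that were written.
    """
    written = []
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(src_dir):
            for name in files:
                if any(fnmatch.fnmatch(name, pat) for pat in exclude):
                    continue
                full_path = os.path.join(root, name)
                # Never add the archive to itself if it lives inside src_dir.
                if os.path.abspath(full_path) == zip_abs:
                    continue
                # Store paths relative to src_dir so the archive is relocatable.
                arcname = os.path.relpath(full_path, src_dir)
                archive.write(full_path, arcname)
                written.append(arcname)
    return written
```

Unlike the session above, this does not need os.chdir, because each member is stored under its path relative to src_dir.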


On Linux, you can use tar with the --exclude option. For example, to exclude your .pyc files and temp files (in this example, .tmp), where project_dir is a placeholder for the folder you want to archive:

$ tar zcvf backup.tar.gz --exclude "*.tmp" --exclude "*.pyc" project_dir

The z option compresses the archive with gzip as well.
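If you would rather stay in Python, the standard tarfile module can do roughly the same thing. A minimal sketch, assuming the same two patterns; the function name tar_tree is made up for this example:

```python
import fnmatch
import os
import tarfile


def tar_tree(src_dir, tar_path, exclude=("*.pyc", "*.tmp")):
    """Write src_dir to a gzip-compressed tarball, skipping excluded names."""

    def skip_excluded(tarinfo):
        # Returning None from the filter drops that member from the archive.
        base = os.path.basename(tarinfo.name)
        if any(fnmatch.fnmatch(base, pat) for pat in exclude):
            return None
        return tarinfo

    # Store members under the folder's own name rather than its full path.
    top_level = os.path.basename(os.path.abspath(src_dir))
    with tarfile.open(tar_path, "w:gz") as archive:
        archive.add(src_dir, arcname=top_level, filter=skip_excluded)
```

The "w:gz" mode plays the role of tar's z option here.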


With today's multicore CPUs, you may find that the CPU is not the bottleneck. It is now far more likely to be the disk I/O that needs to be shared better.

Linux has the ionice command to allow you to control this:

ionice(1)

NAME

   ionice - get/set program io scheduling class and priority

SYNOPSIS

   ionice [[-c class] [-n classdata ] [-t]] -p PID [PID ...]

   ionice [-c class] [-n classdata ] [-t] COMMAND [ARG ...]

DESCRIPTION
This program sets or gets the io scheduling class and priority for a program. If no arguments or just -p is given, ionice will query the current io scheduling class and priority for that process.
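From a Python script, one way to use this is to prefix whatever backup command you launch with nice and ionice. The sketch below only builds and runs the command line; the helper names are invented for illustration, class 3 is ionice's "idle" class, and how much effect it has depends on the kernel's I/O scheduler.

```python
import subprocess


def with_low_priority(command, io_class=3, niceness=19):
    """Return `command` prefixed so it runs under nice and ionice.

    Hypothetical helper. io_class=3 asks for the "idle" I/O class, and
    niceness=19 asks for the lowest CPU priority.
    """
    prefix = ["nice", "-n", str(niceness), "ionice", "-c", str(io_class)]
    return prefix + list(command)


def run_low_priority(command):
    """Run `command` at low CPU and I/O priority; raise if it fails."""
    return subprocess.run(with_low_priority(command), check=True)
```

For example, run_low_priority(["tar", "zcf", "backup.tar.gz", "some_dir"]) would run the archive step politely, where some_dir stands in for your own folder.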


Backing up is at least as much about being able to recover from whatever backup you make.

The right way to back up source code is to keep source files in a VCS (version control system), and back up the VCS repository. Exclude any auto-generated easily-replaced files (like those *.pyc files, etc.) from the VCS repository. I recommend Bazaar for very efficient storage and user-friendliness, but your team will likely already have a VCS they prefer.

For backup of a PostgreSQL database, it's best to use pg_dump to regularly dump the database to a text file, compress that, and back up the result. This is because the backup then becomes restorable on any machine, by re-playing the database dump into another PostgreSQL server.
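If you do want to drive the dump from Python, a sketch along these lines streams pg_dump's output straight into a gzip file. The function names and the database name in the usage note are placeholders, and connection details (host, user, password) are assumed to come from the usual PostgreSQL environment variables or a .pgpass file.

```python
import gzip
import shutil
import subprocess


def pg_dump_command(dbname):
    """Build a pg_dump call that writes a plain-text SQL dump to stdout."""
    return ["pg_dump", "--format=plain", dbname]


def stream_to_gzip(command, out_path):
    """Run `command` and gzip its standard output into out_path."""
    with gzip.open(out_path, "wb") as out:
        proc = subprocess.Popen(command, stdout=subprocess.PIPE)
        # Copy in chunks so a large dump never has to fit in memory.
        shutil.copyfileobj(proc.stdout, out)
        proc.stdout.close()
        returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, command)
```

Usage would look like stream_to_gzip(pg_dump_command("mydb"), "mydb.sql.gz"). The result can be restored by decompressing it and feeding the SQL to psql on the target server.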

As for how to automate it: you would be best using a Bash program for the purpose, since it's just a matter of connecting some commands to files, which is what the shell excels at.
