
Python: copy only missing files from FTP dirs and sub-dirs to local dirs and sub-dirs

The problem is:

I have a local directory '/local' and a remote FTP directory '/remote' full of sub-directories and files. I want to check whether there are any new files in the sub-directories of '/remote'. If there are any, I want to copy them over to '/local'.

The question is:

Am I using the right strategy? Is this totally overkill, and is there a much faster, more Pythonic way to do it? DISCLAIMER: I'm a Python n00b trying to learn, so be gentle... =) This is what I've tried:

Create a list of all files in /local and its sub-dirs.

import os

LocalFiles = []
for path, subdirs, files in os.walk(localdir):
    for name in files:
        LocalFiles.append(name)  # bare filenames only; the path is dropped

Do some ftplib magic, using ftpwalk() and copying its results to a list of the form:

RemoteFiles = [['/remote/dir1/', '/remote/dir1/', '/remote/dir3/'],
               ['file1.txt', 'file12.py', 'file3.zip']]
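For reference, ftpwalk() isn't part of ftplib itself; here is a minimal sketch of what such a walk might look like, assuming Python 3.3+ and a server that supports the MLSD command (the host name and the '/remote' start path are placeholders):

from ftplib import FTP

def ftpwalk(ftp, top):
    """Recursively yield (dirpath, filenames) pairs, like os.walk."""
    dirs, files = [], []
    for name, facts in ftp.mlsd(top):
        if name in ('.', '..'):
            continue
        if facts.get('type') == 'dir':
            dirs.append(name)
        elif facts.get('type') == 'file':
            files.append(name)
    yield top, files
    for d in dirs:
        yield from ftpwalk(ftp, top.rstrip('/') + '/' + d)

# Build the parallel dir/file lists in the format above:
ftp = FTP('ftp.example.com')  # hypothetical host
ftp.login()
RemoteFiles = [[], []]
for dirpath, filenames in ftpwalk(ftp, '/remote'):
    for name in filenames:
        RemoteFiles[0].append(dirpath + '/')
        RemoteFiles[1].append(name)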

so I have the directory corresponding to each file. Then see which files are missing by comparing the lists of filenames,

missing_files = list(set(RemoteFiles[1]) - set(LocalFiles))

and once I've found their names, I look up the directory that came with each name,

MissingDirNFiles = []
for filename in missing_files:
    theindex = RemoteFiles[1].index(filename)

which lets me build the list of missing files and their directories,

    MissingDirNFiles.append([RemoteFiles[0][theindex], RemoteFiles[1][theindex]])

so I can copy them over with ftp.retrbinary. Is this a reasonable strategy? Any tips, comments and advice are appreciated [especially for large numbers of files].
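For the copying step, this is a minimal sketch of the retrbinary loop, assuming the directory layout under '/remote' should be mirrored under '/local' (that path mapping is my assumption, not something fixed by the problem):

import os

for remote_dir, filename in MissingDirNFiles:
    # Assumption: '/remote/...' mirrors onto '/local/...'
    local_dir = remote_dir.replace('/remote', '/local', 1)
    os.makedirs(local_dir, exist_ok=True)  # create any missing local sub-dirs
    with open(os.path.join(local_dir, filename), 'wb') as f:
        ftp.retrbinary('RETR ' + remote_dir + filename, f.write)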


If you get the modification time of both the local and the remote FTP directories and store it in a database, you can prune the search for new or modified files. This should speed up the sync procedure significantly.
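A minimal sketch of that idea, assuming the server reports a 'modify' fact via MLSD (support varies between servers) and using sqlite3 as the database; directories whose stored timestamp hasn't changed could then be skipped:

import sqlite3

db = sqlite3.connect('sync_state.db')
db.execute('CREATE TABLE IF NOT EXISTS dir_mtimes (path TEXT PRIMARY KEY, mtime TEXT)')

def dir_changed(path, mtime):
    """True if a directory is new or its remote timestamp differs from the stored one."""
    row = db.execute('SELECT mtime FROM dir_mtimes WHERE path = ?', (path,)).fetchone()
    if row is not None and row[0] == mtime:
        return False  # unchanged since the last sync: prune this directory
    db.execute('INSERT OR REPLACE INTO dir_mtimes VALUES (?, ?)', (path, mtime))
    db.commit()
    return True

Note that a directory's timestamp usually only changes when entries are added or removed directly inside it, not deeper in the tree, so check how your server behaves before pruning whole subtrees.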
