How to get files as they are added to a remote server
I am using a bash script (below) on a remote server, which I connect to over ssh, to run a Python script in a loop. The Python script downloads a lot of PDF files one at a time, taking the download locations from a text file of URLs.
I would like to move the files from the remote server to my local computer as they are downloaded, and then delete the file from the remote server. Is there a way that I can expand my bash script to do this? Or are there alternatives for completing this task?
while read line; do python python_script.py -l $line; done < pdfURLs.txt
[Edited to reflect the fact that the original poster can't scp into his local computer from the server; I assume it's behind NAT or something of the sort]
[Edit 2: I'm keeping the current tunnel-based answer, for reference; but, since the original poster is unable to ssh back into his local machine, I'll assume something else is blocking the tunnel. See the suggestion at the end].
OK, you'll need to open a tunnel between the server and your home computer. To do that, ssh from your local computer (I assume it's Unix-based; you mentioned it's a Mac, so that's fine) into the server with this command:
ssh -R 10022:localhost:22 your_server_address
In brief, this will forward the server's port 10022 (it's a high (> 1024) port, so it's likely to be available) to your local computer's port 22 (which is where ssh usually listens). That is, once you've done that, if you ssh into the server's 10022 port, you're actually sshing into your local computer. If you want to test it, from the server, do:
ssh -p 10022 localhost
Log in with your local computer's username and password, and you should see its shell prompt. If you do this test, remember to log out, so as not to confuse yourself.
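If you would rather check the tunnel without opening an interactive shell, something along these lines might do. It is only a sketch: the function name `tunnel_ok` is made up for illustration, and because `BatchMode=yes` forbids password prompts, the check also fails when password-less login is not set up yet.

```shell
# tunnel_ok: run on the server. Succeeds only if a non-interactive ssh
# login through the forwarded port 10022 works.
# (tunnel_ok is a hypothetical helper name, not part of ssh.)
tunnel_ok() {
    # BatchMode=yes: never prompt for a password, just fail instead.
    # ConnectTimeout=5: give up after 5 seconds if nothing answers.
    ssh -p 10022 -o BatchMode=yes -o ConnectTimeout=5 localhost true
}

# Example use:
#   tunnel_ok && echo "tunnel is up" || echo "tunnel is down or needs a password"
```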
Once you've opened the tunnel, keep that connection open. You may use it to run the bash command line that downloads the PDF etc, but that's not necessary.
Then, try the following command line:
while read line; do python python_script.py -l "$line"; scp -P 10022 *.pdf localhost:path/to/put/files/; rm *.pdf; done < pdfURLs.txt
A few things to keep in mind:
- This waits until scp has finished, and only then does the Python script download the next PDF. You mentioned this is effectively what you wanted, so that the PDF files don't stay on the server for long.
- This copies all PDF files from the current directory to your local computer (and then erases them), so preferably run this from a previously empty directory.
- I assume you can scp without having to type a password (using public-key authentication, for instance); otherwise it might get a bit annoying, having to retype your password all the time.
That should do it.
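One caveat with the loop above: `rm *.pdf` runs even if `scp` failed, so a dropped tunnel would cost you the file. Here is a rough sketch of a variant that deletes each PDF only after its copy succeeded. The function name `ship_pdfs` and its destination argument are hypothetical, and it assumes the tunnel on port 10022 described earlier.

```shell
# ship_pdfs DEST: run on the server, in the download directory.
# Copies every PDF through the reverse tunnel (port 10022) and removes
# a file only when its own copy succeeded.
# DEST is a path on the local machine (hypothetical placeholder).
ship_pdfs() (
    dest=$1
    status=0
    for f in ./*.pdf; do
        [ -e "$f" ] || continue      # the glob matched nothing
        if scp -P 10022 "$f" "localhost:$dest"; then
            rm -- "$f"
        else
            echo "transfer failed, keeping $f" >&2
            status=1
        fi
    done
    exit "$status"                   # body is a subshell, so this only ends the function
)

# Example use, replacing the scp/rm pair in the loop:
#   while read -r line; do python python_script.py -l "$line"; ship_pdfs path/to/put/files/; done < pdfURLs.txt
```

A file that could not be sent stays in the directory and is retried on the next pass of the loop.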
[Edited to add this alternative, for when the tunnel doesn't work]
If that fails, I can only assume something else is blocking your ssh/scp from the server to your local machine. In that case, you may try something different: from your local machine, run
while read line; do ssh -n server_address "cd tmp_download_directory && rm -f *.pdf && python python_script.py -l '$line'" && scp server_address:tmp_download_directory/*.pdf /local/path/to/put/files/; done < pdfURLs.txt; ssh server_address "rm -f tmp_download_directory/*.pdf"
(The "-n" switch to ssh is necessary so that ssh does not read standard input and swallow the subsequent lines of pdfURLs.txt.)
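If the one-liner gets unwieldy, the same idea can be written as a small function that you run on your local machine. This is a sketch only: `fetch_one` is a made-up name, `server_address` and `tmp_download_directory` are the placeholders used above, and the single quotes around the URL assume the URL itself contains no single quote.

```shell
# fetch_one SERVER URL DEST: run on the local machine.
# Downloads one URL on the server, pulls the resulting PDFs to DEST,
# then removes them from the server. Stops at the first failing step.
fetch_one() (
    server=$1
    url=$2
    dest=$3
    # -n keeps ssh from reading stdin (the URL list, when called in a loop).
    # The single quotes protect characters such as & and ? from the remote shell.
    ssh -n "$server" "cd tmp_download_directory && python python_script.py -l '$url'" || exit 1
    # The quoted glob is expanded on the server side, not locally.
    scp "${server}:tmp_download_directory/*.pdf" "$dest" || exit 1
    ssh -n "$server" "rm -f tmp_download_directory/*.pdf"
)

# Example use:
#   while read -r line; do fetch_one server_address "$line" /local/path/to/put/files/; done < pdfURLs.txt
```

Unlike the one-liner, this cleans up on the server after each transfer instead of before each download, so a failed copy leaves the PDF on the server rather than deleting it.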