Downloading an FTP link through a script that changes the address to HTTP
I wrote a script in Tcl that grabs links out of the download section of a huge document, treating anything starting with http:// or ftp:// as a link to download. None of the ftp:// links require a username or password, so instead of handling the two cases separately (passing ftp:// links to one download method and http:// links to another), I pass all links to one method and substitute http:// for ftp://.
e.g. if I have ftp://server.com/dir/big_file.zip I would pass that along as http://server.com/dir/big_file.zip and download it as that.
I haven't run into any problems testing this with a small sample (testing takes forever because of the file sizes), but before I run it overnight to download everything, I want to know whether there are any possible dangers. I only need to download, not upload, and I'm sure none of the FTP links need a username or password.
Also, I know this is probably bad practice but what exactly is the difference between having ftp:// and http:// for a file link when there's no username/password?
If they are all from the same server, it shouldn't pose any authentication problems (if it worked for some, it should work for all of them). FTP and HTTP are different protocols on different default ports (21 for the FTP control connection, 80 for HTTP), so switching the scheme means you are talking to a different service on that host, not just a different port. FTP is sometimes faster than HTTP, but that depends on the server and the network, so it may be worth using FTP where you can.
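To make the port difference concrete, here is a minimal Python sketch (not the asker's Tcl script) that shows which host and default port each form of the URL points at. The `DEFAULT_PORTS` table and the `endpoint` helper are names made up for this illustration; the URL is the example from the question.

```python
from urllib.parse import urlsplit

# Well-known default ports; an explicit port in the URL overrides these.
DEFAULT_PORTS = {"ftp": 21, "http": 80, "https": 443}


def endpoint(url):
    """Return (scheme, host, port) that a client would connect to for url."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port


print(endpoint("ftp://server.com/dir/big_file.zip"))   # ('ftp', 'server.com', 21)
print(endpoint("http://server.com/dir/big_file.zip"))  # ('http', 'server.com', 80)
```

Same host and path, but a different service answers on each port, which is why the rewritten link only works when the server happens to publish the same files over both.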
Bear in mind that it's entirely possible for a server to make a file accessible via FTP without doing so for HTTP. I'd go so far as to say it's fairly common for that to be the case. That being said, if the server you're hitting does serve all files up in both protocols, then you should be fine.
If some files may not be available via HTTP, one option is to fall back to the original FTP URL whenever the HTTP request fails.
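A rough sketch of that fallback idea in Python, assuming anonymous FTP and that a failed HTTP request surfaces as an exception. The helper names (`to_http`, `fetch`, `download_with_fallback`) are invented for this example; the same structure should carry over to a Tcl script.

```python
import shutil
import urllib.error
import urllib.request


def to_http(url):
    """Rewrite an ftp:// URL to http://; leave anything else untouched."""
    if url.startswith("ftp://"):
        return "http://" + url[len("ftp://"):]
    return url


def fetch(url, dest, timeout=60):
    """Stream url to dest. urllib handles both http:// and ftp:// URLs."""
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)


def download_with_fallback(url, dest, fetch=fetch):
    """Try the HTTP form first; if it fails, retry with the original URL.

    Returns the URL that actually succeeded.
    """
    candidate = to_http(url)
    try:
        fetch(candidate, dest)
        return candidate
    except (urllib.error.URLError, OSError):
        if candidate == url:
            raise  # the link was not FTP to begin with, nothing to fall back to
        fetch(url, dest)  # reopens dest in "wb" mode, discarding any partial file
        return url
```

Logging the returned URL for each file makes it easy to see after an overnight run which links were not actually served over HTTP. One caveat: a server that answers a missing file with a normal-looking HTML page instead of an error status would not trigger the fallback, so checking the downloaded file sizes afterwards is a sensible precaution.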