checksum remote file
Is there a program I can run from the command line that computes a checksum of a remote file? For instance, a checksum of https://stackoverflow.com/opensearch.xml
I want to be notified when a new rss/xml entry is available. I was thinking I could compute a checksum of the file every once in a while, and if it differs then there must be an update. I'm looking to write a shell script that checks for new rss/xml data.
A quick way to do this with curl is to pipe the output to sha1sum as follows:
curl -s http://stackoverflow.com/opensearch.xml|sha1sum
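To turn that one-liner into the periodic check the question asks about, the checksum has to be remembered between runs. The following is only a minimal sketch of one way to do that: the function name content_changed and the state file path are made up for illustration, and it assumes sha1sum is available as in the command above.

```shell
#!/bin/sh
# Sketch: detect whether downloaded content differs from the previous run.
# content_changed and the state file location are hypothetical names.

# content_changed STATE_FILE
# Reads content on stdin, compares its SHA-1 with the checksum stored in
# STATE_FILE, stores the new checksum, and returns 0 if the content is new
# or different, 1 if it is identical to the last run.
content_changed() {
    state_file=$1
    new_sum=$(sha1sum | cut -d' ' -f1)
    old_sum=$(cat "$state_file" 2>/dev/null || true)
    printf '%s\n' "$new_sum" > "$state_file"
    [ "$new_sum" != "$old_sum" ]
}

# Usage (network access assumed):
#   curl -s https://stackoverflow.com/opensearch.xml \
#     | content_changed "$HOME/.opensearch.sha1" && echo "feed updated"
```

Note that a failed download produces empty input, which this sketch would report as a change, so a real script should also check curl's exit status.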
To compute a checksum of the file, you have to download it first. Instead, send an If-Modified-Since header with your request: the server will respond with 304 Not Modified and no content if the file has not changed, or with the content of the file if it has. You may also want to check whether the server supports ETag.
If downloading the file is not a problem, you can use md5_file (a PHP function) to get the MD5 checksum of the file.
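The conditional request described above could look roughly like this with curl and an ETag. This is a sketch under assumptions, not a definitive implementation: check_feed, status_verdict and the file arguments are hypothetical names, and it only helps if the server actually sends an ETag header and honours If-None-Match.

```shell
#!/bin/sh
# Sketch only: conditional GET with an ETag saved from the previous run.
# check_feed, status_verdict and the file arguments are hypothetical names.

# Translate an HTTP status code into a verdict.
status_verdict() {
    case "$1" in
        200) echo changed ;;
        304) echo unchanged ;;
        *)   echo "error $1" ;;
    esac
}

# check_feed URL OUTPUT_FILE ETAG_FILE
# Sends If-None-Match when an earlier run saved an ETag; on 200 it stores
# the body in OUTPUT_FILE and the new ETag in ETAG_FILE.
check_feed() {
    url=$1; out=$2; etag_file=$3
    etag=$(cat "$etag_file" 2>/dev/null || true)
    headers=$(mktemp); body=$(mktemp)
    if [ -n "$etag" ]; then
        code=$(curl -s -o "$body" -D "$headers" -w '%{http_code}' \
            -H "If-None-Match: $etag" "$url" || true)
    else
        code=$(curl -s -o "$body" -D "$headers" -w '%{http_code}' "$url" || true)
    fi
    if [ "$code" = 200 ]; then
        mv "$body" "$out"
        # Header names are case-insensitive; strip the trailing CR.
        sed -n 's/^[Ee][Tt][Aa][Gg]:[[:space:]]*//p' "$headers" \
            | tr -d '\r' | tail -n 1 > "$etag_file"
    fi
    rm -f "$headers" "$body"
    status_verdict "$code"
}

# Usage (network access assumed):
#   check_feed https://stackoverflow.com/opensearch.xml opensearch.xml opensearch.etag
```

The download goes to a temporary file first so that a 304 or an error never overwrites the last good copy.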
curl
curl has an '-z' option:
-z/--time-cond <date expression>|<file>
(HTTP/FTP) Request a file that has been modified later
than the given time and date, or one that has been modified before
that time. The <date expression> can be all sorts of date strings
or if it doesn't match any internal ones, it is taken as a filename
and tries to get the modification date (mtime) from <file> instead.
See the curl_getdate(3) man pages for date expression details.
So what you can do is:
$ curl http://stackoverflow.com/opensearch.xml -z opensearch.xml -o opensearch.xml
This performs the actual download only if the remote file is newer than the local one (the local file may be absent, in which case it will be downloaded), which seems to be exactly what you need.
wget
wget also has an option to track timestamps: -N
When running Wget with -N, with or without -r or -p, the decision as to whether
or not to download a newer copy of a file depends on the local and remote
timestamp and size of the file.
-N, --timestamping Turn on time-stamping.
So with wget one can use:
$ wget -N http://stackoverflow.com/opensearch.xml
You can try this in bash:
wget <http://your file link>
md5sum <your file name>
You should first examine the HTTP headers to see if the server itself is willing to tell you when the file is from; it's considered bad form to fetch the entire file if you don't need to.
Otherwise, you'll need to use something like wget or curl to fetch the file, so I really hope you don't plan to be working with anything large.
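As a rough illustration of examining the headers first, curl's -I option sends a HEAD request and prints only the response headers. The helper below, header_value, is a hypothetical name introduced for this sketch; whether the server sends Last-Modified or ETag at all is an assumption you would need to verify.

```shell
#!/bin/sh
# Sketch: read one response header without downloading the body.
# header_value is a hypothetical helper name.

# header_value NAME
# Prints the value of header NAME (case-insensitive) from headers on stdin.
header_value() {
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    tr -d '\r' | awk -v want="$name" '
        {
            i = index($0, ":")
            if (i > 0 && tolower(substr($0, 1, i - 1)) == want) {
                value = substr($0, i + 1)
                sub(/^[ \t]+/, "", value)
                print value
            }
        }'
}

# Usage (network access assumed):
#   curl -sI https://stackoverflow.com/opensearch.xml | header_value Last-Modified
```

Storing the printed value and comparing it on the next run gives a change check that never transfers the file itself.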