Bizarre download of webpage using wget and curl
I'm trying to download some remote pages. The source of each page contains one very long line. Both curl and wget download the file, but each of them leaves out that one line. Is there another command-line utility I can use, or does anyone know how I can fix this problem?
Edit: To clarify, I have tried both wget and curl, and both downloaded files are missing the line.
Edit:
[x@x scripts]$ curl --version
curl 7.15.5 (x86_64-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5
Protocols: tftp ftp telnet dict ldap http file https ftps
Features: GSS-Negotiate IDN IPv6 Largefile NTLM SSL libz
[x@x scripts]$ wget --version
GNU Wget 1.11.4 Red Hat modified
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://www.gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Originally written by Hrvoje Niksic <hniksic@xemacs.org>.
Currently maintained by Micah Cowan <micah@cowan.name>.
There are two probable explanations for what's happening:
- The server looks at the user agent and decides not to include this line. This is the less likely of the two, but wget allows you to change the user agent string, so you should be able to work around it easily.
- The long line is constructed on the client, using JavaScript. This is far more likely, but unfortunately for you, not easy to replicate in a command-line environment.
To verify, use a tool such as Fiddler to look at what's actually coming over the wire.
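A quick way to test the first explanation from the command line is to fetch the page twice, once with the default user agent and once with a browser-like one, and compare the results. The following is only a rough sketch: the URL, the user agent string, the output file names, and the helper names `fetch_as` and `compare_pages` are all placeholders made up for illustration. It relies on curl's `-A` option; `wget -U` sets the user agent in the same way.

```shell
#!/bin/sh
# Hypothetical placeholders: substitute the real page and a real browser UA string.
URL="${URL:-http://example.com/page.html}"
BROWSER_UA="${BROWSER_UA:-Mozilla/5.0 (X11; Linux x86_64)}"

# Fetch URL $1 into file $3, sending $2 as the User-Agent header.
# Equivalent with wget: wget -q -U "$2" -O "$3" "$1"
fetch_as() {
    curl -s -A "$2" -o "$3" "$1"
}

# Print the byte counts of two files, then "identical" or "different".
compare_pages() {
    wc -c "$1" "$2"
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}

# Usage (needs network access):
#   fetch_as "$URL" "curl/7.15.5" default.html
#   fetch_as "$URL" "$BROWSER_UA" browser.html
#   compare_pages default.html browser.html
```

If the two files differ and the browser-like fetch contains the long line, the server is probably filtering on the user agent. If both files are identical and both lack the line, the JavaScript explanation becomes more likely. Pages with dynamic content (timestamps, session tokens) can differ for unrelated reasons, so a "different" result still needs a manual look.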
Please post the versions of wget and curl you are using. How long is that line?
Why not use curl or wget? Both are great tools for that!
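To answer the question about line length, something like the following could be run against a copy of the page saved from a browser. It is just a sketch; the function name `longest_line` and the file name in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Print the length of the longest line in file $1, followed by its line number.
# Prints "0 0" for an empty file.
longest_line() {
    awk '{ n = length($0); if (n > max) { max = n; line = NR } }
         END { print max + 0, line + 0 }' "$1"
}

# Usage:
#   longest_line saved_from_browser.html
```

Running it on both the browser-saved copy and the curl or wget copy shows whether the long line is really absent from the download or only appears that way in an editor or pager.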