Crawling itunes.apple.com

2023-01-09 21:11

I am trying to crawl the Apple iTunes website. I am getting output in binary format. For example:

curl -A "mozilla/5.0" http://itunes.apple.com/us/app/the-far-islands-by-john-buchan/id327765949?mt=8

returns binary.

Can anybody please tell me what I am missing?

Thanks

You're getting binary back because the page you cited isn't returning HTML/XML; it's returning an Apple WebObject. Here is what wget reports:

wget http://itunes.apple.com/us/app/the-far-islands-by-john-buchan/id327765949?mt=8
--2010-08-03 12:38:14--  http://itunes.apple.com/us/app/the-far-islands-by-john-buchan/id327765949?mt=8
Resolving itunes.apple.com... 17.250.237.16
Connecting to itunes.apple.com|17.250.237.16|:80... connected.
HTTP request sent, awaiting response... 200 Apple WebObjects
Length: 22900 (22K) [text/html]
Saving to: `id327765949?mt=8'

100%[======================================>] 22,900      --.-K/s   in 0.05s   

2010-08-03 12:38:14 (440 KB/s) - `id327765949?mt=8' saved [22900/22900]

See the Wikipedia article on WebObjects for more info. If you want to crawl the page, you may need a tool that simulates a browser and can therefore interpret the response; Watir might work.
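
Before reaching for a browser simulator, it may be worth checking what the "binary" bytes actually are. The thread does not confirm the cause, so this is only a sketch under one assumption: that the server sends a gzip-compressed body, which always starts with the bytes 1f 8b. The helper names decode_body and fetch are made up for this example.

```python
import gzip
import urllib.request

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_body(raw: bytes, encoding: str = "utf-8") -> str:
    """Return the body as text, gunzipping it first if it looks compressed."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    # errors="replace" keeps the sketch from failing on unexpected bytes
    return raw.decode(encoding, errors="replace")


def fetch(url: str, user_agent: str = "mozilla/5.0") -> str:
    """Download url with a browser-like User-Agent and decode the body."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return decode_body(response.read())
```

Calling fetch() with the URL from the question and printing the first few hundred characters should show readable HTML if the assumption holds. If the bytes do not start with 1f 8b, the body comes back unchanged and the assumption was wrong. On the command line, curl's --compressed option requests a compressed response and decodes it automatically.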
