Looking for an Open Source Web Crawler that can crawl API requests and parse XML into csv [closed]
Closed 8 years ago.
I'm looking into web crawlers that can crawl an API and parse the returned XML into an XML or CSV file.
I've been playing around with requests against some API feeds, but it would be great if I didn't have to do this manually and could use something to fetch the data automatically and edit it later.
For example, using the API for a site called Eventful, I can request an XML feed of data:
http://api.eventful.com/rest/events/search?app_key=LksBnC8MgTjD4Wc5&location=pittsburgh&date=Future
If you open the link you can see that a ton of XML data is sent back.
I thought that since the XML data is already broken down into elements, it shouldn't be too difficult for the crawler to handle the sorting (e.g. the city element would send its data to a city column in the CSV file).
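To make concrete what I mean by "handling the sorting", here is a rough Python sketch of the fetch-and-flatten step I would want a crawler to do for me. The element names (event, title, city, start_time) are just my guesses at the feed's structure, and YOUR_APP_KEY is a placeholder:

# Rough sketch, not a full crawler: fetch the Eventful XML feed and
# flatten selected child elements of each <event> into CSV columns.
# Element names below are assumptions -- adjust to the actual feed.
import csv
import requests
import xml.etree.ElementTree as ET

URL = "http://api.eventful.com/rest/events/search"
params = {
    "app_key": "YOUR_APP_KEY",   # placeholder -- substitute a real key
    "location": "pittsburgh",
    "date": "Future",
}
FIELDS = ["title", "city", "start_time"]  # assumed element names

resp = requests.get(URL, params=params, timeout=30)
resp.raise_for_status()
root = ET.fromstring(resp.content)

with open("events.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    # .iter() walks the whole tree, so this finds <event> elements
    # no matter how deeply they sit under the root.
    for event in root.iter("event"):
        writer.writerow({name: (event.findtext(name) or "") for name in FIELDS})

Something that does this generically, without me hard-coding the element names for every feed, is what I'm after.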
I'm wondering if anyone has used an existing open source web crawler to crawl APIs and get the parsed data into an Excel-like format.
I looked into Nutch, but I couldn't find any reference in the documentation to sorting an XML response into an Excel-like document based on the elements returned by the API feed.
Has anyone done anything like this before, and can you recommend a program? Specifics would be really helpful.
We at http://import.io/ have a free solution similar to Mozenda: you build the API using our web browser, and then you can upload the API to our servers and use it for free. We also offer a crawler and various other features. Check it out and see what you think :)
P.S. I work for import.io, if you didn't get that already.
I found a paid solution called Mozenda...
I'll update if I can find something open source.