Use one search string to search 4 website catalogs
I frequent many libraries: the Brooklyn Public Libraries, Queens Public Libraries, New York Public Libraries, and the CUNY schools' libraries. When I want a book, I have to go to all 4 online catalogs and search each one. Instead, I want to write a program that takes the book title, author, ISBN, or whatever keywords as a string and then returns 4 search results, as if I had gone to each catalog site manually. I think this would be considered a web crawler. I'm fairly good at following programming tutorials and at googling something when I know what I'm looking for, but I really have no idea where to start and would appreciate some advice. Thanks in advance.
Here are some Python-based scripts and examples of how you can automate the crawling/scraping of each online catalog. This can be done in any language, but Python, in my opinion, would be the simplest.
Simple Web Crawler (Python recipe)
Scrapy
Or, to do it without a prewritten script, you could use urllib2 (urllib.request in Python 3) to get each web page's source and then parse that source with something like BeautifulSoup. With the parsed source, do some keyword checks and display the results.
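To make that last approach concrete, here is a rough sketch in Python 3. It is only an illustration under some assumptions: the four search URL templates are hypothetical placeholders (you would need to run a search on each real catalog and copy its actual URL pattern), the function and class names are made up for this example, and it uses the standard library's html.parser in place of BeautifulSoup so that it runs without installing anything. The keyword check is deliberately crude: it keeps any link whose text contains every word of the query.

```python
import urllib.parse
import urllib.request
from html.parser import HTMLParser

# ASSUMPTION: placeholder URL templates, not the real catalog addresses.
# Replace each one with the search URL pattern of the actual catalog.
CATALOGS = {
    "Brooklyn": "https://brooklyn.example/search?q={query}",
    "Queens": "https://queens.example/search?q={query}",
    "NYPL": "https://nypl.example/search?q={query}",
    "CUNY": "https://cuny.example/search?q={query}",
}


def build_search_urls(query):
    """Return one search URL per catalog for the same query string."""
    encoded = urllib.parse.quote_plus(query)
    return {name: tpl.format(query=encoded) for name, tpl in CATALOGS.items()}


class LinkTextParser(HTMLParser):
    """Collects the visible text of every <a> element on a page."""

    def __init__(self):
        super().__init__()
        self._in_link = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self.links.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False

    def handle_data(self, data):
        if self._in_link:
            self.links[-1] += data


def extract_matches(html, query):
    """Keep link texts that contain every word of the query (case-insensitive)."""
    parser = LinkTextParser()
    parser.feed(html)
    words = query.lower().split()
    return [
        text.strip()
        for text in parser.links
        if text.strip() and all(w in text.lower() for w in words)
    ]


def search_all(query):
    """Fetch each catalog's result page and return the matching link texts."""
    results = {}
    for name, url in build_search_urls(query).items():
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
            results[name] = extract_matches(html, query)
        except OSError as exc:  # covers URLError and timeouts
            results[name] = ["error: %s" % exc]
    return results
```

Once the placeholder URLs are replaced with real ones, calling search_all("moby dick") would return a dictionary with one list of matching titles per catalog. In practice each catalog's result page is laid out differently, so the parsing step would probably need to be adapted per site, which is where a library like BeautifulSoup becomes more convenient.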