First real programming project: create a database from a web dictionary
I want to create a programme that will take a word list, look each word up in a web dictionary (such as Merriam-Webster's Learner's Dictionary or the Cambridge Learner's Dictionary), and create a text file in this format:
word1 pronunciation definition example sentence ... ... word2 pronunciation definition example sentence ... ... ....
I have a few questions:
Is it possible to do this?
If it is, what tools should I use? If it is possible with Python, which libraries should I use? (I prefer Python because it is the language I am learning.) I just need a general idea of the approach to take.
I'm still a big noob at programming, but I think that if I work on some personal project, I'll make good progress.
P.S.: My English is far from perfect, sorry about that.
It would not be that difficult; the main thing would be figuring out how to query the website. These would be the basic steps:
- Map the query string to a URL: you need to figure out how the website works (examine the HTML source to find the parameters of the forms). Some websites have public APIs that make this easier.
- Get the web page: urllib2 (in Python 3, this is urllib.request).
- Parse the page for your answer: BeautifulSoup. Separate your info from the rest of the web page.
- Write the info to a file.
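The steps above can be sketched roughly as follows. This is only a minimal sketch under several assumptions: the URL pattern `dictionary.example.com` and the CSS class names `pronunciation`, `definition`, and `example` are made up for illustration, and a real dictionary site will use something different. To keep the example free of third-party dependencies, it uses the standard library's `html.parser` in place of BeautifulSoup and Python 3's `urllib.request` in place of `urllib2`.

```python
from html.parser import HTMLParser
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical URL pattern; inspect the real site to find the actual one.
BASE_URL = "https://dictionary.example.com/define/{word}"


def build_url(word):
    """Step 1: map the query word to a URL."""
    return BASE_URL.format(word=quote(word))


def fetch(url):
    """Step 2: download the page and return its HTML as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


class EntryParser(HTMLParser):
    """Step 3: keep the first piece of text found in each wanted element."""

    FIELDS = ("pronunciation", "definition", "example")  # assumed class names

    def __init__(self):
        super().__init__()
        self.entry = {}
        self._field = None  # field we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.FIELDS:
            self._field = css_class

    def handle_data(self, data):
        if self._field and data.strip():
            self.entry.setdefault(self._field, data.strip())

    def handle_endtag(self, tag):
        self._field = None


def parse_entry(html):
    parser = EntryParser()
    parser.feed(html)
    return parser.entry


def format_entry(word, entry):
    """Build one tab-separated line: word, pronunciation, definition, example."""
    return "\t".join([word] + [entry.get(f, "") for f in EntryParser.FIELDS])


def write_entries(words, path):
    """Step 4: look up every word and write one line per word to a file."""
    with open(path, "w", encoding="utf-8") as out:
        for word in words:
            html = fetch(build_url(word))
            out.write(format_entry(word, parse_entry(html)) + "\n")


# Made-up page fragment, used to try the parser without any network access.
SAMPLE_HTML = """
<div class="entry">
  <span class="pronunciation">/AP-uhl/</span>
  <p class="definition">a round fruit with firm white flesh</p>
  <p class="example">She ate an apple.</p>
</div>
"""

if __name__ == "__main__":
    print(build_url("ice cream"))
    print(format_entry("apple", parse_entry(SAMPLE_HTML)))
```

Running the script only parses the sample fragment; `write_entries(["apple", "pear"], "words.txt")` would perform the real lookups once `BASE_URL` and the class names match an actual site. Check the site's terms of use before querying it in bulk.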
This is possible, but in order to maintain scalability you will need the right algorithm: http://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_string_matching_algorithm
In Python, this is available as: http://pypi.python.org/pypi/ahocorasick/0.9
Just capture the event where the search tree reaches a state in which a search word is discovered and act upon it. The aforementioned wiki page points you to some useful resources.
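To make the idea concrete, here is a small pure-Python sketch of Aho-Corasick matching. It does not use the `ahocorasick` package linked above (whose API is not shown here); the function names `build_automaton` and `find_words` are just illustrative. Each time the automaton reaches a state where a search word ends, the word and its start position are recorded.

```python
from collections import deque


def build_automaton(words):
    """Build the trie, failure links, and output sets for the given words."""
    goto = [{}]       # goto[state][char] -> next state
    fail = [0]        # fail[state] -> state to fall back to on a mismatch
    output = [set()]  # output[state] -> words that end at this state

    # Insert every word into the trie.
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(word)

    # Breadth-first pass: compute failure links level by level.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            fallback = fail[state]
            while fallback and ch not in goto[fallback]:
                fallback = fail[fallback]
            fail[nxt] = goto[fallback].get(ch, 0)
            # Words ending at the fallback state also end here.
            output[nxt] |= output[fail[nxt]]
    return goto, fail, output


def find_words(text, automaton):
    """Return (start_index, word) for every occurrence of a search word."""
    goto, fail, output = automaton
    state = 0
    hits = []
    for index, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        # A non-empty output set is the "word discovered" event.
        for word in sorted(output[state]):
            hits.append((index - len(word) + 1, word))
    return hits


if __name__ == "__main__":
    automaton = build_automaton(["he", "she", "his", "hers"])
    print(find_words("ushers", automaton))
```

The text is scanned once regardless of how many search words there are, which is where the scalability comes from.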
Greetz, J.