I'm looking for an English language word list
Does anyone know where I could find an English language word list in the form of a SQL dump?
I found a word list online, but it's a large plain text file with the words delimited by a newline character. I tried writing a PHP script to loop through the words and insert them into the database, but quickly ran into memory issues just reading the large file. I've split the file into 4 smaller files, but I'm still getting memory errors. If anyone knows how to convert my current file into a more import-friendly format, please let me know.
Use LOAD DATA INFILE. From the docs:
The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed.
Something like this should work:
LOAD DATA INFILE 'your/path/your_file.txt' INTO TABLE your_table (your_column_name);
http://corpora.uni-leipzig.de/download.html
A couple of corpora in different languages (including English) ...
Your approach should work fine; you just need to change the way you're reading the file. I'm guessing you're using file_get_contents or something similar to read the whole file in, when you could read it line by line and avoid the memory issues. Try something like fscanf():
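$handle = fopen("yourfile.txt", "r");
// Read one line at a time so the whole file is never held in memory.
// The format assumes one word per line, as described in the question.
while ($info = fscanf($handle, "%s\n")) {
    list($word) = $info;
    // ... insert $word into the database
}
fclose($handle);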
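For the insert step itself, here is a rough sketch of how the line-by-line read could feed a database, assuming PDO is available. The function names readWords and importWords and the batch size of 1000 are made up for illustration, and your_table / your_column_name are placeholders you'd swap for your own schema. It uses fgets() rather than fscanf(), which is just another way to read one line at a time, and reuses a single prepared statement inside periodic transactions so that neither PHP nor the database has to hold the whole list at once.

```php
<?php
// Hypothetical helper: yields one trimmed, non-empty word per line
// without loading the whole file into memory.
function readWords(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            $word = trim($line);
            if ($word !== '') {
                yield $word;
            }
        }
    } finally {
        fclose($handle);
    }
}

// Hypothetical helper: inserts every word through one prepared statement,
// committing every $batchSize rows. Table and column names are placeholders.
function importWords(PDO $pdo, string $path, int $batchSize = 1000): int
{
    $stmt = $pdo->prepare('INSERT INTO your_table (your_column_name) VALUES (?)');
    $count = 0;
    $pdo->beginTransaction();
    foreach (readWords($path) as $word) {
        $stmt->execute([$word]);
        if (++$count % $batchSize === 0) {
            $pdo->commit();
            $pdo->beginTransaction();
        }
    }
    $pdo->commit();
    return $count;
}

// Usage (connection details are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=your_database', 'user', 'password',
//     [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// echo importWords($pdo, 'yourfile.txt'), " words imported\n";
```

The generator keeps memory use flat regardless of file size, so splitting the file into smaller pieces shouldn't be necessary with this approach.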
If you're open to using some Python in the mix, here's a good how-to article:
Ways to process and use Wikipedia dumps
It covers pulling Wikipedia data (there's your English text) and pushing it into a MySQL database.