
I'm looking for an English language word list

Does anyone know where I could find an English language word list in the form of a SQL dump?

I found a word list online, but it's a large plain text file with the words delimited by a newline character. I tried writing a PHP script to loop through the words and insert them into the database, but quickly ran into memory issues just reading the large file. I've split the file into 4 smaller files, but I'm still getting memory errors. If anyone knows how to convert my current file into a more import-friendly format, please let me know.


Use LOAD DATA INFILE. From the docs:

The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed.

Something like this should work:

LOAD DATA INFILE 'your/path/your_file.txt' INTO TABLE your_table (your_column_name);


http://corpora.uni-leipzig.de/download.html

A couple of corpora in different languages (including English) ...


Your approach should work fine; you just need to change the way you're reading the file. I'm guessing you're using file_get_contents or something similar to read the whole file in at once, when you could read it line by line and avoid the memory issues. Try something like fscanf():

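$handle = fopen("yourfile.txt", "r");
// fscanf() reads one line per call, so only the current line is held in memory.
// This format expects three tab-separated fields; for one word per line, use "%s\n".
while ($info = fscanf($handle, "%s\t%s\t%s\n")) {
    list($field1, $field2, $field3) = $info;
    // ... do something with the values
}

fclose($handle);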


If you're open to using some Python in the mix, here's a good how-to article:

Ways to process and use Wikipedia dumps

(It covers pulling Wikipedia data, which gives you your English text, and pushing it into a MySQL database.)
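
For a plain newline-delimited word list rather than a Wikipedia dump, the same read-in-Python, push-to-database idea might look roughly like the sketch below. It is not taken from the article. To stay self-contained it uses the standard-library sqlite3 module as a stand-in for a MySQL connection (an assumption), and the table name `words`, the column `word`, and the helper names `iter_words` and `load_words` are all hypothetical. The file is streamed line by line and inserted in batches, so memory use stays small regardless of file size.

```python
import sqlite3
from itertools import islice


def iter_words(path):
    """Yield one word per line without loading the whole file into memory."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            word = line.strip()
            if word:  # skip blank lines
                yield word


def load_words(conn, path, batch_size=1000):
    """Insert the words from `path` in batches; return the number of rows inserted."""
    # Hypothetical schema: a single-column table named `words`.
    conn.execute("CREATE TABLE IF NOT EXISTS words (word TEXT NOT NULL)")
    words = iter_words(path)
    total = 0
    while True:
        # Pull at most `batch_size` words from the stream at a time.
        batch = [(w,) for w in islice(words, batch_size)]
        if not batch:
            break
        # Parameterized insert, so words containing quotes need no manual escaping.
        conn.executemany("INSERT INTO words (word) VALUES (?)", batch)
        conn.commit()
        total += len(batch)
    return total
```

Usage would be along the lines of `load_words(sqlite3.connect("words.db"), "your_file.txt")`. With a MySQL driver the overall shape should carry over, though the placeholder is usually `%s` rather than `?`, and the connection setup differs.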
