Source for Names to use in web scraping

Can anyone suggest a good source of names that I can use to help analyze some tables on web pages?

The first column of the tables I am scraping has names alone, names and titles, or just titles.

The names can be as varied as John Smith to Vikram Saksena.

I have been poking around for a compiled list of words that can be found in proper names.

Edit: I have tried the name set from the Census, and it has so much garbage in it that it's not worth working with.


Download the Febrl project source code.

Its data folder contains tables for names (given/middle/surnames/etc). You may have to massage the data for your own needs.

For surnames, you can check around for U.S. Census data. I don't have the link right now, but I know I've used the common U.S. surnames from that source before.
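Once you have name lists in hand, something along these lines might work for sorting the first-column cells. This is only a rough sketch: `load_names` and `classify_cell` are hypothetical helpers, and the file layout (one name per line, optionally followed by comma-separated columns such as a count) is an assumption, not the actual Febrl or Census format, so adjust the parsing to whatever the data really looks like.

```python
import re


def load_names(lines):
    """Build a lowercase set of names from an iterable of text lines.

    Assumed layout (hypothetical): one name per line, optionally followed
    by comma-separated columns; only the first column is kept.
    Blank lines and lines starting with '#' are skipped.
    """
    names = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.add(line.split(",")[0].strip().lower())
    return names


def classify_cell(text, given_names, surnames):
    """Label a table cell as 'name', 'name+title', 'title', or 'empty'."""
    # Split the cell into word-like tokens (letters, apostrophes, hyphens).
    tokens = re.findall(r"[A-Za-z][A-Za-z'\-]*", text)
    if not tokens:
        return "empty"
    known = given_names | surnames
    hits = [token.lower() in known for token in tokens]
    if all(hits):
        return "name"        # every token is a known name
    if any(hits):
        return "name+title"  # some tokens are names, the rest presumably a title
    return "title"           # no known name tokens at all


# In practice the lines would come from open(path); inline lists keep this runnable.
given = load_names(["john,100", "vikram", "# comment line"])
family = load_names(["smith", "saksena"])

for cell in ["John Smith", "Vikram Saksena, Chief Executive Officer", "Treasurer"]:
    print(cell, "->", classify_cell(cell, given, family))
```

Plain set lookup will misfire on words that are both names and title words (for example "Dean" or "Major"), and an unknown name gets labeled as a title, so a noisy list hurts here. Trimming the list to common names, as suggested above, probably helps more than anything clever in the matching.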
