Source for Names to use in web scraping
Can anyone suggest a good source of names that I can use to help analyze some tables on web pages?
The first column of the tables I am scraping has names alone, names and titles, or just titles.
The names can be as varied as John Smith to Vikram Saksena.
I have been poking around for a compiled list of words that can be found in proper names.
Edit: I have tried the name set from the Census, and it has so much garbage in it that it's not worth working with.
Download the Febrl project source code.
Its data folder contains tables for names (given/middle/surnames/etc). You may have to massage the data for your own needs.
For surnames, you can check around for U.S. Census data. I don't have the link right now, but I know I've used the common U.S. surnames from that source before.
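Once you have a name list loaded, here is a rough sketch of how you might use it to sort the first-column cells into names, names with titles, and titles. This is only an illustration: `KNOWN_NAME_WORDS` and `classify_cell` are made-up names, the four-word set is a stand-in for whatever you load from the Febrl tables or the Census files, and the all-or-nothing matching rule is an assumption you would probably need to tune.

```python
import re

# Hypothetical stand-in for a real name list; in practice, load the words
# from a source such as the Febrl data tables or the Census surname files.
KNOWN_NAME_WORDS = {"john", "smith", "vikram", "saksena"}


def classify_cell(text, name_words=KNOWN_NAME_WORDS):
    """Label a table cell as 'name', 'name+title', 'title', or 'empty'."""
    # Split the cell into lowercase alphabetic tokens (apostrophes allowed).
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    if not tokens:
        return "empty"
    # Count how many tokens appear in the name list.
    hits = sum(1 for token in tokens if token in name_words)
    if hits == 0:
        return "title"
    if hits == len(tokens):
        return "name"
    return "name+title"


for cell in ["John Smith", "Vikram Saksena, Chief Financial Officer", "Chief Executive Officer"]:
    print(cell, "->", classify_cell(cell))
```

Words that are both names and ordinary words (for example "Dean" or "Major") will confuse a simple lookup like this, so expect to clean the list or add a separate list of title words.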