Database of common name aliases / nicknames of people

I'm involved with a SQL / .NET project that searches through a list of names, and I'm looking for a way to return results for similar first names. For example, a search for "Tom" would also return Thom, Thomas, etc. It doesn't matter whether the data comes as a file or a web service.

Example design:

Table "Names" has Name and NameID
Table "Nicknames" has Nickname, NicknameID and NameID

Example output:

You searched for "John Smith"
Results shown: Jon Smith, Jonathan Smith, Johnny Smith, ...

Are there any databases (public or paid) suited to populating this kind of relationship between names and nicknames?


I'm adding another source for anyone who comes across this question via Google. This project provides a very good lookup for this purpose.

https://github.com/carltonnorthern/nickname-and-diminutive-names-lookup

It's somewhat simpler and less complete than pdNickName, but on the other hand it's free and easy to use.


A Google search for "Database of Nicknames" turned up pdNickName (a paid product).

In addition, I think you only need a single table for this job, not two, with the columns NameID, Name, and MasterNameID. Every name and nickname goes into the Name column. One name in each group is treated as the "canonical" one. All the nickname records use the MasterNameID column to point back to that record, and the canonical record points to itself.

Your two-table schema contains no additional information and, depending on how you fill in the nickname table, you might need extra code to handle the canonical cases.
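To make the single-table idea concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The sample rows and the similar_names helper are made up for illustration; only the three column names come from the description above.

```python
import sqlite3

# In-memory database with the single self-referencing table described above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Names (
        NameID       INTEGER PRIMARY KEY,
        Name         TEXT NOT NULL,
        MasterNameID INTEGER NOT NULL REFERENCES Names(NameID)
    )
""")

# Illustrative rows (assumed data): canonical records point to themselves.
rows = [
    (1, "Thomas", 1),
    (2, "Tom", 1),
    (3, "Thom", 1),
    (4, "Tommy", 1),
    (5, "John", 5),
    (6, "Johnny", 5),
]
conn.executemany("INSERT INTO Names VALUES (?, ?, ?)", rows)


def similar_names(conn, name):
    """Return every name that shares a canonical record with `name`."""
    sql = """
        SELECT other.Name
        FROM Names AS searched
        JOIN Names AS other
          ON other.MasterNameID = searched.MasterNameID
        WHERE searched.Name = ? COLLATE NOCASE
        ORDER BY other.NameID
    """
    return [row[0] for row in conn.execute(sql, (name,))]


print(similar_names(conn, "Tom"))  # ['Thomas', 'Tom', 'Thom', 'Tommy']
```

The self-join means the canonical name needs no special handling: searching for "Thomas" or for "Tom" goes through the same query.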


I just found this site.

It looks like you could script it pretty easily.

http://www.behindthename.com/php/extra.php?terms=steve&extra=r&gender=m

I just wish I could automatically narrow the results to English names.


Another commercial name-matching database is: http://www.basistech.com/name-indexer/

It looks quite professional (though potentially expensive).

They claim to support the following languages:
Arabic, Chinese (Simplified), Chinese (Traditional), Persian (Farsi / Dari), English, Japanese, Korean, Pashto, Russian, Urdu


Here is a GitHub repo with a CSV of related names, and you can contribute back:

The first few lines show the format:

aaron,ron
abel,abe
abednego,bedney
abijah,ab,bige
abigail,ab,abbie,abby,gail
abner,ab,abbie,abby
abraham,abe,abram,bram
absalom,ab,abbie,app
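A file in this format is easy to turn into a lookup. The following is only a sketch: it assumes each row is a name followed by its nicknames (as in the lines above), and the function names build_indexes and expand are invented for the example. It builds both a name-to-nickname and a nickname-to-name index, then expands a search term into all related names.

```python
import csv
from collections import defaultdict

# The sample rows shown above; a real run would read the full CSV file.
SAMPLE = """\
aaron,ron
abel,abe
abednego,bedney
abijah,ab,bige
abigail,ab,abbie,abby,gail
abner,ab,abbie,abby
abraham,abe,abram,bram
absalom,ab,abbie,app
"""


def build_indexes(lines):
    """Build name -> nicknames and nickname -> names indexes."""
    name_to_nicks = {}
    nick_to_names = defaultdict(set)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        name, nicks = row[0], row[1:]
        name_to_nicks.setdefault(name, set()).update(nicks)
        for nick in nicks:
            nick_to_names[nick].add(name)
    return name_to_nicks, dict(nick_to_names)


def expand(term, name_to_nicks, nick_to_names):
    """Return the term plus every name and nickname related to it."""
    term = term.lower()
    canonical = set(nick_to_names.get(term, ()))
    if term in name_to_nicks:
        canonical.add(term)
    results = {term}
    for name in canonical:
        results.add(name)
        results |= name_to_nicks[name]
    return sorted(results)


name_to_nicks, nick_to_names = build_indexes(SAMPLE.splitlines())
print(expand("abe", name_to_nicks, nick_to_names))
# ['abe', 'abel', 'abraham', 'abram', 'bram']
```

Note that a short nickname such as "ab" maps to several unrelated names, so expanding it returns a wide set; whether that is acceptable depends on how fuzzy you want the search to be.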


There is a database called pdNickName (found at http://www.peacockdata2.com/products/pdnickname/). It contains everything you need, at a cost of $500.


Similar format to Stan James's CSV, but folded two ways for lookups:

Name to nickname: https://github.com/MrCsabaToth/SOEMPI/blob/master/openempi/conf/name_to_nick.csv
Nickname to name: https://github.com/MrCsabaToth/SOEMPI/blob/master/openempi/conf/nick_to_name.csv


To select similar-sounding names, use SOUNDEX (see the MSDN documentation):

SELECT SOUNDEX ('Tom')
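To see what this does and doesn't catch, here is a rough Python sketch of the classic American Soundex algorithm. It is not SQL Server's implementation, and the two may disagree in edge cases, so treat the codes as illustrative.

```python
# Letter-to-digit table for classic American Soundex.
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character Soundex code for an ASCII name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


for n in ("Tom", "Thom", "Thomas", "John", "Jon", "Jonathan"):
    print(n, soundex(n))
# Tom T500, Thom T500, Thomas T520, John J500, Jon J500, Jonathan J535
```

So a phonetic code groups spelling variants such as Tom/Thom and John/Jon, but under this sketch Thomas and Jonathan get different codes from Tom and John. It complements a nickname table rather than replacing it.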


This is a good choice: https://github.com/onyxrev/common_nickname_csv

id, name, nickname
1, Aaron, Erin
2, Aaron, Ron
3, Aaron, Ronnie
4, Abel, Ab
5, Abel, Abe
6, Abel, Eb
7, Abel, Ebbie
8, Abiel, Ab
9, Abigail, Abby
10, Abigail, Gail
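Since this file already has one nickname per row, it maps fairly directly onto the Names / Nicknames tables from the question. A small sketch, assuming the file really has the header and comma-plus-space layout shown above (the to_tables helper and the generated NameID values are made up for illustration):

```python
import csv
import io

# The sample rows shown above; a real run would read the full CSV file.
SAMPLE = """\
id, name, nickname
1, Aaron, Erin
2, Aaron, Ron
3, Aaron, Ronnie
4, Abel, Ab
5, Abel, Abe
6, Abel, Eb
7, Abel, Ebbie
8, Abiel, Ab
9, Abigail, Abby
10, Abigail, Gail
"""


def to_tables(text):
    """Split the flat CSV into (NameID, Name) and (NicknameID, Nickname, NameID) rows."""
    # skipinitialspace drops the space that follows each comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    name_ids = {}
    nicknames = []
    for row in reader:
        # Assign the next NameID the first time a name is seen.
        name_id = name_ids.setdefault(row["name"], len(name_ids) + 1)
        nicknames.append((int(row["id"]), row["nickname"], name_id))
    names = [(name_id, name) for name, name_id in name_ids.items()]
    return names, nicknames


names, nicknames = to_tables(SAMPLE)
print(names)         # [(1, 'Aaron'), (2, 'Abel'), (3, 'Abiel'), (4, 'Abigail')]
print(nicknames[3])  # (4, 'Ab', 2)
```

The resulting tuples can be bulk-inserted into the two tables with whatever data access layer the project uses.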