Intern Problem Statement for a bank
I saw an intern opportunity in a bank in dubai. They have a defined problem statement to be solved in 2 months. They told us just 2 lines -
"Basically the problem is about name matching logic开发者_Go百科. There are two fields (variables) – both are employer names, and it’s a free text field. So we need to write a program to match these two variables."
Can anyone help me in understanding it? Is it just a simple pattern matching stuff? Any help/comments would be appreciated.
I think this is what they are asking for:
They have two sources of related data, for example, one from an internal database, and the other from name card input.
Because the two fields are free text fields, there will be inconsistency. For example, Nitin Garg
, or Garg, Nitin
, or Mr. Nitin Garg
, etc. Here is an extreme case of Gadaffi.
What you are supposed to do is to find a way to match all the names for a specific person together.
In short, match two pieces of data together by employer names, taking possible inconsistency into account.
Once upon a time there was a nice simple answer to the problem of matching up names despite mis-spellings and different transliterations - Soundex. But people have put a lot of work into this problem, so now you should probably use the results of that work, which is built into databases and add-ons - some free. See Fuzzy matching using T-SQL and http://anastasiosyal.com/archive/2009/01/11/18.aspx and http://msdn.microsoft.com/en-us/magazine/cc163731.aspx
精彩评论