
Consolidating / Clustering Terms and Phrases

Our application allows users to enter the names of companies that their organization works with. A current issue is that the way a company name is entered varies from user to user. We need to consolidate this data. Are there any proven approaches for tackling this problem?


Fixing data quality problems like this is generally referred to as data cleansing. There are many methods and tools in this area.

Which one is best for you will depend on the extent of your problem and on the technologies you use. But if I understand correctly, the stored data is fine, and the problem is that users enter misspelled names when searching against it? In that case, fuzzy searching could help.
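To make the fuzzy-matching idea concrete, here is a minimal sketch in Python using only the standard library's difflib. It is one possible approach under stated assumptions, not a definitive implementation: the suffix list, the function names (normalize, cluster_names, suggest) and the similarity thresholds are all made up for illustration and would need tuning against your real data.

```python
import difflib
import re

# Assumption: a small, hypothetical set of legal-form suffixes to ignore.
SUFFIXES = {"inc", "incorporated", "corp", "corporation",
            "co", "company", "ltd", "limited", "llc"}


def normalize(name):
    """Lowercase, strip punctuation, and drop common legal-form suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    kept = [t for t in tokens if t not in SUFFIXES]
    # Fall back to all tokens if the name consisted only of suffixes.
    return " ".join(kept or tokens)


def cluster_names(names, threshold=0.85):
    """Greedy one-pass clustering: a name joins the first cluster whose
    representative (the first member's normalized form) is similar enough."""
    clusters = []  # list of (representative_key, [original names])
    for name in names:
        key = normalize(name)
        for rep, members in clusters:
            if difflib.SequenceMatcher(None, key, rep).ratio() >= threshold:
                members.append(name)
                break
        else:
            clusters.append((key, [name]))
    return [members for _, members in clusters]


def suggest(query, canonical_names, cutoff=0.8):
    """Fuzzy search: return stored names whose normalized form is close
    to the normalized user input."""
    by_key = {normalize(n): n for n in canonical_names}
    hits = difflib.get_close_matches(normalize(query), list(by_key),
                                     n=3, cutoff=cutoff)
    return [by_key[h] for h in hits]
```

With this sketch, cluster_names can be run once over existing entries to propose groups for manual review, while suggest can be called at input time to offer an existing company name before a new variant is stored. The greedy pass compares each name against every cluster, so it is only practical for modest list sizes.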
