Find "Similar" Part Numbers
I hope you'll find this interesting! C# is the preferred language for solutions, or T-SQL.
Consider the following items:
Item 1: NJ2-12GM50-Wö-V13
Item 2: NJ2-12GM50-Wo-V13
You can guess that the individual entering Item 1 copied and pasted, and the individual entering Item 2 just used 'o', the closest english character he could find.
When somebody enters a new part number like NJ2-12GM50-Wo-V13, we want to suggest to them that they might mean NJ2-12GM开发者_高级运维50-Wö-V13.
In order to do so, we want to convert all characters that can be stored as a VARCHAR back to their simplist lower case letter. I'm interested in a solution that converts, for example, ASCII character 246 (ö) to ASCII character 111 (o).
The idea being that if we make a lookup column containing all the converted versions of the partnumbers, we can easily match them to keyboard entry.
Of course, I can make a Dictionary to do it, but I wonder if there is a smarter way.
Thanks!
Probably your best bet is to implement a function in either C# or T-SQL to calculate the Levenshtein distance between the two part numbers.
Use the COLLATE clause to coerce the strings to both case and accent insensitive
IF 'NJ2-12GM50-Wö-V13' COLLATE Latin1_General_CI_AI
=
'NJ2-12GM50-Wo-V13' COLLATE Latin1_General_CI_AI
PRINT 'matches'
ELSE
PRINT 'no match
So, you can use it something like to validate user input
SELECT
PartNo AS DidYouMeanThis,
@Input AS WhenYouEnteredThis
FROM
MyPartNumbers
WHERE
PartNo COLLATE Latin1_General_CI_AI = @Input COLLATE Latin1_General_CI_AI
using a different COLLATE you can then ensure an exact match on write...
IF NOT EXISTS (SLECT * FROM MyPartNumbers
WHERE PartNo COLLATE Latin1_General_BIN = @Input Latin1_General_BIN)
RAISERROR ('Oi! I asked DidYouMeanThis', 16, 1)
ELSE
INSERT ...
I can't add this as a comment for some reason, but you may consider, instead (or as well as), a matching of "commonly searched for" mappings. Such a system would probably be useful in general (i.e. "red hanger -> AB-999X", etc) and may solve this problem for you. Worth considering, anyway.
精彩评论