Matching multiple variations of input to one sql row

After much searching, I would like to know how to match different variations of input to one SQL row using standard TSQL. Here is the scenario:

I have in my sql row the following text: I love

I then have the following 3 inputs all of which should return a match to this row:

"I want to tell you we all love StackOverflow"

"I'm totally in love with StackOverflow"

"I really love StackOverflow"

In each input the reason for the match is the words I and love, in that order, with any amount of text between them. The I in I'm is deliberately matched too, so it would be good if we could include that in matches.

I thought about splitting the input string, which I did using the following TSQL:

-- Create a space delimited string for testing
declare @str varchar(max)
select @str = 'I want to tell you we all love StackOverflow'
-- XML tag the string by replacing spaces with </x><x> tags
declare @xml xml
select @xml = cast('<x><![CDATA['+ replace(@str,' ',']]></x><x><![CDATA[') + ']]></x>' as xml)
-- Finally select values from nodes <x> and trim at the same time
select ltrim(rtrim(mynode.value('.[1]', 'nvarchar(100)'))) as Code -- nvarchar(12) would truncate the 13-character 'StackOverflow'
from (select @xml doc) xx
cross apply doc.nodes('/x') (mynode)

This gets me all the words as separate rows, but I could not work out how to write the query that matches them against the table.

Any help from this point, or any alternative way of matching as required, would be greatly appreciated!

UPDATE:

@freefaller pointed me to the RegEx route (+1, @freefaller), and by creating a function I have been able to get a bit further. However, I now need to know how to get it to look at all of my table rows rather than the hard-coded input of 'I love'. I now have the following select statements:

SELECT * FROM dbo.FindWordsInContext('i love','I want to tell you we all love StackOverflow',30)
SELECT * FROM dbo.FindWordsInContext('i love','I''m totally in love with StackOverflow',30)
SELECT * FROM dbo.FindWordsInContext('i love','I really love StackOverflow',30)

The above returns me the number of times matched and the context of the string matched, therefore the first select above returns:

Hits    Context
1       ...I want to tell you we all love StackOv...

Given the above, can anyone tell me how to make this function check every row in my table and return the row or rows that match?


One option would be to use Regular Expressions via SQLCLR objects as explained here.

I have never created SQLCLR objects myself, so I cannot comment on the ease of this method. I am, however, a great fan of Regular Expressions and would recommend their use for most text search and manipulation.

Edit: In response to the comment, I have no experience of SQLCLR, but assuming you get that working, something like the following simple untested TSQL might work...

SELECT *
FROM mytable
WHERE dbo.RegexMatch(@search, REPLACE(myfield, ' ', '.*?')) = 1
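To see what that WHERE clause does to each row, here is a minimal Python sketch of the same logic. It assumes dbo.RegexMatch behaves like an unanchored regex search (as .NET's Regex.IsMatch does). The names field_to_pattern and matching_rows, and the sample rows, are hypothetical and only for illustration.

```python
import re

def field_to_pattern(field: str) -> str:
    # Mirrors REPLACE(myfield, ' ', '.*?'): every space becomes a lazy wildcard,
    # so 'I love' turns into the pattern 'I.*?love'.
    return field.replace(" ", ".*?")

def matching_rows(rows, search, ignore_case=False):
    # Mirrors the SELECT ... WHERE: keep each row whose derived pattern is
    # found somewhere in the search text (assumed RegexMatch semantics).
    flags = re.IGNORECASE if ignore_case else 0
    return [row for row in rows
            if re.search(field_to_pattern(row), search, flags)]

# Hypothetical table contents; only 'I love' comes from the question.
rows = ["I love", "I hate", "we adore"]

inputs = [
    "I want to tell you we all love StackOverflow",
    "I'm totally in love with StackOverflow",
    "I really love StackOverflow",
]

for search in inputs:
    print(matching_rows(rows, search))
```

Each of the three inputs returns only the 'I love' row. Note that the pattern works on characters rather than whole words, so with case-insensitive matching a text such as "this glove" would also match 'I love'.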


I have managed to come up with an answer to my own question, so I thought I would post it here in case anyone else has similar requirements in the future. Basically it relies upon the SQL-CLR regular expression functionality and runs with minimal impact on performance.

First, enable SQL-CLR on your server if it is not already enabled (you need to be sysadmin):

--Enables CLR Integration
exec sp_configure 'clr enabled', 1
GO
RECONFIGURE
GO

Then you will need to create the assembly in SQL Server. Don't forget to change the path D:\SqlRegEx.dll to the location of your own DLL, and use the SAFE permission set, as this is the most restrictive and safest set of permissions (I won't go into detail on that here):

CREATE ASSEMBLY [SqlRegEx] FROM 'D:\SqlRegEx.dll' WITH PERMISSION_SET = SAFE

Now create the actual function you will call:

CREATE FUNCTION [dbo].[RegexMatch]
(@Input NVARCHAR(MAX), @Pattern NVARCHAR(MAX), @IgnoreCase BIT)
RETURNS BIT
AS EXTERNAL NAME SqlRegEx.[SqlClrTools.SqlRegEx].RegExMatch

Finally, to answer my own question, we can then run the following TSQL:

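-- Match rows whose words appear, in order, anywhere in @search (case-insensitive)
SELECT *
FROM your_table
WHERE dbo.RegexMatch(@search, REPLACE(your_field, ' ', '.*?'), 1) = 1

-- Same match using the reversed field. Note that REVERSE reverses characters,
-- not words, so 'I love' yields the pattern 'evol.*?I'
SELECT *
FROM your_table
WHERE dbo.RegexMatch(@search, REPLACE(REVERSE(your_field), ' ', '.*?'), 1) = 1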

I hope this will help someone in what should be a simple search option in the future.
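Two details of the pattern building may be worth checking before relying on it. The Python sketch below is illustrative only: phrase_to_pattern is a hypothetical helper, and reading the REVERSE query as "match the words in the opposite order" is my assumption, not something the answer states. It shows that character reversal and word reversal produce different patterns, and that escaping each word keeps regex metacharacters in the stored text from being interpreted.

```python
import re

def phrase_to_pattern(phrase: str, reverse_words: bool = False) -> str:
    # Hypothetical helper: escape each word so characters such as '+' or '.'
    # are matched literally, then join the words with a lazy wildcard.
    words = phrase.split()
    if reverse_words:
        words = words[::-1]  # reverse the word order, not the characters
    return ".*?".join(re.escape(word) for word in words)

# Character reversal, which is what T-SQL REVERSE does to the stored text:
print("I love"[::-1])                                   # evol I

# Word-order reversal, if matching "love ... I" is the goal (an assumption):
print(phrase_to_pattern("I love", reverse_words=True))  # love.*?I

# Escaping keeps metacharacters in the stored phrase literal:
print(phrase_to_pattern("C++ rocks"))                   # C\+\+.*?rocks
```

In .NET the equivalent of re.escape is Regex.Escape, so the escaping could be done inside the CLR function if it is needed.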
