开发者

Extract substring from string in SQL

I need to extract a text that is surrounded by ***[some te开发者_运维百科xt] strings, like in the following example:

some text
some text
***[some text]
THIS SHOULD BE EXTRACTED
***[some text]
some text
some text
some text
some text
some text
***[some text]
THIS SHOULD BE EXTRACTED TOO
***[some text]
some text

the output should be:

THIS SHOULD BE EXTRACTED
THIS SHOULD BE EXTRACTED TOO

I tried PATINDEX like here, but couln't find the way to extract the string.

PATINDEX('%[*][*][*][[]%]%%[*][*][*][[]%]%',@Text)

I am looking forward to hearing any suggestions.


For the somewhat easier case raised in the comments you could do

;WITH T(C) AS
(
 SELECT '
    some text
    some text
    ***[some text 1]
    THIS SHOULD BE EXTRACTED
    ***[some text 2]
    some text
    some text
    some text
    some text
    some text
    ***[some text 1]
    THIS SHOULD BE EXTRACTED TOO
    ***[some text 2]
    some text'
)
SELECT col.value('.','varchar(max)')
FROM T
CROSS APPLY (SELECT CAST('<a keep="false">' + 
                        REPLACE(
                            REPLACE(C,'***[some text 2]','</a><a keep="false">'),
                        '***[some text 1]','</a><a keep="true">') + 
                    '</a>' AS xml) as xcol) x
CROSS APPLY xcol.nodes('/a[@keep="true"]') tab(col)


Not a regex solution and I'm still a SQL novice so may not be optimal but you should be able to parse with a WHILE loop using

CHARINDEX for the *** then using that as a starting point to
CHARINDEX to the LF Use that as the starting point for a
SUBSTRING with the ending point being a CHARINDEX of the next ***
concatenate the Substring to your output, move past the ending *** and loop to find the next one.

I'll play with it some and see if I can add an example.
EDIT - This probably needs more error checking

declare @inText nvarchar(2000) = 'some text 
some text 
***[some text] 
THIS SHOULD BE EXTRACTED 
***[some text] 
some text 
some text 
some text 
some text 
some text 
***[some text] 
THIS SHOULD BE EXTRACTED TOO 
***[some text] 
some text '

declare @delim1 nvarchar(50) = '***'
declare @delim2 char = char(10)
declare @output nvarchar(1000) = ''
declare @position int
declare @positionEnd int

set @position = CHARINDEX(@delim1,@inText)
while (@position != 0 and @position is not null)
BEGIN
  set @position = CHARINDEX(@delim2,@inText,@position)
  set @positionEnd = CHARINDEX(@delim1,@inText,@position)
  set @output = @output + SUBSTRING(@inText,@position,@positionEnd-@position)
  set @position = CHARINDEX(@delim1,@inText,@positionEnd+LEN(@delim1))
END
select @output


You can find this in my blog: http://sql-tricks.blogspot.com/2011/04/extract-strings-with-delimiters.html It is pure solution with no additional modification, only delimiters sequences should be decalred.


I may be wrong but I don't think there's a clean way to do this directly in SQL. I would use a CLR stored procedure and use regular expressions from C# or your .NET language of choice.

See this article (or this article) for a relevant example using regexes.


I believe you can use the xp_regex_match as described in http://www.codeproject.com/KB/mcpp/xpregex.aspx?q=use+sql+function+to+parse+text to parse your nvarchar field. I wrote something similar quite a while back.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜