Replace with wildcard, in SQL

2022-12-28 11:55 问答作者：

I know MS T-SQL does not support regular expression, but I need similar functionality. Here's what I'm trying to do:

I have a varchar table field which stores a breadcrumb, like this:

/ID1:Category1/ID2:Category2开发者_如何转开发/ID3:Category3/

Each Category name is preceded by its Category ID, separated by a colon. I'd like to select and display these breadcrumbs but I want to remove the Category IDs and colons, like this:

/Category1/Category2/Category3/

Everything between the leading slash (/) up to and including the colon (:) should be stripped out.

I don't have the option of extracting the data, manipulating it externally, and re-inserting back into the table; so I'm trying to accomplish this in a SELECT statement.

I also can't resort to using a cursor to loop through each row and clean each field with a nested loop, due to the number of rows returned in the SELECT.

Can this be done?

Thanks all - Jay

I think your best bet is going to be to use a recursive user-defined function (UDF). I've included some code here that you can use to pass in a string to achieve the results you're looking for.

CREATE FUNCTION ufn_StripIDsFromBreadcrumb (@cIndex int, @breadcrumb varchar(max), @theString varchar(max))

RETURNS varchar(max)

AS

BEGIN
DECLARE @nextColon int
DECLARE @nextSlash int

SET @nextColon = CHARINDEX(':', @theString, @cIndex)
SET @nextSlash = CHARINDEX('/', @theString, @nextColon)
SET @breadcrumb = @breadcrumb + SUBSTRING(@theString, @nextColon + 1, @nextSlash - @nextColon)

IF @nextSlash != LEN(@theString)

     BEGIN
     exec @breadcrumb = ufn_StripIDsFromBreadcrumb @cIndex =  @nextSlash, @breadcrumb = @breadcrumb, @theString = @theString
     END
RETURN @breadcrumb
END

You could then execute it with:

DECLARE @myString varchar(max)
EXEC @myString = ufn_StripIDsFromBreadcrumb 1, '/', '/ID1:Category1/ID2:Category2/ID3:Category3/'
PRINT @myString

This works for SQL Server 2005 and up.

create table strings (
  string varchar(1000)
)

insert into strings values( '/ID1:Category1/ID2:Category2/ID3:Category3/' )  
insert into strings values( '/ID4:Category4/ID5:Category5/ID8:Category6/' )  
insert into strings values( '/ID7:Category7/ID8:Category8/ID9:Category9/' )  
go

with  
replace_with_wildcard ( restrung ) as 
( 
  select replace( string, '', '' ) 
  from strings

  union all 

  select  
    replace( restrung, substring( restrung, patindex( '%ID%', restrung ), 4 ), '' ) 
  from replace_with_wildcard 
  where patindex( '%ID%', restrung ) > 0 
) 

select restrung
from replace_with_wildcard 
where charindex( ':', restrung ) = 0
order by restrung

drop table strings

You might be able to do this using a Split function. The following split function relies on the existence of a Numbers table which literally contains a sequential list of numbers like so:

Create Table dbo.Numbers( Value int not null primary key clustered )
GO
With Nums As
    (
    Select ROW_NUMBER() OVER( Order By o.object_id ) As Num
    From sys.objects as o
        cross join sys.objects as o2
    )
Insert dbo.Numbers( Value )
Select Num
From Nums
Where Num Between 1 And 10000
GO  


Create Function [dbo].[udf_Split] (@DelimitedList nvarchar(max), @Delimiter nvarchar(2) = ',')
Returns @SplitResults TABLE (Position int NOT NULL PRIMARY KEY, Value nvarchar(max))
AS
/*
PURPOSE: to split the @DelimitedList based on the @Delimter
DESIGN NOTES:
    1. In general the contents of the next item is: NextDelimiterPosition - CurrentStartPosition
    2. CurrentStartPosition = 
        CharIndex(@Delimiter, A.list, N.Value)  = Current Delimiter position
        + Len(@Delimiter)                       + The number of delimiter characters 
        + 1                                     + 1 since the text of the item starts after the delimiter
    3. We need to calculate the delimiter length because the LEN function excludes trailing spaces. Thus
        if a delimiter of ", " (a comma followed by a space) is used, the LEN function will return 1.
    4. The DataLength function returns the number of bytes in the string. However, since we're using
        an nvarchar for the delimiter, the number of bytes will double the number of characters.
*/
Begin
    Declare @DelimiterLength int
    Set @DelimiterLength = DataLength(@Delimiter) / 2

    If Left(@DelimitedList, @DelimiterLength) <> @Delimiter
        Set @DelimitedList = @Delimiter + @DelimitedList

    If Right(@DelimitedList, @DelimiterLength) <> @Delimiter
        Set @DelimitedList = @DelimitedList + @Delimiter

    Insert @SplitResults(Position, Value)
    Select CharIndex(@Delimiter, A.list, N.Value) + @DelimiterLength            
        , Substring (
                    A.List
                    , CharIndex(@Delimiter, A.list, N.Value) + @DelimiterLength         
                    , CharIndex(@Delimiter, A.list, N.Value + 1)                            
                        - ( CharIndex(@Delimiter, A.list, N.Value) + @DelimiterLength ) 
                    )
    From dbo.Numbers As N
        Cross Join (Select @DelimitedList As list) As A
    Where N.Value > 0
        And N.Value < LEN(A.list)
        And Substring(A.list, N.Value, @DelimiterLength) = @Delimiter
    Order By N.Value

    Return
End

You then might be able to run a query like so where you strip out the prefixes:

Select Table, Substring(S.Value, CharIndex(':', S.Value) + 1, Len(S.Value))
From Table
    Cross Apply dbo.udf_Split(Table.ListColumn, '/') As S

This would give you values like:

Category1
Category2
Category3

You could then use FOR XML PATH to combine them again:

Select Table.PK
    ,   Stuff(  (
                Select '/' + Substring(S.Value, CharIndex(':', S.Value) + 1, Len(S.Value))
                From Table As Table1
                    Cross Apply dbo.udf_Split(Table.ListColumn, '/') As S1
                Where Table1.PK = Table.PK
                Order By S1.Position
                For Xml Path('')
                ), 1, 1, '') As BreadCrumb
From Table

For SQL Server 2005+, you can get regex support by:

Enabling CLR (doesn't require instance restart)
Uploading your CLR functionality (in this case, regex replace)

Using native TSQL, you'll need to define REPLACE statements for everything you want to remove:

SELECT REPLACE(
         REPLACE(
           REPLACE(''/ID1:Category1/ID2:Category2/ID3:Category3/'', 'ID1:', ''),
            'ID2:', ''), 
         'ID3:', '')

Regex or otherwise, you need to be sure these patterns don't appear in the actual data.

You can use SQL CLR. Here's an MSDN article:

declare @test1 nvarchar(max)
set @test1='/ID1:Category1/ID2:Category2/ID3:Category3/'
while(CHARINDEX('ID',@test1)<>0)
Begin
select @test1=REPLACE(@test1,SUBSTRING(@test1,CHARINDEX('ID',@test1),CHARINDEX(':',@test1)-
CHARINDEX('ID',@test1)+1),'') 
End
select @test1

继续阅读：sql sql-server string tsql

Replace with wildcard, in SQL

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？