
T-SQL - performing a join on a delimited column - performance and optimisation issue

I have the following (slightly simplified in the columns returned) query.

select Products.Product, Products.ID, Products.Customers
from Products
where Products.orderCompleteDate is null

This would return, as an example

productA  1  Bob
productA  1  Jane
productB  2  John,Dave

Note that Customers can be a comma delimited list. What I want to add, is a column 'Customer Locations', so the above becomes

productA  1  Bob        Ireland
productA  1  Jane       Wales
productB  2  John,Dave  Scotland,England

I created a function below, where fn_split returns a single row per delimited item.

create FUNCTION [dbo].[GetLocations] (@CustomerNames varchar(256))
RETURNS @TempLocations table (CustomerLocations varchar(256))
AS
begin
    declare @NameStr varchar(256)
    declare @temp table (singleLoc varchar(256))

    -- one location row per name in the delimited list
    insert into @temp
    select CustomerLocation.Location
    from CustomerLocation
    INNER JOIN Customers ON Customers.ID = CustomerLocation.ID
    INNER JOIN dbo.fn_Split(@CustomerNames, ',') split ON split.Item = Customers.Name

    -- concatenate the locations back into a single comma-delimited string
    SELECT @NameStr = COALESCE(@NameStr + ',', '') + singleLoc
    FROM @temp

    insert into @TempLocations values (@NameStr)
    return
end

And applied it to the original query as follows

select Products.Product, Products.ID, Products.Customers, Locations.CustomerLocations
from Products
OUTER APPLY dbo.GetLocations(Products.Customers) AS Locations
where Products.orderCompleteDate is null

However, this is extremely slow: the query takes ~10 seconds on a table of a mere 2,000 rows, while the initial query runs almost instantly. This suggests the query could not be optimised as a set operation and is being evaluated row by row. I stayed away from scalar-valued functions for exactly this reason and tried to stick to table-valued functions. Is there any glaring fault in my logic/code?
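For what it's worth, a multi-statement table-valued function like the one above is opaque to the optimizer, so it generally does run once per row. One common workaround is to express the whole thing as a single set-based query with a correlated subquery doing the string aggregation. A hedged sketch, assuming `fn_Split` returns a column named `Item` and the `Customers`/`CustomerLocation` tables are as in the function above (on SQL Server 2017+ you could use `STRING_AGG` instead of the `FOR XML PATH` trick):

```sql
-- Sketch only: assumes fn_Split(...) returns (Item), Customers(ID, Name),
-- and CustomerLocation(ID, Location) as implied by the question.
select p.Product, p.ID, p.Customers,
       stuff((select ',' + cl.Location
              from dbo.fn_Split(p.Customers, ',') split
              join Customers c         on c.Name = split.Item
              join CustomerLocation cl on cl.ID  = c.ID
              for xml path('')), 1, 1, '') as CustomerLocations
from Products p
where p.orderCompleteDate is null
```

Because the aggregation is inlined into one statement, the optimizer can build a single plan for the whole query instead of invoking a black-box function per row.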


I'd normally suggest creating a view, based on the unnormalized table, that does the normalization, and then use that as the basis for any future queries. Unfortunately, I can't identify a PK for your current Products table, but you'd hopefully create this view using schemabinding, and hopefully be able to turn it into an indexed view (indexing on PK + customer name).

Querying this view (on Enterprise Edition, or with the NOEXPAND hint) should then give you performance comparable to querying a genuinely normalized table.
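A minimal sketch of such a normalizing view, assuming the `fn_Split` function and column names from the question. One caveat worth hedging on: SQL Server does not allow table-valued functions or APPLY inside an indexed view, so this particular definition would have to stay a plain view unless the split results are first materialized into a real table:

```sql
-- Hypothetical view name; CROSS APPLY + a TVF rules out indexing this view,
-- so it is a convenience layer rather than a materialized one.
create view dbo.vProductCustomers
as
select p.ID as ProductID, p.Product, split.Item as CustomerName
from Products p
cross apply dbo.fn_Split(p.Customers, ',') split
```

Future queries can then join `vProductCustomers` to `Customers` and `CustomerLocation` as if the one-row-per-customer table existed.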


One option would be to create a second table that normalises the Products table, kept in sync by triggers that call the split function whenever rows are inserted or updated.

Pros: you get standard performance and straightforward SQL queries.

Cons: the two tables can drift out of sync if anything goes wrong (you can always schedule a job to rebuild the normalized table from scratch periodically).

Obviously the best answer would be to redesign the Products table, but I assume that isn't possible, given that you're resorting to split functions etc.
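The trigger approach above could look something like this. All object names here are hypothetical (the side table `ProductCustomers` and the trigger are inventions for illustration), and it assumes the same `fn_Split` and `Products` columns as the question:

```sql
-- Hypothetical normalized side table: one row per (product, customer name).
create table dbo.ProductCustomers (
    ProductID    int          not null,
    CustomerName varchar(256) not null
)
go
-- Re-split the delimited list whenever Products rows are inserted or updated.
create trigger dbo.trg_Products_SyncCustomers
on dbo.Products
after insert, update
as
begin
    set nocount on

    -- drop the stale rows for any product touched by this statement
    delete pc
    from dbo.ProductCustomers pc
    join inserted i on i.ID = pc.ProductID

    -- re-insert one row per delimited customer name
    insert into dbo.ProductCustomers (ProductID, CustomerName)
    select i.ID, split.Item
    from inserted i
    cross apply dbo.fn_Split(i.Customers, ',') split
end
```

A matching AFTER DELETE trigger (or a foreign key with cascade from a real parent table) would be needed to remove rows when products are deleted, and the periodic rebuild job mentioned above guards against any drift.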

