How can I optimize/refactor a TSQL "LIKE" clause?
I have a table with 117,000 or so records. I need to perform a search that checks 3 separate fields for a given string pattern.
My where clause is as follows:
field1 LIKE '%' + @DESC + '%'
OR field2 LIKE '%' + @DESC + '%'
OR field3 LIKE '%' + @DESC + '%'
This seems to take about 24 seconds regardless of input...
Is there a better way to do this? Less than 10 (or 5!) seconds would be much more preferable.
Thanks for any help.
Use Full Text Search and CONTAINS. LIKE cannot be optimized when searching in the middle of the field, i.e. when the LIKE expression starts with '%', so it will always do a full table scan.
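A minimal sketch of what that looks like; the table, key index, and catalog names here are assumptions for illustration, not from the question:
-- Assumed names: dbo.Items, unique key index PK_Items, catalog ftCatalog.
CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;
CREATE FULLTEXT INDEX ON dbo.Items (field1, field2, field3)
    KEY INDEX PK_Items
    WITH CHANGE_TRACKING AUTO;
-- CONTAINS can then use the full-text index instead of scanning every row.
-- Note: @DESC is interpreted as a full-text search condition (a word or "phrase").
SELECT *
FROM dbo.Items
WHERE CONTAINS((field1, field2, field3), @DESC);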
Anytime you start a LIKE search with a wildcard, you're doing a scan. Unless you can narrow your search criteria to include the first character (which may not be feasible), you'll need to resort to Full Text Search.
Do you really need to start with a wildcard? Why? Often you can force users to type in at least the first character. I bring this up because some developers use a leading wildcard out of habit, not because there is a requirement. In most cases users will be able to type the first character, unless the field stores long strings (say, official airport names). Otherwise you really need to use full-text indexing, although KM's trick with REVERSE (sketched below) is pretty cool if you don't need the wildcard at the end.
If you can avoid doing the performance killing things, then do so.
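For completeness, here is a sketch of that REVERSE trick; the table and column names are assumptions, and it only helps when the pattern is anchored at the end of the value ('%' + @DESC with no trailing wildcard):
-- Assumed names: dbo.Items, column field1.
ALTER TABLE dbo.Items
    ADD field1_rev AS REVERSE(field1) PERSISTED;
CREATE NONCLUSTERED INDEX IX_Items_field1_rev ON dbo.Items (field1_rev);
-- field1 LIKE '%' + @DESC (suffix search) becomes a seekable prefix search:
SELECT *
FROM dbo.Items
WHERE field1_rev LIKE REVERSE(@DESC) + '%';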
Whilst I agree with the accepted answer that Full Text Indexing would be the best solution, and am by no means advocating the use of leading wildcard searches, if they have to be performed then there are steps you can take to make their performance less bad.
Kalen Delaney in the book "Microsoft SQL Server 2008 Internals" says:
Collation can make a huge difference when SQL Server has to look at almost all characters in the strings. For instance, look at the following:
SELECT COUNT(*) FROM tbl WHERE longcol LIKE '%abc%'
This may execute 10 times faster or more with a binary collation than a nonbinary Windows collation. And with varchar data, this executes up to seven or eight times faster with a SQL collation than with a Windows collation.
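To apply that observation without changing the column's stored collation, you can force a binary collation in the predicate itself. A sketch using the names from the quoted example; note that a binary collation makes the comparison case-sensitive:
SELECT COUNT(*)
FROM tbl
WHERE longcol COLLATE Latin1_General_BIN2 LIKE '%abc%';
-- Binary collations are case-sensitive, so normalize case (e.g. UPPER on both
-- the column expression and the pattern) if you need case-insensitive matching.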
I tried one possible solution. Before this change the query was not returning results at all; it caused connection timeout errors.
My query had a date filter plus several other criteria, all of them LIKE searches. One keyword column was searched with LIKE '%abc%' on an ntext column, which forced a full table scan.
Solution:
Divide the query into 2 parts: 1) put the date-filtered base result set in a CTE (Common Table Expression), then 2) apply all the LIKE search criteria to the CTE.
WITH SearchData (Column1, Column2, Column3, Column4, ........)
AS
(
    SELECT Column1, Column2, Column3, Column4, ...........
    FROM MyTable1 WITH (NOLOCK)
    INNER JOIN MyTable2 WITH (NOLOCK)
        ON MyTable1.Id = MyTable2.Id
    WHERE MyTable1.CreationTime >= '2014-04-27'
      AND MyTable1.CreationTime <= '2014-05-01'
)
SELECT DISTINCT TOP 250 Column1, Column2, Column3, Column4
FROM SearchData
WHERE (ISNULL(Column1, '') LIKE @Column1 + '%' OR @Column1 IS NULL)
  AND (Column2 LIKE @Column2 + '%' OR @Column2 IS NULL)
  ...
  ...
  ...
  ...
  AND (Column10 LIKE '%' + @Column10 + '%' OR @Column10 IS NULL)
  AND @Column1 + @Column2 + @Column3 + ........ @Column10 <> ''
ORDER BY [CreationTime] DESC
It worked for me.
How about
field1 + field2 + field3 LIKE '%' + @DESC + '%'
or
CONTAINS((field1, field2, field3), @DESC)
(CONTAINS takes a column or a column list, not an expression, and requires a full-text index on those columns.)
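One caveat with the concatenation approach: if any field is NULL the whole expression is NULL, and the pattern can also match across the boundary between two fields. A hedged variant of that WHERE clause which handles both:
-- Wrap each field in ISNULL and add a separator that should never appear in @DESC.
WHERE ISNULL(field1, '') + '|' + ISNULL(field2, '') + '|' + ISNULL(field3, '')
      LIKE '%' + @DESC + '%'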
If you can't use Full-Text Search, you can still make this roughly 10 times faster. Do the following:
1. Add a persisted computed column:
alter table TableName
add CalculatedColumnName as upper(Column1 + '|' + Column2...) collate Latin1_General_100_Bin2
persisted;
2. Add an index on the computed column:
create nonclustered index IDX_TableName_CalculatedColumnName
on TableName(CalculatedColumnName);
3. Change your query to search the computed column:
select count(*)
from TableName
where CalculatedColumnName like '%' + upper(@ParameterValue) + '%' collate Latin1_General_100_Bin2
Source: http://aboutsqlserver.com/2015/01/20/optimizing-substring-search-performance-in-sql-server