SQL Server 2008 + expensive union all
We have 5 tables over which we should query with user search input throughout a stored procedure. We do a union all of the similar data inside a view.开发者_如何学运维 Because of this the view can not be materialized. We are not able to change these 5 tables drastically (like creating a 6th table that contains the similar data of the 5 tables and reference that new one from the 5 tables).
The query is rather expensive / slow
What are our other options? It's allowed to think outside the box. Unfortunately I cannot give more information like the table/view/SP definition because of customer confidentiality...
UNION ALL
is literally one of the cheapest operations that exists in SQL Server. It simply takes the result sets from each candidate query and concatenates the results. UNION
is different, it needs to perform a DISTINCT
sort, but if you are using UNION ALL
then I can guarantee you that the concatenation itself is not the problem.
There are a few reasons why your query might be performing poorly:
The underlying queries being concatenated are inefficient, due to poor indexing or other factors. Post the execution plan.
You're applying the
UNION ALL
before the predicates, i.e. assuming thatUNION ALL
obeys the distributive law (it doesn't). This turns a simple post-concatenation into a table concatenation and very expensive table scan.The plan itself is not slow, but you are selecting a huge number of results. If so, then there's not much you can do other than to try to limit the result size (normalize?).
Edit - I just reread the question and noticed this line:
We do a union all of the similar data inside a view
The problem is definitely #2 - what you're essentially doing is concatenating the entire tables and then applying predicates, which destroys any chance the optimizer might have had of effectively using any indexes on the underlying tables. Every query will end up as 5 expensive table scans.
The only way you're going to solve this is by either materializing the view (which you say you can't do) or getting rid of it and substituting a stored procedure or UDF. You need to treat individual tables individually for best performance; concatenating them all together and trying to query against the concatenated results will always be (very) slow.
Try changing the UNION to UNION ALL
UNION is more expensive than UNION ALL because it will remove duplicates
Also do you have indexes on those columns and are the queries that you are writing SARGable, if you for example have to look in the middle of a column then indexes won't be used
You need to classify your tables in order of importance and level. You start with your basic table and you build on top of that.
Depending on the size of the other tables, and they information they pertain you join them differently. The order in which you join also matters.
If you could please post a little more about your tables, what they contain, how they are used etc, I would be able to give u an example.
More specifically, do all 5 tables contain the same information, and u join them, to query a 'super set' of data? or are they 5 tables with various attributes that together make a resultset that you need?
It all depends a bit on the way you query the view, and what kind of data is in there => do the 5 tables contain 'separate' data, or is there overlap that needs to be removed again from the end-result ? (if there is no overlap, make sure to use UNION ALL)
Assuming you query the view using WHERE statements :
Assuming the queries are 'predictable', make sure to have proper indexes
If there is no overlap between the tables, are there fields that clearly distinguish what data comes from which table ? You might want to add CHECK constraints on said tables and try to make the view act like an "old-school" partitioned view. (see here)
And as always, make sure statistics are up-to-date, run a manual update after big (or many) changes to a table (simply running sp_updatestats now end then never hurts IMHO) !
精彩评论