SQL Server: Are temp tables or unions better?
Generally speaking, for combining a lot of data is it better to use a temp table/temp variable as a staging area or should I just stick to "UNION ALL"?
Assumptions:
- No further processing is needed, the results are sent directly to the client.
- The client waits for the complete recordset, so streaming results isn't nece开发者_如何学JAVAssary.
I would stick to UNION ALL. If there's no need to do intermediary processing, thus requiring a temp table, then I would not use one.
Inserting data into a temp table (even if it's a table variable which despite the myths, is not a purely "in memory" structure) will involve work in tempdb (which can be a bottleneck). To then just SELECT * as-is and return it without any special processing is unnecessary and I think bloats the code. When you just need to return data without any special processing, then a temp table approach seems a bit "round the houses". If I thought there was a reason to justify the use of a temp table, I would run some like-for-like performance tests to compare with vs without temp tables - then compare the stats (duration, reads, writes, CPU). Doing actual performance tests is the best way to be as confident as possible that whatever approach you choose, is the best. Especially as you don't have to be using temp tables for there to be work pushed over into tempdb - i.e. depending on your queries, it might involve work in tempdb anyway.
To clarify, I'm not saying one is better than the other full stop. As with most things, it depends on scenario. In the scenario described, it just sounds like you'd be adding in an extra step which doesn't seem to add any functional value and I can't see you'd gain anything other than creating a slightly more complicated/lengthy query.
One advantage with temp tables i can think of is that you can apply indexes to them. So that should help when dealing with lots of data where you need to get results back as quick as possible.
For what it is worth, I just did a performance comparison between two approaches to retrieve identical datasets:
SELECT c1, c2, c3 FROM ... ON ... WHERE
UNION ALL
SELECT c1, c2, c3 FROM ... ON ... WHERE /*(repeated 8 times)*/
vs
CREATE TABLE #Temp (c1 int, c2 varchar(20), c3 bit)
INSERT INTO #Temp (c1, c2, c3) SELECT (c1,c2,c3) FROM ... WHERE... /*(repeat 8 times)*/
SELECT c1, c2, c3 FROM #Temp
The second approach (the temporary table) was about 5% slower than the union, and when I artificially scaled up the number of repeats, the second approach became even slower.
Not specific to union all..
Use of temp table might have an advantage from a concurrency POV depending on query, isolation level and performance of clients/net link where use of a temp table could serve to minimize read lock times. Just don't use SELECT ..INTO.. to create the table.
In the general case UNION ALL avoids overhead of an unecessary work table.
I tend to use only UNION ALL where I have a limited number of UNIONS - and a relatively limited number of columns returned, table typed variables are another possibility (especially for 2014 on) - and allow you to enforce commonality of structure if similar result sets are built in more than one location.
UNION ALL avoids intermediate steps but: 1) it can lead to bloated, hard to maintain code 2) it can lead to unmanageable query plans - if they get too big then the plan viewing tools in sql server can't actually view them 3) if parts of a complex union are similar, or may be used elsewhere in your system consider using table valued functions or stored procs to facillitate code re-use whether you go for TTV, UNION ALL or Temp Tables
精彩评论