SQL 2005 CTE vs TEMP table Performance when used in joins of other tables
I have a complex query that I need to use in a subsequent query (actually update statement). I have tried both using a CTE and a temp table. The performance using the CTE is horrible vs the temp table approach. Its something like 15 seconds vs milliseconds. To simplify the test instead of joining the CTE/Temp table in the subsequent query I simply s开发者_如何学Celected * from it. In that case they perform the same.
I Have Looked at The Execution Plan for both approaches both with the joins in the subsequent query and then simply select *. With the simple select the query plans are about the same, but with the joins in the subsequent select the query plans are not. Specifically the portion of the query plan for creating and populating the temp table stays the same, while the query plan portion for creating and populating the CTE changes dramatically when it is subsequently used in a query with a join.
My question is why does the query plan for the creation and population of the CTE change by how it is subsequently used while the temp table is not. Also in what scenarios then would a CTE yield better performance than a temp table?
*Note I have used a table variable as well and it is comparable to the temp table approach.
Thanks
CTE
is just an alias for the query.
It may (or may not) be rerun each time it's used.
There is no clean way to force CTE
materialization in SQL Server
(like Oracle's /*+ MATERIALIZE */
), and you have to do dirty tricks like this:
- Generating XML in subqueries
CTE
may improve performance if used in plans requiring only one evaluation (like HASH JOIN
, MERGE JOIN
etc.).
In these scenarios, the hash table will be built right from the CTE
, while using the temp table will require evaluating the CTE
, pulling the results into the temp table and reading the temp table once again.
You're asking a complicated question, so you're getting a complicated answer: it depends. (I hate that response).
Seriously, however, it has to do with how the optimizer chooses a data plan (which you knew already); a temp table or variable is like a permanent structure in that an execution plan will perform the operation associated with filling that structure first, and then use that structure in subsequent operations. A CTE is NOT a temp table; use of the CTE is not calculated until it is being used by subsequent operations, and so that usage impacts how the plan is optimized.
CTE's were implemented for reusability and maintenance issues, not necessarily performance; however, in many cases (like recursion), they will perform better than traditional coding methods.
I find that typically a repeated CTE gets no performance improvements.
So for instance, if you use a CTE to populate a table and then the same CTE to join to in a later query, no benefit. Unfortunately, CTEs are not snapshots and they literally have to be repeated to be used in two separate statements, so they tend to be evaluated twice.
Instead of CTEs, I often use inline TVFs (which may contain CTEs), which allows proper re-use, and are not any better or worse than CTEs in my SPs.
In addition, I also find that the execution plan can be bad if the first step alters the statistics such that the execution plan for the second step is always inaccurate because it is evaluated before any steps are run.
In this case, I look at manually storing intermediate results, ensuring that they are indexed properly and splitting up the process into multiple SPs and adding WITH RECOMPILE to ensure that later SPs have plans that are good for the data which they are actually going to operate on.
I tried creating CTE with simple selected with filter from big table Then 3 times subqueried it.
After that do the same with temporary tables.
The result was 70% time consuming for CTE -30% time consuming for temp table. So temp table is better for that solutions.
I don't think CTE makes a temp table only with selected query, but 3 times make select to a big table.
精彩评论