DB performance: multiple connections vs joins/subselects
Let's say I need to issue a number of queries against a relational database. Which of the following would be better from a performance perspective?
1. Issue them one by one, using data from the first queries as input to the later ones (meaning multiple connections are made, but with fewer joins/subselects).
2. Batch the commands together (meaning only one connection, but more joins/subselects in the actual queries).
I'm hoping and thinking number 2 here, but would like it confirmed and maybe some arguments to back it up...
I'm using SQL Server 2008, but I guess this question should be generic for most db platforms(?).
EDIT: I'm aware of the fact that this is a very general question and as such I'm looking for general input/answers.
I would avoid a chatty design (multiple DB round trips) as much as possible. Network latency is such a performance killer that given the 2 choices you provided, option #2 is the easy choice. However, I'm assuming the joins/subselects are not out of the ordinary.
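To make the "chatty" versus batched contrast concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` with an in-memory database rather than SQL Server, and the `customers`/`orders` schema and both function names are invented for illustration. Because an in-memory database has no network latency, the sketch counts round trips instead of timing them.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")


def totals_chatty(conn):
    """Option 1: one query for the customers, then one more per customer."""
    customers = conn.execute(
        "SELECT id, name FROM customers ORDER BY id"
    ).fetchall()
    round_trips = 1
    result = {}
    for cust_id, name in customers:
        row = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (cust_id,),
        ).fetchone()
        round_trips += 1
        result[name] = row[0]
    return result, round_trips


def totals_joined(conn):
    """Option 2: a single statement that lets the database do the join."""
    rows = conn.execute("""
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """).fetchall()
    return dict(rows), 1


chatty, n_chatty = totals_chatty(conn)
joined, n_joined = totals_joined(conn)
print(chatty == joined, n_chatty, n_joined)
```

Both functions return the same totals, but the first needs one round trip plus one per customer, so its cost grows with the number of rows, while the second always needs one. Against a real server, each of those extra round trips pays the network latency described above.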
There's really no one-size-fits-all answer to this question.
Sometimes it's better to write several simpler queries to get the result incrementally as you describe in 1.
Sometimes it's better to write a complex SQL query as in 2, and let the RDBMS' optimizer decide how to get the result most efficiently.
Without knowing anything about the nature of the problems you're trying to solve, the most general advice is that you should try both techniques and measure the performance against a realistic data set. Measuring means both taking raw timings and analyzing the optimizer's query plan.
SQL Server 2008 has tools to help analyze queries and give recommendations for performance optimization, too.
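As a rough illustration of "try both and measure", the sketch below times an incremental approach against a single joined query and then prints the optimizer's plan for the joined one. It is only a sketch under stated assumptions: it uses Python's `sqlite3` with an in-memory database instead of SQL Server, the schema, data volumes, and function names are made up, and SQLite's `EXPLAIN QUERY PLAN` stands in for an execution plan you would inspect with SQL Server's own tools.

```python
import sqlite3
import time

# Hypothetical schema and synthetic data; a real test should use realistic data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE INDEX idx_orders_customer ON orders (customer_id);
""")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"customer-{i}") for i in range(1, 501)],
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 500 + 1, float(i % 17)) for i in range(5000)],
)

JOINED_SQL = """
    SELECT c.id, COALESCE(SUM(o.total), 0)
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
"""


def run_incremental(conn):
    """Several simpler queries: fetch the ids, then one query per id."""
    ids = [row[0] for row in conn.execute("SELECT id FROM customers")]
    return {
        cid: conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (cid,),
        ).fetchone()[0]
        for cid in ids
    }


def run_joined(conn):
    """One complex query; the optimizer decides how to execute it."""
    return dict(conn.execute(JOINED_SQL).fetchall())


def best_time(func, conn, repeats=5):
    """Return the fastest of several runs to reduce timing noise."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(conn)
        timings.append(time.perf_counter() - start)
    return min(timings)


incremental_s = best_time(run_incremental, conn)
joined_s = best_time(run_joined, conn)
plan = conn.execute("EXPLAIN QUERY PLAN " + JOINED_SQL).fetchall()

print(f"incremental: {incremental_s:.6f}s  joined: {joined_s:.6f}s")
for step in plan:
    print(step)
```

The numbers from an in-memory database say nothing about network latency, so they are not a verdict on either option. The point is the workflow: run both variants against the same data, compare timings, and read the plan to see whether the optimizer uses the index you expect.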