When optimizing database queries, what exactly is the relationship between number of queries and size of queries?

To optimize application speed, the standard advice is to minimize the number of queries an application makes to the database, consolidating them into fewer queries that each retrieve more, wherever possible.

However, this advice always comes with the caution that data transferred is still data transferred: making fewer queries doesn't make the data they transfer free.

I'm in a situation where I can over-include on the query in order to cut down the number of queries, and simply remove the unwanted data in the application code.

Is there any rule of thumb for how much each query costs, to know when to optimize the number of queries versus the size of queries? I've tried to Google for objective performance analysis data, but surprisingly haven't been able to find anything like that.

Clearly this relationship will change with factors such as database size, making this somewhat individualized, but surely it is not so individualized that a broad sense of the landscape can't be drawn out?

I'm looking for general answers, but for what it's worth, I'm running an application on Heroku.com, which means Ruby on Rails with a Postgres database.


I'm firmly in the "get only what you need when you need it" camp.

Retrieving extra rows that you may or may not need (let's say, retrieving full order details when loading an order summary screen, just in case the user drills down) just results in a much more complex query, probably joining tables that won't be used most of the time.

As a DBA, the hardest queries to optimize are the ones that join a large number of tables together.

Retrieving extra columns isn't quite as bad, but if you ask for only a few key columns, the server can sometimes answer directly from a "covering index" rather than having to retrieve all columns from the base table.

I think the key to the advice you've heard is to avoid unnecessary round trips when you can get it all at once, rather than, as you seem to be describing, to "get extra data just in case you need it".

Developers are so used to "modularizing" everything that it's not at all unusual to end up with a single web page that makes hundreds or even thousands of calls to the database each time it loads. We have a commercial product in-house that we've measured making over 50,000 calls to the database for a single action.

For an example (somewhat contrived), let's say you have an "Order Summary" page that includes an "order total" field, which is the sum of all items in the "Order Detail" table. The wrong approach is:

  1. Retrieve the list of orders from the Order Header table
  2. Programmatically loop through the orders
  3. For each order, execute a query to retrieve all order detail records
  4. Programmatically add up the order items to get the total, which is displayed in the grid

Sounds crazy, right? This is more common than you think, especially when you build data-bound logic into individual web-components. Much more efficient:

  1. Make a single call to the database, with a query something like:

    SELECT oh.OrderID, oh.OrderDate, SUM(od.LineTotal) AS OrderTotal
    FROM OrderHeader oh
    INNER JOIN OrderDetail od ON oh.OrderID = od.OrderID
    GROUP BY oh.OrderID, oh.OrderDate
    
  2. Display the results in the grid.
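To make the round-trip cost concrete, here is a plain-Ruby sketch of the two approaches above. There is no real database here: the `FakeDb` class, the data shapes, and the row counts are all invented for illustration, and the point is only to count round trips.

```ruby
# Plain-Ruby sketch (no real database; FakeDb and the data shapes are
# invented for illustration) counting round trips under each approach.
class FakeDb
  attr_reader :round_trips

  def initialize(orders, details)
    @orders, @details, @round_trips = orders, details, 0
  end

  # One round trip: fetch all order headers.
  def all_orders
    @round_trips += 1
    @orders
  end

  # One round trip per call: fetch the detail rows for a single order.
  def details_for(order_id)
    @round_trips += 1
    @details.select { |d| d[:order_id] == order_id }
  end

  # One round trip total: join and aggregate, like the SQL above.
  def order_totals
    @round_trips += 1
    @details.group_by { |d| d[:order_id] }
            .transform_values { |rows| rows.sum { |r| r[:line_total] } }
  end
end

orders  = (1..100).map { |i| { id: i } }
details = orders.flat_map { |o| [{ order_id: o[:id], line_total: 5 },
                                 { order_id: o[:id], line_total: 7 }] }

# Wrong approach: 1 query for the headers + 1 per order = 101 round trips.
db1 = FakeDb.new(orders, details)
totals1 = db1.all_orders.to_h do |o|
  [o[:id], db1.details_for(o[:id]).sum { |d| d[:line_total] }]
end

# Right approach: a single aggregated query = 1 round trip.
db2 = FakeDb.new(orders, details)
totals2 = db2.order_totals

puts db1.round_trips   # 101
puts db2.round_trips   # 1
puts totals1 == totals2
```

Both versions compute identical totals; the only difference is how many times the application crosses the wire to the database, which is exactly the cost the advice is about.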


If you're looking for a rule of thumb: whenever possible, filter, sort and page in the database query. The database is optimized for these kinds of operations (set operations).
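As an illustration of that rule, here is a plain-Ruby comparison (again no real database; the table and column names are invented) of paging in the query versus paging in application code. Both produce the same page, but "paging in the application" means every row is transferred first:

```ruby
# Plain-Ruby illustration (no real database; the data is invented) of why
# filter/sort/page belongs in the query: paging in the application
# transfers every row first.
rows = (1..10_000).map { |i| { id: i, status: i.even? ? "open" : "closed" } }

# Database-side (what WHERE / ORDER BY / LIMIT / OFFSET do): the server
# scans, but only the requested page crosses the wire.
page_from_db = rows.select { |r| r[:status] == "open" }
                   .sort_by { |r| -r[:id] }
                   .drop(50).first(25)          # OFFSET 50 LIMIT 25

# Application-side: all 10,000 rows cross the wire, then Ruby filters.
transferred = rows                              # the whole table is transferred
page_in_app = transferred.select { |r| r[:status] == "open" }
                         .sort_by { |r| -r[:id] }[50, 25]

puts page_from_db == page_in_app   # same 25-row result either way
puts transferred.size              # 10000 rows moved to get those 25
```

In Rails terms, this is the difference between chaining `where`/`order`/`limit`/`offset` on an Active Record relation (so Postgres does the set operations) and calling `.all` and then filtering the resulting array in Ruby.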

Application code is best reserved for true business logic (and display logic, etc.).
