Basic question: Querying data and performance tradeoffs
Let's say I have 100 rows in my table, with 3 columns of numbers. I don't need all the rows, only about half of them each time I fetch data. I only want the rows that have been updated, since fetching the rest would be redundant.
Is it better to add a datetime field that records when each row was last updated (and use it as a criterion when SELECTing)? Or would it be better to simply download all the data every time? (Currently the data is being sent back as JSON.)
What are the tradeoffs in terms of speed, bandwidth usage, and server CPU usage between these two options? Is the former just plain better than the latter?
Both Jens Struwe and roycl are right - but as you're asking a hypothetical question, you're going to get answers that are right and contradictory.
If only half the data is relevant, how is the client going to determine which data to show? If the decision can be made by software at all, it's more efficient to do it in the database, and it's also more logical.
With tables of 100 rows, performance is neither here nor there; maintainability and long-term upgradability are a far bigger deal. Most developers would expect a logical database design, with sorting and filtering done on the DB rather than on the client.
Always (or at least whenever possible) select only the data you need to accomplish your task. Conversely: never select data that you then have to filter out. In short: add a timestamp field for updates and select only those rows whose timestamp is greater than the given one.
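A minimal sketch of this approach (the table and column names are hypothetical, and ON UPDATE CURRENT_TIMESTAMP is MySQL syntax; other databases would need a trigger or application code to maintain the column):

    -- Add a column that tracks when each row was last changed.
    ALTER TABLE readings
        ADD COLUMN dt_modified TIMESTAMP
            DEFAULT CURRENT_TIMESTAMP
            ON UPDATE CURRENT_TIMESTAMP;

    -- The client remembers when it last fetched and binds that time to ?:
    SELECT id, col_a, col_b, col_c, dt_modified
    FROM readings
    WHERE dt_modified > ?;

The client then stores the largest dt_modified it received and uses it as the cutoff for the next fetch.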
With 100 rows in your table and 3 columns of numbers, it really doesn't matter which approach you use: either way the server will return the data within a few tens of milliseconds. The rows, if queried frequently, will all be in memory anyway. Sending everything also makes your JSON code simpler and your client code dumber (which is probably good, and more maintainable).
If you had a several-million-row table where only a small percentage of the data was required, you would naturally want to limit the result set, and the easiest way of doing that is with an SQL WHERE clause, such as WHERE dt_modified > my_timestamp. On a properly optimised database even this query should come in well under 100ms.
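For that query to stay fast as the table grows, the timestamp column would normally be indexed; a sketch, again with hypothetical names:

    -- An index lets the database seek straight to the recently modified
    -- rows instead of scanning the whole table on every fetch.
    CREATE INDEX idx_dt_modified ON readings (dt_modified);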
The issue may have more to do with the time the data spends on the wire, and with how much time the client spends either regenerating the page or updating it based on the returned data. Client processing time is often the slowest part of the process. Only testing on different browsers and over different network speeds will find the best balance between server-side tweaks, network fixes (such as gzipping to compress the data), and optimising your JavaScript calls.