Optimize database for web usage (lots more reading than writing)

2022-12-17 09:34

I am trying to lay out the tables for a new public-facing website. Since there will be far more reading than writing of data (I am guessing more than 85% reads), I would like to optimize the database for reading.

Whenever we list members, we are planning on showing summary information about them, something akin to the reputation points and badges that Stack Overflow uses. Instead of doing a subquery to find the information each time we do a search, I wanted to have a "calculated" field in the member table.

Whenever an action is initiated that would affect this field, say the member gets more points, we simply update this field by running a query to calculate the new values.

Obviously, there would be the need to keep this field up to date, but even if the field gets out of sync, we can always rerun the query to update this field.

My question: Is this an appropriate approach to optimizing the database? Or are the subqueries fast enough that performance would not suffer?

There are two parts:

- Caching
- Tuned query
  1. Indexed views (AKA materialized views)
  2. Tuned table

The best solution requires querying the database as little as possible, which would require caching. But you still need a query to fill that cache, and the cache needs to be refreshed when it is stale...
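To make the fill-and-refresh cycle concrete, here is a minimal sketch of an in-process cache whose entries expire after a fixed time. Everything in it is an assumption for illustration: the TTLCache class, the load_member_summary function standing in for the real summary query, and the 60-second lifetime are hypothetical, and a real site would more likely use a shared cache server.

```python
import time


class TTLCache:
    """Tiny in-process cache: an entry is stale once its TTL has passed."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: the database is not queried
        value = loader(key)  # miss or stale entry: run the query once
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry early, e.g. right after the member gains points."""
        self._store.pop(key, None)


queries_run = []  # records each simulated database query


def load_member_summary(member_id):
    # Hypothetical stand-in for the real summary query.
    queries_run.append(member_id)
    return {"member_id": member_id, "points": 120}


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_load(1, load_member_summary)
second = cache.get_or_load(1, load_member_summary)  # served from the cache
```

The expiry time bounds how long a stale value can be shown, and calling invalidate from the write path narrows that window further for members whose data just changed.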

Indexed views are the next consideration. Because they are indexed, querying against them is faster than querying an ordinary view (which is equivalent to a subquery). Nonclustered indexes can be applied to indexed views as well. The problem is that indexed views (and materialized views in general) are very constrained in what they support: they can't contain non-deterministic functions (e.g. GETDATE()), they have extremely limited aggregate support, and so on.

If what you need can't be handled by an indexed view, the next alternative is a table into which the data is dumped and then refreshed via a SQL Server job. As with the indexed view, indexes would be applied to make fetching data faster. But changing data means maintaining those indexes so that queries keep running as well as they can, and this maintenance can take time.
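A rough sketch of that dump-and-refresh table, using Python's built-in sqlite3 module in place of SQL Server so that it runs anywhere. The table and column names (members, point_events, badges, member_summary) are made up for illustration, and refresh_member_summary stands in for whatever the scheduled job would execute.

```python
import sqlite3


def refresh_member_summary(conn):
    """Rebuild the summary table from the source rows in one transaction."""
    with conn:  # commits on success, rolls back if a statement fails
        conn.execute("DELETE FROM member_summary")
        conn.execute(
            """
            INSERT INTO member_summary (member_id, total_points, badge_count)
            SELECT m.id,
                   COALESCE((SELECT SUM(p.points) FROM point_events p
                             WHERE p.member_id = m.id), 0),
                   (SELECT COUNT(*) FROM badges b WHERE b.member_id = m.id)
            FROM members m
            """
        )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE point_events (member_id INTEGER NOT NULL, points INTEGER NOT NULL);
    CREATE TABLE badges (member_id INTEGER NOT NULL, badge TEXT NOT NULL);
    CREATE TABLE member_summary (
        member_id    INTEGER PRIMARY KEY,
        total_points INTEGER NOT NULL,
        badge_count  INTEGER NOT NULL
    );
    -- Index for listing members by score without touching the source tables.
    CREATE INDEX ix_member_summary_points ON member_summary (total_points);
    INSERT INTO members VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO point_events VALUES (1, 10), (1, 15), (2, 5);
    INSERT INTO badges VALUES (1, 'editor');
    """
)

refresh_member_summary(conn)  # what the scheduled job would call
rows = conn.execute(
    "SELECT member_id, total_points, badge_count "
    "FROM member_summary ORDER BY member_id"
).fetchall()
```

Member listings then read only member_summary. The trade-off is that the figures are only as fresh as the last run of the job.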

The least expensive database query is the one that you don't have to run against the database at all.

In the scenario you describe, using a high-performance cache technology (for example, memcached) to store query results in your application can be a much better strategy than trying to trick out the database to be highly scalable.

The First Rule of Program Optimization: Don't do it.
The Second Rule of Program Optimization (for experts only!): Don't do it yet.

Michael A. Jackson

If you are just designing the tables, I'd say it's definitely premature to optimize. You might want to redesign your database a few days later, you might find out that things work pretty fast without any clever hacks, or you might find out they are slow, but in a different way than you expected. In any of these cases you would waste your time if you started optimizing now.

The approach you describe is generally fine; you could keep some pre-computed values, either using triggers or stored procedures to preserve data consistency, or running a job to update these values from time to time.
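As a minimal sketch of the trigger variant, the following uses SQLite through Python's sqlite3 module rather than SQL Server, so the trigger syntax differs from T-SQL. The members and point_events tables, the total_points column, and the recompute_totals helper are all hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE members (
        id           INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        total_points INTEGER NOT NULL DEFAULT 0  -- the pre-computed value
    );
    CREATE TABLE point_events (
        member_id INTEGER NOT NULL REFERENCES members(id),
        points    INTEGER NOT NULL
    );
    -- Keep the stored total in step with every new point event.
    CREATE TRIGGER trg_point_events_insert AFTER INSERT ON point_events
    BEGIN
        UPDATE members
        SET total_points = total_points + NEW.points
        WHERE id = NEW.member_id;
    END;
    """
)


def recompute_totals(conn):
    """Repair path: recalculate every stored total from the source rows."""
    with conn:
        conn.execute(
            """
            UPDATE members
            SET total_points = COALESCE(
                (SELECT SUM(points) FROM point_events
                 WHERE member_id = members.id), 0)
            """
        )


conn.execute("INSERT INTO members (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO point_events VALUES (1, 10)")
conn.execute("INSERT INTO point_events VALUES (1, 5)")
conn.commit()
total = conn.execute("SELECT total_points FROM members WHERE id = 1").fetchone()[0]
```

This sketch only covers inserts; if point events can be edited or deleted, matching UPDATE and DELETE triggers would be needed. recompute_totals is the "rerun the query" fallback from the question, for when the stored value drifts out of sync.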

Most databases are more than 85% reads! The figure is usually in the high nineties.

Tune it when you need to and not before.
