database, requests, performance, cache

2023-02-05 06:01

I need some input on how to design a database layer.

In my application I have a List of T. Each T holds information from multiple database tables.

There are of course multiple ways to do this. Two ways that I can think of are:

chatty database layer and cacheable:

List<SomeX> list = new List<SomeX>();
foreach(...) {
    list.Add(new SomeX() {
        prop1 = dataRow["someId1"],
        prop2 = GetSomeValueFromCacheOrDb(dataRow["someId2"])
    });
}

The problem I see with the above is that a list of 500 items could potentially mean 500 database requests, each one paying network latency. Another problem is that a user could have been deleted after we got the list from the database but before we fetch that user from the cache/db, which leaves us with nulls that we have to handle manually. The good thing is that it's highly cacheable.

non chatty but not cacheable:

List<SomeX> list = new List<SomeX>();
foreach(...) {
    list.Add(new SomeX() {
        prop1 = dataRow["someId1"],
        prop2 = dataRow["someValue"]
    });
}

The problem I see with the above is that it's hard to cache, since potentially every user has a unique list. The other problem is that it needs a lot of joins, which could result in a lot of reads against the database. The good thing is that we know for sure that all the information exists once the query has run (inner join etc.).

not so chatty, but still cacheable:

A third option could be to first loop through the data rows, and collect all necessary someId2 and then make one more database request to get all the SomeId2 values.
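A rough sketch of that third option, as I read it. The tuple row shape, the `BatchLoader` name, the `cache` dictionary and the `fetchMany` delegate are all assumptions made for illustration; `fetchMany` stands in for whatever single query (for example a `WHERE id IN (...)`) loads the values for a set of ids.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SomeX
{
    public int Prop1 { get; set; }
    public string Prop2 { get; set; }
}

public static class BatchLoader
{
    // rows: (someId1, someId2) pairs from the first query (hypothetical shape).
    // cache: values already known, keyed by someId2.
    // fetchMany: hypothetical delegate that loads values for many ids in ONE
    // request and returns only the ids that still exist.
    public static List<SomeX> Build(
        IReadOnlyList<(int SomeId1, int SomeId2)> rows,
        IDictionary<int, string> cache,
        Func<IReadOnlyCollection<int>, IDictionary<int, string>> fetchMany)
    {
        // Pass 1: collect the distinct ids that are not cached yet.
        var missing = rows.Select(r => r.SomeId2)
                          .Distinct()
                          .Where(id => !cache.ContainsKey(id))
                          .ToList();

        // At most one extra round trip, instead of one per row.
        if (missing.Count > 0)
        {
            foreach (var pair in fetchMany(missing))
            {
                cache[pair.Key] = pair.Value;
            }
        }

        // Pass 2: assemble the list. Rows whose related record vanished
        // between the two queries are skipped rather than filled with null.
        var list = new List<SomeX>();
        foreach (var row in rows)
        {
            if (cache.TryGetValue(row.SomeId2, out var value))
            {
                list.Add(new SomeX { Prop1 = row.SomeId1, Prop2 = value });
            }
        }
        return list;
    }
}
```

Skipping the vanished rows is only one possible policy; substituting a placeholder value would work just as well, depending on what the UI expects.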

"The problem that I see with the above is that if we want a list of 500 items, it could potentially make 500 database requests. With all the network latency and that."

True. It could also create unnecessary contention and consume server resources maintaining locks as you iterate over a query.

"Another problem is that the users could have been deleted after we got the list from the database but before we are trying to get it from cache/db, which means that we will have null-problems."

If I take that quote, then this quote:

"The good thing is that it's highly cacheable."

is not true, because you've cached stale data. So strike off the only advantage so far.

But to directly answer your question: the most efficient design, which seems to be what you are asking for, is to use the database for what it is good at. That means enforcing ACID compliance and constraints, most notably primary and foreign keys, and also returning aggregated answers to cut down on round trips and wasted cycles on the app side.

This means you either put SQL into your app code, which has been ruled to be Infinite Bad Taste by the Code Thought Police, or go to stored procedures (sprocs). Either one works. Putting the SQL in the app makes it more maintainable, but you'll never be invited to any more elegant OOP parties.

Some suggestions:

SQL is a set-based language, so don't design things around iterating in loops. Even in stored procedures I still see cursors now and then where a set-based query would solve the issue. So always try to get the information with one query. Sometimes this isn't possible, but in the majority of cases it is. You can also design views to make querying easier: if your schema has many tables, a view can pull the needed information with one statement.

Use proxies. Let's say I have an object with 50 properties. At first you display a list of objects to the user. In this case, I would create a proxy with only the most important properties, maybe two or three like name, ID, etc., and display that to the user. This cuts down on the amount of information sent initially. When the user actually wants to edit or change the object, make a second query to get the "full" object. Only get what you need. This is especially important over the web when serializing XML between the layers.
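A minimal sketch of the proxy idea, assuming a hypothetical `Customer` type and store interface (none of these names come from the question). In a real data layer the summary would come from a query that selects only the two or three columns; the in-memory store here exists only so the example can run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The "full" object: imagine about 50 properties, only a few shown.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
    public string Notes { get; set; }
}

// The proxy: just enough to render a list.
public class CustomerSummary
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public interface ICustomerStore
{
    // Backed by something like: SELECT Id, Name FROM ...
    IReadOnlyList<CustomerSummary> ListSummaries();

    // Backed by a second query, run only when the user opens one item.
    Customer GetFull(int id);
}

// In-memory stand-in for a database-backed implementation.
public class InMemoryCustomerStore : ICustomerStore
{
    private readonly Dictionary<int, Customer> _byId;

    public InMemoryCustomerStore(IEnumerable<Customer> customers)
    {
        _byId = customers.ToDictionary(c => c.Id);
    }

    public IReadOnlyList<CustomerSummary> ListSummaries() =>
        _byId.Values
             .Select(c => new CustomerSummary { Id = c.Id, Name = c.Name })
             .ToList();

    // Returns null when the id is unknown.
    public Customer GetFull(int id) =>
        _byId.TryGetValue(id, out var customer) ? customer : null;
}
```

The list screen would call `ListSummaries()` once, and the edit screen would call `GetFull(id)` for the single item the user picked.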

Come up with a paging strategy. Most systems work fine until they get a lot of data, and then the query grinds to a halt because it is returning thousands of rows. Page early and often. If you are doing a web application, paging directly in the database will probably be the most performant, because only the paged data is sent between the layers.
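One way this could look, as a sketch only. The table and column names in the query text are made up, and the `OFFSET ... FETCH NEXT` form assumes SQL Server 2012 or later; other databases have their own paging syntax. The `Page` helper shows the same arithmetic over an in-memory sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Hypothetical database-side paging query. A stable ORDER BY is needed
    // so that pages do not overlap or skip rows.
    public const string PagedQuery =
        "SELECT someId1, someValue FROM SomeTable " +
        "ORDER BY someId1 " +
        "OFFSET @offset ROWS FETCH NEXT @pageSize ROWS ONLY";

    // Value for the @offset parameter; pageNumber is 1-based.
    public static int OffsetFor(int pageNumber, int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException(nameof(pageNumber));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));
        return (pageNumber - 1) * pageSize;
    }

    // Same paging applied to an already ordered in-memory sequence.
    public static List<T> Page<T>(IEnumerable<T> orderedSource, int pageNumber, int pageSize)
    {
        return orderedSource
            .Skip(OffsetFor(pageNumber, pageSize))
            .Take(pageSize)
            .ToList();
    }
}
```

For page 3 with 20 rows per page, `OffsetFor(3, 20)` gives 40, so the database skips the first 40 rows and returns at most the next 20.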

Data caching depends on the data. For highly volatile data (changing all the time), caching isn't worth it. But for semi-volatile or non-volatile data, caching can be worth it, though you have to manage the cache, either directly or indirectly if you are using a built-in framework.

A good place to use a cache is, say, a zip codes table. Certainly, those don't change that often, and you could cache them to boost performance if you had a zip code drop-down in your application. This is just an example, but caching IMO depends on the type of data.
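For slowly changing reference data like that, a hand-rolled cache could be as small as the sketch below. `ReferenceDataCache`, the `load` delegate and the injectable clock are illustrative assumptions, not part of any framework; a built-in caching library would normally replace this.

```csharp
using System;
using System.Collections.Generic;

// Caches one slowly changing list (for example zip codes) and reloads it
// after a fixed time-to-live.
public class ReferenceDataCache<T>
{
    private readonly Func<IReadOnlyList<T>> _load;   // hypothetical loader, e.g. one SELECT
    private readonly TimeSpan _ttl;
    private readonly Func<DateTime> _now;            // injectable clock, handy for tests
    private readonly object _gate = new object();
    private IReadOnlyList<T> _items;
    private DateTime _loadedAt;

    public ReferenceDataCache(Func<IReadOnlyList<T>> load, TimeSpan ttl, Func<DateTime> now = null)
    {
        _load = load ?? throw new ArgumentNullException(nameof(load));
        _ttl = ttl;
        if (now == null)
        {
            _now = () => DateTime.UtcNow;
        }
        else
        {
            _now = now;
        }
    }

    public IReadOnlyList<T> Get()
    {
        // The lock keeps concurrent callers from loading the list twice.
        lock (_gate)
        {
            var now = _now();
            if (_items == null || now - _loadedAt >= _ttl)
            {
                _items = _load();
                _loadedAt = now;
            }
            return _items;
        }
    }
}
```

A zip code drop-down would then call `Get()` on every render and hit the database only when the time-to-live has expired.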
