
Best-practice caching: monolithic vs. fine-grained cache data

In a distributed caching scenario, is it generally advised to use or avoid monolithic objects stored in cache?

I'm working with a service backed by an EAV schema, so we're putting caching in place to minimize the perceived performance deficit imposed by EAV when retrieving all primary records and respective attribute collections from the database. We will prime the cache on service startup.

We don't have particularly frequent calls for all products -- clients call for differentials after they first populate their local cache with the object map. To perform that differential, the distributed cache will need to reflect changes made to individual records in the database on an arbitrary basis, and be processed for changes whenever a client requests a differential.

First thought was to use a List or Dictionary to store the records in the distributed cache -- get the whole collection, manipulate or search it in-memory locally, put the whole collection back into the cache. Later thinking, however, led to the idea of populating the cache with individual records, each keyed in a way to make them individually retrievable from/updatable to the cache. This led to wondering which method would be more performant when it comes to updating all data.
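
For concreteness, here's a rough sketch of the two options against the AppFabric client API. The `Product` type, the "ProductCatalog" cache name, and the key prefixes are just placeholders for illustration, not our real schema:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ApplicationServer.Caching;

[Serializable]
public class Product
{
    public int Id { get; set; }
    public Dictionary<string, string> Attributes { get; set; }
}

public class CatalogCache
{
    private readonly DataCache _cache =
        new DataCacheFactory().GetCache("ProductCatalog");

    // Option 1: the whole catalog as one monolithic cache object.
    public void PutAll(Dictionary<int, Product> catalog)
    {
        _cache.Put("AllProducts", catalog);
    }

    public Dictionary<int, Product> GetAll()
    {
        return (Dictionary<int, Product>)_cache.Get("AllProducts");
    }

    // Option 2: one cache entry per record, keyed so each item is
    // individually retrievable and updatable.
    public void PutOne(Product p)
    {
        _cache.Put("Product:" + p.Id, p);
    }

    public Product GetOne(int id)
    {
        return (Product)_cache.Get("Product:" + id);
    }
}
```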

We're using Windows Server AppFabric, so we have a BulkGet operation available to us. I don't believe there's any notion of a bulk update however.
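
So with the per-item approach, reads can be batched but updates degrade to one Put per key. Roughly (reusing the placeholder `Product` type and key prefix from above; the exact `BulkGet` overload and return type should be checked against the AppFabric version in use):

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.ApplicationServer.Caching;

public static class CatalogBulkOps
{
    // BulkGet batches the reads for a set of keys.
    public static Dictionary<int, Product> GetMany(DataCache cache, IEnumerable<int> ids)
    {
        var keys = ids.Select(id => "Product:" + id).ToList();
        var results = new Dictionary<int, Product>();

        foreach (var pair in cache.BulkGet(keys))
        {
            var product = pair.Value as Product;
            if (product != null)
            {
                results[product.Id] = product;
            }
        }
        return results;
    }

    // No bulk write: each changed entity is its own network round trip,
    // which is the overhead I'm worried about at thousands of items.
    public static void PutMany(DataCache cache, IEnumerable<Product> products)
    {
        foreach (var p in products)
        {
            cache.Put("Product:" + p.Id, p);
        }
    }
}
```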

Is there prevailing thinking as to distributed cache object size? If we had more requests for all items, I would have concerns about network bandwidth, but, for now at least, demand for all items should be fairly minimal.

And yes, we're going to test and profile each method, but I'm wondering if there's anything outside the current scope of thinking to consider here.


So in our scenario, it appears that monolithic cache objects are going to be preferred. With big fat pipes in the datacenter, it takes virtually no perceptible time for ~30 MB of serialized product data to cross the wire. Using a Dictionary<TKey, TValue> we are able to quickly find products in the collection in order to return, or update, the individual item.
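
In code, the round trip looks roughly like this (the key name and the attribute change are purely illustrative):

```csharp
public void UpdateSingleProduct(DataCache cache, int productId)
{
    // Pull the monolith, touch one item locally, push the whole thing back.
    var catalog = (Dictionary<int, Product>)cache.Get("AllProducts");

    Product item;
    if (catalog.TryGetValue(productId, out item))
    {
        // Local, in-memory lookup and mutation -- effectively free
        // compared to the network hop.
        item.Attributes["Price"] = "19.99"; // illustrative change only
    }

    // The entire ~30 MB collection crosses the wire again on the way back.
    cache.Put("AllProducts", catalog);
}
```

One thing to watch with this pattern is that concurrent writers can overwrite each other's copy of the monolith, so some form of concurrency control (e.g. AppFabric's optimistic concurrency via item versions) probably belongs around the get/put pair.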

With thousands of individual entities, all well under 1 MB, in the cache, bulk operations simply take too long -- too much overhead and latency in the network operations.

Edit: we're now considering maintaining both the entities and the monolithic collection of entities, because with the monolith, it appears that retrieving individual entities becomes a fairly expensive process with a production dataset.
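
The dual-write we're considering would look roughly like this whenever a record changes (again with placeholder names; the two puts would either need coordinating or we accept a brief window of inconsistency between the two representations):

```csharp
public void UpdateProduct(DataCache cache, Product p)
{
    // Fine-grained entry: keeps individual lookups cheap.
    cache.Put("Product:" + p.Id, p);

    // Monolithic entry: keeps "give me everything" cheap.
    var catalog = (Dictionary<int, Product>)cache.Get("AllProducts");
    catalog[p.Id] = p;
    cache.Put("AllProducts", catalog);
}
```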
