
Is caching results by querying every 10 seconds a bad idea?

We have a situation where we are calling a function on an API that is rather expensive. Let's call it API.ExpensiveCall().

This ExpensiveCall() is called frequently within a web application. Although it is not noticeable with one user, it becomes noticeable when you have 5 or so simultaneous users.

We want to cache these results. But since API.ExpensiveCall() is essentially a black box to us, we have no way of invalidating our cache or knowing when to refresh.

So our proposed solution was to create a Windows service which would simply call API.ExpensiveCall() every ten seconds, then save the results into a local database that the web application is already using.

This way it would not matter whether there is 1 user or 20+ users on the website; ExpensiveCall() is only made once every 10 seconds. It is a very controlled load on the external system that API.ExpensiveCall() is connected to.
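Roughly, the service would do something like this (a sketch only; SaveToLocalDatabase() is a placeholder for the data-access code we already have):

using System;
using System.Threading;

// Sketch of the proposed Windows service: poll every 10 seconds and
// write the results into the local database the web application reads.
public class ExpensiveCallPoller
{
  private Timer timer;

  public void Start()
  {
    // Fire immediately, then every 10 seconds.
    timer = new Timer(Poll, null, TimeSpan.Zero, TimeSpan.FromSeconds(10));
  }

  private void Poll(object state)
  {
    var results = API.ExpensiveCall();
    SaveToLocalDatabase(results);   // the web app queries this table instead of the API
  }

  public void Stop()
  {
    timer.Dispose();
  }
}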

The problem

Our project manager does not agree with this. For some reason he thinks a timed refresh every 10 seconds is a bad idea because, in his opinion, it puts too much load on the external system.

But if we don't do anything about it and leave things the way they are, without any sort of caching, it not only degrades the performance of the web application, it also causes far more than one ExpensiveCall() per second on the external system, and that number multiplies with the number of users on the web application.

I would like to ask you: is this way of caching really such a bad idea? Have you heard of other systems using such a method for caching? And if it is such a bad idea, are there better alternatives for caching results from a system that is a black box to you?

EDIT:

Your responses seem to indicate that I should be using the timeout feature of ASP.NET's in-memory caching mechanism.

I like the timeout idea. The only (small) issue I see with it is that when the timeout expires and it is time to call ExpensiveCall(), it will be a blocking call, as opposed to querying a local table that is kept up to date by a separate process constantly refreshing it. That is what I find attractive about the polling idea, although I must admit it does feel weird to be polling every 10 seconds, which is why I'm on the fence about it.
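To make the comparison concrete, here is roughly what each approach looks like from the web application's side (a sketch only; MyData and GetFromLocalTable() are placeholders for our own types and data access):

// Option 1: timeout-based cache. The first request after expiration
// blocks on the expensive call; every other request reads the cache.
public IEnumerable<MyData> GetViaTimeoutCache()
{
  var data = HttpRuntime.Cache["MyData"] as IEnumerable<MyData>;
  if (data == null)
  {
    data = API.ExpensiveCall();   // blocking, potentially slow
    HttpRuntime.Cache.Insert("MyData", data, null,
      DateTime.Now.AddSeconds(10), Cache.NoSlidingExpiration);
  }
  return data;
}

// Option 2: polling. Requests only ever read the local table,
// which the separate service refreshes every 10 seconds.
public IEnumerable<MyData> GetViaLocalTable()
{
  return GetFromLocalTable();     // always fast, never calls the API
}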


Your project manager is probably right: When there are no requests for five minutes straight, you'll still be polling, causing 30 unnecessary calls to the expensive function.

The way these things are usually resolved goes like this: Whenever you need the data from the expensive call, you check the cache. If it's not there, then you call the expensive function and store the data in the cache, adding a timestamp. If it is, then you check how old the data is (hence the timestamp). If it's older than 10 seconds, call the expensive function anyway, and update your cache. Otherwise, use the cached data. ASP.NET's built-in cache API lets you do all this with fairly minimal effort - just set the cache expiration to 10 seconds, and you should be good.

This way, you'll get the best of both worlds - your data is never older than 10 seconds, but you still avoid the constant polling.
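In code, that check boils down to something like the following rough sketch (MyData and API.ExpensiveCall() stand in for your own types; here the timestamp is tracked by hand rather than letting the cache expire the entry):

private static IEnumerable<MyData> cachedData;
private static DateTime cachedAt = DateTime.MinValue;
private static readonly object sync = new object();

public IEnumerable<MyData> GetData()
{
  lock (sync)
  {
    // Refresh only when the cached copy is missing or older than 10 seconds.
    if (cachedData == null || DateTime.UtcNow - cachedAt > TimeSpan.FromSeconds(10))
    {
      cachedData = API.ExpensiveCall();
      cachedAt = DateTime.UtcNow;
    }
    return cachedData;
  }
}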


If "it is not noticeable with one user" then maybe you should let the first call to the method cache the results, then expire the cache every X seconds. That way only one call every X seconds takes the performance hit.

I'd agree that hitting an external service every 10 seconds sounds at best uncool and at worst like a denial-of-service attack (depending on the weight of the call and the capacity of the provider).


There's nothing fundamentally wrong with using a timespan (e.g. 10 seconds) to invalidate a cache.

If a user seeing data that is up to 10 seconds out of date is acceptable from a business perspective, it's a perfectly reasonable solution. You probably don't want to expire the cache until you get your first request after the 10 seconds have elapsed, given that the impact on a single user is not very significant; if the delay in processing that user's request is noticeable, it's fine to pre-expire the cache instead.


Take a look at my response to this question - it describes a way to ensure fresh data in a standard cache. Seems like it might directly address your situation as well.


It sounds like this could be a singleton operation (i.e. all requests will use the same result of the operation). If this is the case, it also sounds like something that could be cached in HttpRuntime.Cache.

// Needs System.Web / System.Web.Caching (HttpRuntime.Cache) and System.Threading (Mutex).
private static readonly Mutex mutex = new Mutex();

public IEnumerable<MyData> GetMyData()
{
  mutex.WaitOne();
  try
  {
    var cache = HttpRuntime.Cache;
    var data = cache["MyData"] as IEnumerable<MyData>;
    if (data == null)
    {
      // Only one thread reaches this point at a time, so the expensive
      // call is made once per expiration window, not once per request.
      data = API.ExpensiveCall();
      cache.Add("MyData", data, null,
        DateTime.Now.AddSeconds(60), Cache.NoSlidingExpiration,
        CacheItemPriority.High, null);
    }
    return data;
  }
  finally
  {
    mutex.ReleaseMutex();
  }
}

In this way, you call GetMyData(), which checks the application cache; if the data does not exist (or has expired and thus been removed), we make the expensive call, cache the result, and return the data. The mutex ensures that when the entry expires, only one request pays for the expensive call, while the others wait briefly and then read the freshly cached data.
