Efficient cache synchronization

2023-01-21 17:12 问答作者：

Consider this

public Object doGet() {
    return getResource();
}

private Object getResource() {
    synchronized (lock) {
        if (cachedResourceIsStale()) {
            downloadNewVersionOfResource();
        }
    }

    return resource;
}

Assuming that doGet will be executed concurrently, and a lot at that, and that downloading the new version of the resource takes a while, are there more efficient ways to do the synchronization in getResource? I know about read/write locks but I don't think they can be applied here.

Why synchronize at all? If the cache goes stal开发者_开发知识库e, all threads accessing the resource while it's still being refreshed by the first one, will execute their own refresh. Among other problems this causes, it's hardly efficient.

As BalusC mentions in the comments, I'm currently facing this issue in a servlet, but I'm happy with generic answers because who knows in what situation I'll run into it again.

Assumptions

efficient means that doGet() should complete as quickly as possible
cachedPageIsStale() takes no time at all
downloadNewVersionOfResource() takes a little time

Answer

Sychronizing reduces network load, because only one thread fetches the resource when it expires. Also, it will not unduly delay processing of other threads - as the VM contains no current snapshot the threads could return, they'll have to block, and there is no reason an additional concurrent downloadNewVersionOfResource() will complete any faster (I'd expect the opposite due to network bandwith contention).

So synchronizing is good, and optimal in bandwith consumption and response times. (The CPU overhead of synchronization is vanishingly small compared to I/O waits) - assuming that a current version of the resource might not be available when doGet() is invoked; if your server always had a current version of the resource, it could send it back without delay. (You might have a background thread download the new version just before the old one expires.)

You haven't shown any error handling. You'll have to decide whether to propagate exceptions thrown by downloadNewVersionOfResource() to your callers or continue to serve the old version of the resource.

Edit

So? Let's assume you have 100 connection workers, and the check whether the resource is stale takes one microsecond, the resource is not stale, and serving it takes one second. Then, on average, 100 * 10^-6 / 1 = 0.0001 threads are trying to get the lock. Barely any contention at all. And the overhead of acquiring an untaken lock is on the order of 10^-8 seconds. There is no point optimizing things that already take microsends, when the network will cause delays of milliseonds. And if you don't believe me, do a microbenchmark for synchronization. It is true that frequent, needless sychronization adds significant overhead, and that synchronizing collection classes were deprecated for that reason. But that's because these methods do very little work per invocation, and the relative overhead of sychronization was a lot bigger. I just did a small microbenchmark for the following code:

synchronized (lock) {
    c++;
}

On my notebook, this takes 50 nanoseconds (5*10^-8 seconds) averaged over 10 million executions in sun's hotspot vm. This is about 20 times as long as the naked increment operation, so if one does a lot of increments, sychronizing each of those will slow the program an order of magnitude. If however, that method did blocking I/O, waiting, say, 1 ms, adding those same 50 nanoseconds would reduce throughput by 0.005%. Surely you have better opportunities for performance tuning :-)

This is why, you should always measure before starting to optimize. It prevents you from investing hours of your time to save a couple nano seconds processor time.

There might be possibility that you can reduce lock contention (thus improving through-put) by using "lock-striping" -- essentially, split one lock into several ones, each lock protecting particular group of users.
The tricky part is how to figure out how to assign users to groups. The simplest case is when you can assign request from any user to any group. If your data model requires that requests from one user must be processed sequentially, you must introduce some mapping between user requests and groups. Here's sample implementation of StripedLock:

import java.util.concurrent.locks.ReentrantLock;

/**
 * Striped locks holder, contains array of {@link java.util.concurrent.locks.ReentrantLock}, on which lock/unlock
 * operations are performed. Purpose of this is to decrease lock contention.
 * <p>When client requests lock, it gives an integer argument, from which target lock is derived as follows:
 * index of lock in array equals to <code>id & (locks.length - 1)</code>.
 * Since <code>locks.length</code> is the power of 2, <code>locks.length - 1</code> is string of '1' bits,
 * and this means that all lower bits of argument are taken into account.
 * <p>Number of locks it can hold is bounded: it can be from set {2, 4, 8, 16, 32, 64}.
  */
public class StripedLock {
    private final ReentrantLock[] locks;

    /**
     * Default ctor, creates 16 locks
     */
    public StripedLock() {
        this(4);
    }

    /**
     * Creates array of locks, size of array may be any from set {2, 4, 8, 16, 32, 64} 
     * @param storagePower size of array will be equal to <code>Math.pow(2, storagePower)</code>
     */
    public StripedLock(int storagePower) {
        if (storagePower < 1 || storagePower > 6)
             throw new IllegalArgumentException("storage power must be in [1..6]");
        int lockSize = (int) Math.pow(2, storagePower);
        locks = new ReentrantLock[lockSize];
        for (int i = 0; i < locks.length; i++)
            locks[i] = new ReentrantLock();
    }

    /**
     * Locks lock associated with given id.
     * @param id value, from which lock is derived
     */
    public void lock(int id) {
        getLock(id).lock();
    }

    /**
     * Unlocks lock associated with given id.
     * @param id value, from which lock is derived 
     */
    public void unlock(int id) {
        getLock(id).unlock();
    }

    /**
     * Map function between integer and lock from locks array
     * @param id argument
     * @return lock which is result of function 
     */
    private ReentrantLock getLock(int id) {
        return locks[id & (locks.length - 1)];
    }
}

继续阅读：locking multithreading servlets synchronization

Efficient cache synchronization

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？