
Keep 'obvious' lock-retrieve or employ double-checked locking?

I suck at formulating questions. I have the following piece of (Java) code (pseudo):

public SomeObject getObject(Identifier someIdentifier) {
    // getUniqueIdentifier retrieves a singleton instance of the identifier object,
    // to prevent two Identifiers that are equals() but not == (reference equals) in the system.
    Identifier singletonInstance = getUniqueIdentifier(someIdentifier);
    synchronized (singletonInstance) {
        SomeObject cached = cache.get(singletonInstance);
        if (cached != null) {
            return cached;
        } else {
            SomeObject newInstance = createSomeObject(singletonInstance);
            cache.put(singletonInstance, newInstance);
            return newInstance;
        }
    }
}

Basically, it makes an identifier 'unique' (reference-equal, as in ==), checks a cache, and on a cache miss calls an expensive method (which fetches an external resource, parses it, etc.), puts the result in the cache, and returns it. Synchronizing on the canonical Identifier prevents two equals()-but-not-== Identifier objects from both calling the expensive method and retrieving the same resource simultaneously.

The above works. I'm just wondering, and probably micro-optimizing: would a rewrite such as the following, which employs a more naïve cache retrieval and double-checked locking, be 'safe' (as in thread-safe, free of odd race conditions) and 'more optimal' (as in reducing unneeded locking and threads waiting on a lock)?

public SomeObject getObject(Identifier someIdentifier) {

    // just check the cache, reference equality is not relevant just yet.
    SomeObject cached = cache.get(someIdentifier);
    if (cached != null) {
        return cached;
    }        

    Identifier singletonInstance = getUniqueIdentifier(someIdentifier);
    synchronized (singletonInstance) {
        // re-check the cache here, in case of a context switch in between the 
        // cache check and the opening of the synchronized block.
        SomeObject cached = cache.get(singletonInstance);
        if (cached != null) {
            return cached;
        } else {
            SomeObject newInstance = createSomeObject(singletonInstance);
            cache.put(singletonInstance, newInstance);
            return newInstance;
        }
    }
}

You could say 'Just test it' or 'Just do a micro-benchmark', but testing multi-threaded bits of code isn't my strong point, and I doubt I'd be able to simulate realistic situations or accurately fake race conditions. Plus it'd take me half a day, whereas writing a SO question only takes me a few minutes :).


You are reinventing Google-Collections/Guava's MapMaker/ComputingMap:

ConcurrentMap<Identifier, SomeObject> cache = new MapMaker().makeComputingMap(new Function<Identifier, SomeObject>() {
  public SomeObject apply(Identifier from) {
    return createSomeObject(from);
  }
});

public SomeObject getObject(Identifier someIdentifier) {
  return cache.get(someIdentifier);
}

Interning is not necessary here: the ComputingMap guarantees that only a single thread will attempt to populate an absent key, and any other thread asking for the same item blocks and waits for that result. If you remove a key while it is being populated, the populating thread and any threads currently waiting still get that result, but subsequent requests will trigger the population again.
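For reference, on Java 8 and later the same populate-once-per-key behaviour is available from a plain ConcurrentHashMap via computeIfAbsent, with no extra library. The sketch below substitutes String for the question's Identifier/SomeObject types and fakes the expensive createSomeObject call; those names are stand-ins, not the asker's real code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class Cache {
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

    // Stand-in for the question's expensive createSomeObject call.
    private String createSomeObject(String id) {
        return "value-for-" + id;
    }

    public String getObject(String id) {
        // computeIfAbsent runs the mapping function at most once per key;
        // concurrent callers for the same key block until the value is ready.
        return cache.computeIfAbsent(id, this::createSomeObject);
    }

    public static void main(String[] args) {
        Cache c = new Cache();
        String a = c.getObject("x");
        String b = c.getObject("x");
        // The second call returns the same cached instance.
        System.out.println(a == b); // prints "true"
    }
}
```

Note that, unlike the question's scheme, no interned Identifier is needed: computeIfAbsent keys on equals()/hashCode(), so two equal-but-distinct keys still map to one computation.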

If you do need interning, that library provides the excellent Interner class that has both strongly and weakly referenced caching.
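To make concrete what an interner does, here is a minimal sketch of the strongly-referenced case, built on ConcurrentMap.putIfAbsent. Guava's Interners class provides this plus a weakly-referenced variant that lets unused canonical instances be garbage-collected; the class and method names below are illustrative, not Guava's implementation:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Minimal strong interner: maps each equals()-class of values
// to one canonical instance that is safe to lock on.
public class SimpleInterner<T> {
    private final ConcurrentMap<T, T> pool = new ConcurrentHashMap<>();

    public T intern(T sample) {
        // putIfAbsent returns the previously stored canonical instance, if any.
        T canonical = pool.putIfAbsent(sample, sample);
        return canonical != null ? canonical : sample;
    }

    public static void main(String[] args) {
        SimpleInterner<String> interner = new SimpleInterner<>();
        String a = interner.intern(new String("id-42"));
        String b = interner.intern(new String("id-42"));
        // Two equals() copies collapse to one reference.
        System.out.println(a == b); // prints "true"
    }
}
```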


A synchronized block takes up to about 2 microseconds. Unless you need to cut this further, you may be better off with the simplest solution.

BTW You can write

SomeObject cached = cache.get(singletonInstance);
if (cached == null) {
    cache.put(singletonInstance, cached = createSomeObject(singletonInstance));
}
return cached;


If "cache" is a map (which I suspect it is), then this problem is quite different than a simple double-checked locking problem.

If cache is a plain HashMap, then the problem is actually much worse: your proposed "double-checked pattern" behaves far worse than classic reference-based double-checked locking. Reading a HashMap while another thread is modifying it can lead to ConcurrentModificationExceptions, incorrect values, or even an infinite loop.

If it is based on a plain HashMap, I would suggest using a ConcurrentHashMap as the first approach. With a ConcurrentHashMap, there is no explicit locking needed on your part.

public SomeObject getObject(Identifier someIdentifier) {
    // cache is a ConcurrentHashMap

    // just check the cache, reference equality is not relevant just yet.
    SomeObject cached = cache.get(someIdentifier);
    if (cached != null) {
        return cached;
    }        

    Identifier singletonInstance = getUniqueIdentifier(someIdentifier);
    SomeObject newInstance = createSomeObject(singletonInstance);
    SomeObject old = cache.putIfAbsent(singletonInstance, newInstance);
    if (old != null) {
        newInstance = old;
    }
    return newInstance;
}
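One caveat with this version: two threads can race past the null check and both run the expensive createSomeObject; putIfAbsent then keeps only the first result and the loser's instance is discarded. That is fine if creation is idempotent and cheap enough to duplicate occasionally. A small demo of the keep-the-first semantics (plain strings stand in for SomeObject):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PutIfAbsentDemo {
    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

        // First writer wins: putIfAbsent stores the value and returns null.
        String old1 = cache.putIfAbsent("k", "first");
        // A later (or concurrent) writer gets the existing value back
        // and must use that instead of its own instance.
        String old2 = cache.putIfAbsent("k", "second");

        System.out.println(old1);           // prints "null"
        System.out.println(old2);           // prints "first"
        System.out.println(cache.get("k")); // prints "first"
    }
}
```

If the duplicate computation is unacceptable, computeIfAbsent (Java 8+) computes at most once per key instead.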