Java: Caching of non-volatile variables by different threads
The situation is the following:
- I have an object with lots of setters and getters.
- An instance of this object is created in one particular thread, where all its values are set. I first create an "empty" object with new, and only then call some setter methods based on some complicated legacy logic.
- Only after that does the object become available to all other threads, which use only the getters.
The question: Do I have to make all variables of this class volatile or not?
Concerns:
- Creating a new instance of the object and setting all its values are separated in time.
- But the other threads have no idea about this new instance until all values are set, so they should not have a cached copy of a partially initialized object. Is that right?
Note: I am aware of the builder pattern, but I cannot apply it here for several other reasons :(
EDITED: As I feel the two answers from Mathias and axtavt do not quite agree, I would like to add an example:
Let's say we have a Foo class:
class Foo {
    public int x = 0;
}
and two threads are using it as described above:
// Thread 1 initializes the value:
Foo f = new Foo();
f.x = 5;
values.add(f); // Publication via a thread-safe collection such as Vector, Collections.synchronizedList(new ArrayList(...)), or ConcurrentHashMap?
// Thread 2
if (values.size() > 0) {
    System.out.println(values.get(0).x); // always 5?
}
As I understood Mathias, it can print out 0 on some JVM according to JLS. As I understood axtavt it will always print 5.
What is your opinion?
-- Regards, Dmitriy
In this case you need to use safe publication idioms when making your object available to other threads, namely (from Java Concurrency in Practice):
- Initializing an object reference from a static initializer;
- Storing a reference to it into a volatile field or AtomicReference;
- Storing a reference to it into a final field of a properly constructed object; or
- Storing a reference to it into a field that is properly guarded by a lock.
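Two of these idioms can be sketched as follows. This is only a minimal illustration, not code from the book: Settings, Publisher, and all method names are hypothetical, with Settings standing in for the setter-heavy object from the question.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the legacy object: a plain field, neither volatile nor final.
class Settings {
    private int x;
    void setX(int x) { this.x = x; }
    int getX() { return x; }
}

// Hypothetical publisher showing two of the idioms listed above.
class Publisher {
    // Idiom: reference stored in an AtomicReference (a volatile field behaves the same way).
    private final AtomicReference<Settings> atomicRef = new AtomicReference<>();

    // Idiom: reference stored in a field guarded by a lock.
    private final Object lock = new Object();
    private Settings guarded;

    void publish(int value) {
        Settings s = new Settings();
        s.setX(value);            // run every setter BEFORE publishing
        atomicRef.set(s);         // safe publication through the AtomicReference
        synchronized (lock) {     // safe publication through the lock
            guarded = s;
        }
    }

    Settings viaAtomic() {
        return atomicRef.get();
    }

    Settings viaLock() {
        synchronized (lock) {     // readers must take the same lock as the writer
            return guarded;
        }
    }

    // Publishes from the calling thread and reads both references from a second thread.
    static int[] demo(int value) {
        Publisher p = new Publisher();
        int[] seen = new int[2];
        Thread reader = new Thread(() -> {
            Settings a;
            while ((a = p.viaAtomic()) == null) {
                Thread.yield();   // wait until the object has been published
            }
            seen[0] = a.getX();
            Settings g;
            while ((g = p.viaLock()) == null) {
                Thread.yield();
            }
            seen[1] = g.getX();
        });
        reader.start();
        p.publish(value);
        try {
            reader.join();        // join() makes the reader's writes to seen visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        int[] seen = demo(5);
        System.out.println(seen[0] + " " + seen[1]); // prints "5 5"
    }
}
```

Note that the fields of Settings stay non-volatile; only the reference used for the hand-off is synchronized.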
If you use safe publication, you don't need to declare the fields volatile.
However, if you don't use it, declaring the fields volatile (theoretically) won't help, because the memory barriers incurred by volatile are one-sided: a volatile write can be reordered with non-volatile actions after it.
So, volatile ensures correctness in the following case:
class Foo {
    public int x;
}
volatile Foo foo;
// Thread 1
Foo f = new Foo();
f.x = 42;
foo = f; // Safe publication via volatile reference
// Thread 2
if (foo != null)
    System.out.println(foo.x); // Guaranteed to see 42
but does not work in this case:
class Foo {
    public volatile int x;
}
Foo foo;
// Thread 1
Foo f = new Foo();
// Volatile doesn't prevent reordering of the following actions!!!
f.x = 42;
foo = f;
// Thread 2
if (foo != null)
    System.out.println(foo.x); // NOT guaranteed to see 42,
                               // since f.x = 42 can happen after foo = f
From a theoretical point of view, the first sample has a transitive happens-before relationship:
f.x = 42 happens-before foo = f, which happens-before the read of foo.x
In the second example, f.x = 42 and the read of foo.x are not linked by a happens-before relationship, so they can be executed in any order.
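The first sample can be turned into a small runnable program along these lines. This is a sketch under stated assumptions: the class name VolatilePublication, the method publishAndRead, and the spin-wait loop are additions made for illustration and are not part of the original samples.

```java
// Hypothetical runnable version of the first sample: a plain field
// published through a volatile reference.
class VolatilePublication {
    static class Foo {
        int x;                        // plain field, as in the first sample
    }

    static volatile Foo foo;          // the volatile reference used for publication

    static int publishAndRead() {
        foo = null;                   // reset so the method can be called repeatedly
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            Foo local;
            while ((local = foo) == null) {   // volatile read of foo
                Thread.yield();               // not published yet
            }
            seen[0] = local.x;        // ordered after the volatile read that saw f
        });
        reader.start();

        Foo f = new Foo();
        f.x = 42;                     // (1) plain write
        foo = f;                      // (2) volatile write, cannot move before (1)

        try {
            reader.join();            // makes the reader's write to seen visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 42
    }
}
```

The reader copies foo into a local variable once, so the null check and the field read use the same reference.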
You do not need to declare your field volatile if its value is set before the start method is called on the threads that read the field.
The reason is that in that case the setting is in a happens-before relation (as defined in the Java Language Specification) with the read in the other thread.
The relevant rules from the JLS are:
- Each action in a thread happens-before every action in that thread that comes later in the program's order
- A call to start on a thread happens-before any action in the started thread.
However, if you start the other threads before setting the field, then you must declare the field volatile. The JLS does not allow you to assume that the thread will not cache the value before it reads it for the first time, even if that may be the case on a particular version of the JVM.
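The start rule can be illustrated with a short sketch. The class name StartPublication and the method readInStartedThread are hypothetical; Foo mirrors the class from the question.

```java
// Hypothetical demo: a plain field written before Thread.start()
// is visible inside the started thread.
class StartPublication {
    static class Foo {
        int x;                        // not volatile
    }

    static int readInStartedThread() {
        Foo f = new Foo();
        f.x = 5;                      // written BEFORE start() is called

        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = f.x);
        reader.start();               // start() happens-before every action in reader

        try {
            reader.join();            // makes the reader's write to seen visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(readInStartedThread()); // prints 5
    }
}
```

If f.x were assigned after reader.start(), this guarantee would no longer apply and the reader could observe 0.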
In order to fully understand what's going on I have been reading about the Java Memory Model (JMM). A useful introduction to the JMM can be found in Java Concurrency in Practice.
I think the answer to the question is: no, in the example given, making the members of the object volatile is NOT necessary. However, this implementation is rather brittle, as the guarantee depends on the exact ORDER in which things are done and on the thread safety of the container. A builder pattern would be a much better option.
Why is it guaranteed:
- Thread 1 does all the assignments BEFORE putting the value into the thread-safe container.
- The add method of the thread-safe container must use some synchronization construct such as a volatile read/write, a lock, or synchronized. This guarantees two things:
  - Instructions in thread 1 that come before the synchronization will actually be executed before it. That is, the JVM is not allowed to reorder them past the synchronization instruction for optimization purposes. This is called the happens-before guarantee.
  - All writes that happen before the synchronization in thread 1 will afterwards be visible to other threads that subsequently read the object through the container.
- The objects are NEVER modified after publication.
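The scenario from the question can be written as a runnable sketch. It assumes Collections.synchronizedList as the thread-safe container; the class name ContainerPublication, the method publishAndRead, and the waiting loop are hypothetical additions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo of the question's example: publication through a
// synchronized list, with a non-volatile field in Foo.
class ContainerPublication {
    static class Foo {
        public int x = 0;             // same shape as Foo in the question
    }

    static int publishAndRead() {
        List<Foo> values = Collections.synchronizedList(new ArrayList<>());
        int[] seen = new int[1];

        // Thread 2: waits until something has been published, then reads it.
        Thread reader = new Thread(() -> {
            while (values.isEmpty()) { // synchronized call on the wrapper
                Thread.yield();
            }
            seen[0] = values.get(0).x;
        });
        reader.start();

        // Thread 1 (here: the calling thread): initialize first, publish last.
        Foo f = new Foo();
        f.x = 5;
        values.add(f);                // add() locks the same monitor the reader uses

        try {
            reader.join();            // makes the reader's write to seen visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 5
    }
}
```

Swapping the synchronized list for a plain ArrayList would remove the lock that links the two threads, and with it the guarantee.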
However, if the container were not thread-safe, or the order of operations were changed by somebody not aware of the pattern, or the objects were accidentally changed after publication, then there would be no guarantees anymore. So following the builder pattern, for which code can be generated by Google AutoValue or FreeBuilder, is much safer.
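For comparison, a hand-written builder might look roughly like this. It is only a sketch and does not reproduce the code that AutoValue or FreeBuilder generate; ImmutableFoo, its name field, and the builder methods are hypothetical.

```java
// Hypothetical immutable value class with a hand-written builder.
final class ImmutableFoo {
    private final int x;              // final fields are safely visible to any thread
    private final String name;        // once the constructor has finished

    private ImmutableFoo(Builder b) {
        this.x = b.x;
        this.name = b.name;
    }

    int x() { return x; }
    String name() { return name; }

    static Builder builder() {
        return new Builder();
    }

    // The mutable builder is meant to stay confined to the thread that
    // assembles the object; only the immutable result is shared.
    static final class Builder {
        private int x;
        private String name = "";

        Builder x(int x) { this.x = x; return this; }
        Builder name(String name) { this.name = name; return this; }

        ImmutableFoo build() {
            return new ImmutableFoo(this);
        }
    }

    public static void main(String[] args) {
        ImmutableFoo foo = ImmutableFoo.builder().x(5).name("demo").build();
        System.out.println(foo.x() + " " + foo.name()); // prints "5 demo"
    }
}
```

Because the object has no setters, the "never modified after publication" rule is enforced by the compiler rather than by convention.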
This article on the topic is also quite good: http://tutorials.jenkov.com/java-concurrency/volatile.html