
Thread Safety General Rules

A few questions about thread safety that I think I understand, but would like clarification on, if you could be so kind. The specific languages I program in are C++, C#, and Java. Please keep these in mind when describing language-specific keywords and features.

1) Cases of 1 writer, n readers. In cases such as n threads reading a variable, such as in a polled loop, and 1 writer updating this variable, is explicit locking required?

Consider:

// thread 1.
volatile bool bWorking = true;

void stopWork() { bWorking = false; }

// thread n
while (bWorking) {...}

Here, should it be enough to just have a memory barrier, and accomplish this with volatile? Since as I understand, in my above mentioned languages, simple reads and writes to primitives will not be interleaved so explicit locking is not required, however memory consistency cannot be guaranteed without some explicit lock, or volatile. Are my assumptions correct here?

2) Assuming my assumption above is correct, then it is only correct for simple reads and writes. That is, bWorking = x... and x = bWorking; are the ONLY safe operations? I.e., complex assignments such as unary operators (++, --) are unsafe here, as are +=, *=, etc.?

3) I assume if case 1 is correct, then it is not safe to expand that statement to also be safe for n writers and n readers when only assignment and reading is involved?


For Java:

1) A volatile variable is read from and written to "main memory" on every access, which means that a change made by the updater thread will be seen by all reading threads on their next read. Also, updates are atomic (independent of variable type).

2) Yes, combined operations like ++ are not thread safe if you have multiple writers. For a single writing thread, there is no problem. (The volatile keyword makes sure that the update is seen by the other threads.)

3) As long as you only assign and read, volatile is enough - but if you have multiple writers, you can't be sure which value is the "final" one, or which will be read by which thread. Even the writing threads themselves can't reliably know that their own value is set. (If you only have boolean and will only set from true to false, there is no problem here.)

If you want more control, have a look at the classes in the java.util.concurrent.atomic package.


Do the locking. You are going to need locking anyway if you are writing multi-threaded code. C# and Java make it fairly simple. C++ is a little more complex, but you should be able to use Boost or make your own RAII classes. Given that you are going to be locking all over the place, don't try to find a few places where you might be able to avoid it. All will work fine until you run the code on a 64-way processor using new Intel microcode on a Tuesday in March on some mission-critical customer system. Then bang.
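As a rough sketch of what "do the locking" with RAII can look like in C++, here is a counter guarded by a mutex. It assumes a C++11 compiler and uses std::mutex and std::lock_guard from the standard library rather than Boost; the names LockedCounter and runLockedCounter are made up for illustration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared counter; every access goes through the mutex.
class LockedCounter {
public:
    void increment() {
        // RAII: the guard locks here and unlocks when it goes out of scope,
        // even if an exception is thrown.
        std::lock_guard<std::mutex> guard(mutex_);
        ++value_;
    }

    int get() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_;
    int value_ = 0;
};

// Starts `threads` workers that each increment the counter `perThread` times,
// waits for all of them, and returns the final value.
int runLockedCounter(int threads, int perThread) {
    LockedCounter counter;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, perThread] {
            for (int i = 0; i < perThread; ++i) {
                counter.increment();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.get();
}
```

Because both the read and the write take the same lock, no increment is lost regardless of how the threads are scheduled.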

People think that locks are expensive; they really aren't. The kernel devs spend a lot of time optimizing them, and compared to one disk read they are utterly trivial; yet nobody ever seems to expend this much effort analyzing every last disk read.

Add the usual statements about the evils of performance tuning, wise sayings from Knuth, Spolsky, etc.


For C++

1) This is tempting to try, and will usually work. However, a few things to keep in mind:

You're doing it with a boolean, so that seems safest. Other POD types might not be so safe. E.g. it may take two instructions to set a 64-bit double on a 32-bit machine. So that would clearly not be thread safe.

If the boolean is the only thing you care about the threads sharing, this could work. If you're using it as a variant of the Double-Checked Lock Paradigm, you run into all the pitfalls therein. Consider:

std::string failure_message;  // shared across threads

// some thread triggers the stop, and also reports why
failure_message = "File not found";
stopWork();

// all the other threads
while (bWorking) {...}
log << "Stopped work:  " << failure_message;

This looks ok at first, because failure_message is set before bWorking is set to false. However, that may not be the case in practice. The compiler can rearrange the statements, and set bWorking first, resulting in thread unsafe access of failure_message. Even if the compiler doesn't, the hardware might. Multi-core cpus have their own caches, and thus things aren't quite so simple.

If it's just a boolean, it's probably ok. If it's more than that, it might have issues once in a while. How important is the code you're writing, and can you take that risk?
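One way to close the failure_message hole described above, assuming a C++11 compiler, is to make the flag a std::atomic<bool> and pair a release store with an acquire load. This is only a sketch: stopWork taking a reason argument, waitForStop, and runStopExample are assumptions added for illustration and are not part of the original question.

```cpp
#include <atomic>
#include <string>
#include <thread>

std::string failure_message;        // plain shared data, published via the flag
std::atomic<bool> bWorking{true};   // the flag itself is atomic

// Writer side: set the message first, then publish it with a release store.
// The release ordering forbids moving the message write after the flag write.
void stopWork(const std::string& reason) {
    failure_message = reason;
    bWorking.store(false, std::memory_order_release);
}

// Reader side: spin until the flag flips. The acquire load pairs with the
// release store, so the message written before it is visible afterwards.
std::string waitForStop() {
    while (bWorking.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return failure_message;
}

// Small driver: one reader thread waits while the calling thread stops work.
std::string runStopExample() {
    bWorking.store(true);
    std::string seen;
    std::thread reader([&seen] { seen = waitForStop(); });
    stopWork("File not found");
    reader.join();
    return seen;
}
```

This only works because there is a single writer that sets the message once before flipping the flag; with several writers touching failure_message you are back to needing a lock.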

2) Correct, ++/--, +=, other operators will take multiple cpu instructions and will be thread unsafe. Depending on your platform and compiler, you may be able to write non-portable code to do atomic increments.

3) Correct, this would be unsafe in a general case. You can kinda squeak by when you have one thread, writing a single boolean once. As soon as you introduce multiple writes, you'd better have some real thread synchronization.
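For point 2, if you can use C++11, std::atomic gives you a portable atomic increment instead of compiler-specific intrinsics. A minimal sketch, with countWithAtomic being a made-up helper name:

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Several threads bump one counter. fetch_add is a single atomic
// read-modify-write, so no increment is lost, unlike ++ on a plain int.
int countWithAtomic(int threads, int perThread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, perThread] {
            for (int i = 0; i < perThread; ++i) {
                // Relaxed ordering is enough here: only the count matters,
                // and join() below synchronizes before the final read.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load();
}
```

With a plain int in place of the std::atomic<int>, the same loop is a data race and would typically return less than threads * perThread.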

Note about cpu instructions

If an operation takes multiple instructions, your thread could be preempted between them -- and the operation would be partially complete. This is clearly bad for thread safety, and this is one reason why ++, +=, etc are not thread safe.

However, even if an operation takes a single instruction, that does not necessarily mean that it's thread safe. With multi-core and multi-cpu you have to worry about the visibility of a change -- when is the cpu cache flushed to main memory.

So while multiple instructions do imply not thread safe, it's false to assume that a single instruction implies thread safe.


With a 1-byte bool, you might be able to get away without using locking, but since you cannot guarantee the internals of the processor it'd still be a bad idea. Certainly with anything beyond 1 byte, such as an integer, you couldn't. One processor could be updating it while another was reading it on another thread, and you could get inconsistent results. In C# I would use a lock { } statement around the access (read or write) to bWorking. If it was something more complex, for example IO access to a large memory buffer, I'd use ReaderWriterLock or some variant of it. In C++ volatile won't help much, because it only prevents certain kinds of optimizations, such as keeping the variable in a register, which would otherwise cause problems in multithreaded code. You still need to use a locking construct.
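For the C++ side, a rough counterpart to the reader/writer lock mentioned above is std::shared_mutex. This sketch assumes a C++17 compiler, and SharedBuffer is a hypothetical class invented for the example:

```cpp
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical buffer with many readers and occasional writers.
class SharedBuffer {
public:
    void write(const std::string& data) {
        // Exclusive lock: blocks all readers and other writers.
        std::unique_lock<std::shared_mutex> lock(mutex_);
        data_ = data;
    }

    std::string read() const {
        // Shared lock: any number of readers may hold it at once.
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return data_;
    }

private:
    mutable std::shared_mutex mutex_;
    std::string data_;
};
```

Readers get a copy of the data while holding the shared lock, so they never observe a half-written buffer. Whether this beats a plain mutex depends on how read-heavy the workload is.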

So in summary I would never read and write anything in a multithreaded program without locking it somehow.


  1. Updating a bool is going to be atomic on any sensible extant system. However, once your writer has written, there's no telling how long before your reader will read, especially once you take into account multiple cores, caches, scheduler oddities, and so on.

  2. Part of the problem with increments and decrements (++, --) and compound assignments (+=, *=) is that they are misleading. They imply something is happening atomically that is actually happening in several operations. But even simple assignments can be unsafe once you have stepped away from the purity of boolean variables. Guaranteeing that a write as simple as x=foo is atomic is up to the details of your platform.

  3. I assume by thread safe, you mean that readers will always see a consistent object no matter what the writers do. In your example this will always be the case, since booleans can only evaluate to two values, both valid, and the value only transitions once from true to false. Thread safety is going to be more difficult in a more complicated scenario.
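On point 2, in C++11 and later you can hand the "is x = foo atomic on my platform?" question to the library by wrapping the value in std::atomic. A small sketch, assuming C++17 for is_always_lock_free; sharedValue, publish, and observe are names made up for illustration:

```cpp
#include <atomic>

// Hypothetical shared value that is wider than a bool.
std::atomic<double> sharedValue{0.0};

// Compile-time answer to whether the platform can do this without an
// internal lock. If false, std::atomic still works but may lock internally.
constexpr bool kDoubleIsLockFree = std::atomic<double>::is_always_lock_free;

// A store through std::atomic can never be observed half-written,
// even where a plain 64-bit write takes two instructions.
void publish(double v) { sharedValue.store(v); }

// A load returns some value that was actually stored, never a mix of two.
double observe() { return sharedValue.load(); }
```

This covers atomicity of a single read or write only; a compound update such as sharedValue = sharedValue + 1 written as a separate load and store is still a race.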
