Thread-safe asynchronous code in C#
I asked the question below a couple of weeks ago. Now, reviewing my question and all the answers, a very important detail caught my eye: in my second code example, isn't DoTheCodeThatNeedsToRunAsynchronously() executed in the main (UI) thread? Doesn't the timer just wait a second and then post an event to the main thread? That would mean the code that needs to run asynchronously isn't run asynchronously at all?!
Original question:
I have recently faced a problem multiple times and solved it in different ways, always being uncertain on whether it is thread safe or not: I need to execute a piece of C# code asynchronously. (Edit: I forgot to mention I'm using .NET 3.5!)
That piece of code works on an object that is provided by the main thread code. (Edit: Let's assume that object is thread-safe in itself.) I'll present you two ways I tried (simplified) and have these four questions:
- What is the best way to achieve what I want? Is it one of the two or another approach?
- Is one of the two ways not thread-safe (I fear both...) and why?
- The first approach creates a thread and passes it the object in the constructor. Is that how I'm supposed to pass the object?
- The second approach uses a timer which doesn't provide that possibility, so I just use the local variable in the anonymous delegate. Is that safe or is it possible in theory that the reference in the variable changes before it is evaluated by the delegate code? (This is a very generic question whenever one uses anonymous delegates). In Java you are forced to declare the local variable as final (i.e. it cannot be changed once assigned). In C# there is no such possibility, is there?
Approach 1: Thread
new Thread(new ParameterizedThreadStart(
delegate(object parameter)
{
Thread.Sleep(1000); // wait a second (for a specific reason)
MyObject myObject = (MyObject)parameter;
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
})).Start(this.MyObject);
There is one problem I had with this approach: my main thread might crash, but the process still persists in memory because of the zombie thread.
Approach 2: Timer
MyObject myObject = this.MyObject;
System.Timers.Timer timer = new System.Timers.Timer();
timer.Interval = 1000;
timer.AutoReset = false; // i.e. only run the timer once.
timer.Elapsed += new System.Timers.ElapsedEventHandler(
delegate(object sender, System.Timers.ElapsedEventArgs e)
{
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
});
DoSomeStuff();
myObject = that.MyObject; // hypothetical second assignment.
The local variable myObject is what I'm talking about in question 4. I've added a second assignment as an example. Imagine the timer elapses after the second assignment: will the delegate code operate on this.MyObject or that.MyObject?
Whether either of these pieces of code is safe depends on the structure of MyObject instances. In both cases you are sharing the myObject variable between the foreground and background threads. There is nothing stopping the foreground thread from modifying myObject while the background thread is running.
This may or may not be safe, depending on the structure of MyObject. However, if you haven't specifically planned for it, then it's almost certainly an unsafe operation.
I recommend using Task objects, and restructuring the code so that the background task returns its calculated value rather than changing some shared state.
I have a blog entry that discusses five different approaches to background tasks (Task, BackgroundWorker, Delegate.BeginInvoke, ThreadPool.QueueUserWorkItem, and Thread), with the pros and cons of each.
To answer your questions specifically:
- What is the best way to achieve what I want? Is it one of the two or another approach? The best solution is to use the Task object instead of a specific Thread or timer callback. See my blog post for all the reasons why, but in summary: Task supports returning a result, callbacks on completion, proper error handling, and integration with the universal cancellation system in .NET.
- Is one of the two ways not thread-safe (I fear both...) and why? As others have stated, this totally depends on whether MyObject.ChangeSomeProperty is threadsafe. When dealing with asynchronous systems, it's easier to reason about threadsafety when each asynchronous operation returns a result rather than changing shared state.
- The first approach creates a thread and passes it the object in the constructor. Is that how I'm supposed to pass the object? Personally, I prefer using lambda binding, which is more type-safe (no casting necessary).
- The second approach uses a timer which doesn't provide that possibility, so I just use the local variable in the anonymous delegate. Is that safe or is it possible in theory that the reference in the variable changes before it is evaluated by the delegate code? Lambdas (and delegate expressions) bind to variables, not to values, so the answer is yes: the reference may change before it is used by the delegate. If the reference may change, then the usual solution is to create a separate local variable that is only used by the lambda expression, like this:
MyObject myObject = this.MyObject;
...
timer.AutoReset = false; // i.e. only run the timer once.
var localMyObject = myObject; // copy for lambda
timer.Elapsed += new System.Timers.ElapsedEventHandler(
delegate(object sender, System.Timers.ElapsedEventArgs e)
{
DoTheCodeThatNeedsToRunAsynchronously();
localMyObject.ChangeSomeProperty();
});
// Now myObject can change without affecting timer.Elapsed
Tools like ReSharper try to detect whether local variables bound in lambdas may change, and warn you when they detect this situation.
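To make the "variables, not values" point concrete, here is a minimal, self-contained sketch. The class and variable names are invented for illustration and do not come from the question; plain strings stand in for the MyObject references.

```csharp
using System;

class ClosureCaptureDemo
{
    // Returns what each delegate observes after the outer variable is reassigned.
    public static string[] Run()
    {
        string value = "first";

        // Captures the variable 'value' itself, so it sees later assignments.
        Func<string> readsVariable = () => value;

        // A separate local that is never reassigned, used only by the lambda.
        string copy = value;
        Func<string> readsCopy = () => copy;

        value = "second"; // the "hypothetical second assignment"

        return new[] { readsVariable(), readsCopy() };
    }

    static void Main()
    {
        string[] seen = Run();
        Console.WriteLine(seen[0]); // prints "second"
        Console.WriteLine(seen[1]); // prints "first"
    }
}
```

The first delegate follows the reassignment because it shares the variable with the enclosing method; the second is unaffected because nothing else ever writes to copy.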
My recommended solution (using Task) would look something like this:
var ui = TaskScheduler.FromCurrentSynchronizationContext();
var localMyObject = this.MyObject; // copy for lambda
Task.Factory.StartNew(() =>
{
// Run asynchronously on a ThreadPool thread.
Thread.Sleep(1000); // TODO: review if you *really* need this
return DoTheCodeThatNeedsToRunAsynchronously();
}).ContinueWith(task =>
{
// Run on the UI thread when the ThreadPool thread returns a result.
if (task.IsFaulted)
{
// Do some error handling with task.Exception
}
else
{
localMyObject.ChangeSomeProperty(task.Result);
}
}, ui);
Note that since the UI thread is the one calling MyObject.ChangeSomeProperty, that method doesn't have to be threadsafe. Of course, DoTheCodeThatNeedsToRunAsynchronously still does need to be threadsafe.
"Thread-safe" is a tricky beast. With both of your approaches, the problem is that the MyObject your thread is using may be read or modified by multiple threads in a way that makes its state appear inconsistent, or makes your thread behave in a way that is inconsistent with the actual state.
For example, say your MyObject.ChangeSomeProperty() MUST be called before MyObject.DoSomethingElse(), or it throws. With either of your approaches, there is nothing to stop any other thread from calling DoSomethingElse() before the thread that will call ChangeSomeProperty() finishes.
Or, if ChangeSomeProperty() happens to be called by two threads, and it (internally) changes state, a thread context switch may happen while the first thread is in the middle of its work, and the end result is that the actual new state after both threads have run is "wrong".
However, by itself, neither of your approaches is inherently thread-unsafe; you just need to make sure that changes to state are serialized and that reads of state always give a consistent result.
Personally, I wouldn't use the second approach. If you're having problems with "zombie" threads, set IsBackground to true on the thread.
Your first attempt is pretty good, but the thread continues to exist even after the application exits because you didn't set the IsBackground property to true. Here is a simplified (and improved) version of your code:
MyObject myObject = this.MyObject;
Thread t = new Thread(()=>
{
Thread.Sleep(1000); // wait a second (for a specific reason)
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
});
t.IsBackground = true;
t.Start();
With regard to thread safety: it's difficult to tell whether your program functions correctly when multiple threads execute simultaneously, because you're not showing us any points of contention in your example. It's very possible that you will experience concurrency issues if your program has contention on MyObject.
Java has the final keyword and C# has a roughly corresponding keyword called readonly (which applies to fields, not to local variables), but neither final nor readonly ensures that the state of the object you're modifying will be consistent between threads. The only thing these keywords do is ensure that you do not change which object the reference points to. If two threads have read/write contention on the same object, then you should perform some type of synchronization or atomic operations on that object in order to ensure thread safety.
Update
OK, if you modify the reference that myObject points to, then your contention is now on the myObject variable itself. I'm sure that my answer will not match your actual situation 100%, but given the example code you've provided I can tell you what will happen:
You are not guaranteed which object gets modified: it can be that.MyObject or this.MyObject. That's true regardless of whether you're working with Java or C#. The scheduler may schedule your thread/timer to be executed before, after or during the second assignment. If you're counting on a specific order of execution, then you have to do something to ensure that order of execution. Usually that something is communication between the threads in the form of a signal: a ManualResetEvent, Join or something else.
Here is a Join example:
MyObject myObject = this.MyObject;
Thread task = new Thread(()=>
{
Thread.Sleep(1000); // wait a second (for a specific reason)
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
});
task.IsBackground = true;
task.Start();
task.Join(); // blocks the main thread until the task thread is finished
myObject = that.MyObject; // the assignment will happen after the task is complete
Here is a ManualResetEvent example:
ManualResetEvent done = new ManualResetEvent(false);
MyObject myObject = this.MyObject;
Thread task = new Thread(()=>
{
Thread.Sleep(1000); // wait a second (for a specific reason)
DoTheCodeThatNeedsToRunAsynchronously();
myObject.ChangeSomeProperty();
done.Set();
});
task.IsBackground = true;
task.Start();
done.WaitOne(); // blocks the main thread until the task thread signals it's done
myObject = that.MyObject; // the assignment will happen after the task is done
Of course, in this case it's pointless to even spawn multiple threads, since you're not going to allow them to run concurrently. One way to avoid this is to not change the reference held by myObject after you've started the thread; then you won't need to call Join, or WaitOne on the ManualResetEvent.
So this leads me to a question: why are you assigning a new object to myObject? Is this part of a for-loop that starts multiple threads to perform multiple asynchronous tasks?
What is the best way to achieve what I want? Is it one of the two or another approach?
Both look fine, but...
Is one of the two ways not thread-safe (I fear both...) and why?
...they are not thread safe unless MyObject.ChangeSomeProperty() is thread safe.
The first approach creates a thread and passes it the object in the constructor. Is that how I'm supposed to pass the object?
Yes. Using a closure (as in your second approach) is fine as well, with the additional advantage that you don't need to do a cast.
The second approach uses a timer which doesn't provide that possibility, so I just use the local variable in the anonymous delegate. Is that safe or is it possible in theory that the reference in the variable changes before it is evaluated by the delegate code? (This is a very generic question whenever one uses anonymous delegates).
Sure, if you add myObject = null; directly after setting timer.Elapsed, then the code in your thread will fail. But why would you want to do that? Note that changing this.MyObject will not affect the variable captured in your thread.
So, how to make this thread-safe? The problem is that myObject.ChangeSomeProperty(); might run in parallel with some other code that modifies the state of myObject. There are basically two solutions to that:
Option 1: Execute myObject.ChangeSomeProperty() in the main UI thread. This is the simplest solution if ChangeSomeProperty is fast. You can use the Dispatcher (WPF) or Control.Invoke (WinForms) to jump back to the UI thread, but the easiest way is to use a BackgroundWorker:
MyObject myObject = this.MyObject;
var bw = new BackgroundWorker();
bw.DoWork += (sender, args) => {
// this will happen in a separate thread
Thread.Sleep(1000);
DoTheCodeThatNeedsToRunAsynchronously();
};
bw.RunWorkerCompleted += (sender, args) => {
// We are back in the UI thread here.
if (args.Error != null) // if an exception occurred during DoWork,
MessageBox.Show(args.Error.ToString()); // do your error handling here
else
myObject.ChangeSomeProperty();
};
bw.RunWorkerAsync(); // start the background worker
Option 2: Make the code in ChangeSomeProperty() thread-safe by using the lock keyword (inside ChangeSomeProperty as well as inside any other method modifying or reading the same backing field).
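A minimal sketch of Option 2 follows. The real MyObject is not shown in the question, so this version is an assumption throughout: ChangeSomeProperty just increments a counter, and the _sync lock object, the ReadSomeProperty method and the LockDemo harness are hypothetical names added for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the question's MyObject.
class MyObject
{
    private readonly object _sync = new object(); // private lock object
    private int _someProperty;                    // backing field guarded by _sync

    public void ChangeSomeProperty()
    {
        lock (_sync) { _someProperty++; }
    }

    // Readers take the same lock so they never observe a half-finished update.
    public int ReadSomeProperty()
    {
        lock (_sync) { return _someProperty; }
    }
}

class LockDemo
{
    // Starts several threads that all modify the same object, then waits for them.
    public static int RunCounter(int threadCount, int iterations)
    {
        var obj = new MyObject();
        var threads = new Thread[threadCount];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < iterations; n++) obj.ChangeSomeProperty();
            });
            threads[i].IsBackground = true;
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();
        return obj.ReadSomeProperty();
    }

    static void Main()
    {
        Console.WriteLine(RunCounter(4, 10000)); // prints 40000
    }
}
```

Locking on a private object rather than on this keeps outside code from taking the same lock by accident. Without the lock, the increments from different threads could interleave and the final count could come out lower.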
The bigger thread-safety concern here, in my mind, may be the 1 second Sleep. If this is required in order to synchronize with some other operation (giving it time to complete), then I strongly recommend using a proper synchronization pattern rather than relying on the Sleep. Monitor.Pulse or AutoResetEvent are two common ways to achieve synchronization. Both should be used carefully, as it's easy to introduce subtle race conditions. However, using Sleep for synchronization is a race condition waiting to happen.
Also, if you want to use a thread (and don't have access to the Task Parallel Library in .NET 4.0), then ThreadPool.QueueUserWorkItem is preferable for short-running tasks. The thread pool threads also won't hang up the application if it dies, as long as there is not some deadlock preventing a non-background thread from dying.
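As a rough illustration of both points, the sketch below queues work on the thread pool (available in .NET 3.5) and has the worker wait for an explicit signal instead of sleeping for a fixed second. What the original Sleep was actually waiting for is not stated in the question, so the "input becomes ready" step, and every name in this example, is an assumption.

```csharp
using System;
using System.Threading;

class SignalInsteadOfSleep
{
    public static string Run()
    {
        var inputReady = new AutoResetEvent(false);   // set when the prerequisite is done
        var workerDone = new ManualResetEvent(false); // set when the worker has finished
        string input = null;
        string seenByWorker = null;

        // Pool threads are background threads, so they cannot keep the process alive.
        ThreadPool.QueueUserWorkItem(delegate
        {
            inputReady.WaitOne();   // block until signalled, however long that takes
            seenByWorker = input;   // safe: written before Set(), read after WaitOne()
            workerDone.Set();
        });

        input = "ready";            // the operation the worker depends on
        inputReady.Set();           // explicit signal replaces Thread.Sleep(1000)

        workerDone.WaitOne();       // only so this demo can return the result
        return seenByWorker;
    }

    static void Main()
    {
        Console.WriteLine("Worker saw: " + Run()); // prints "Worker saw: ready"
    }
}
```

The final WaitOne exists only so the example can return a value; in a UI application you would hand the result back to the UI thread rather than block on it.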
One thing not mentioned so far: the choice of threading method depends heavily on what specifically DoTheCodeThatNeedsToRunAsynchronously() does.
Different .NET threading approaches are suitable for different requirements. One very large concern is whether this method will complete quickly, or take some time (is it short-lived or long-running?).
Some .NET threading mechanisms, like ThreadPool.QueueUserWorkItem(), are meant for short-lived work. They avoid the expense of creating a thread by using "recycled" threads, but the number of threads the pool will recycle is limited, so a long-running task shouldn't hog the ThreadPool's threads.
Options to consider:
- ThreadPool.QueueUserWorkItem() is a convenient means to fire-and-forget small tasks on a ThreadPool thread.
- System.Threading.Tasks.Task is a new feature in .NET 4 which makes small tasks easy to run in async/parallel mode.
- Delegate.BeginInvoke() and Delegate.EndInvoke(). BeginInvoke() will run the code asynchronously, but it's crucial that you ensure EndInvoke() is called as well to avoid potential resource leaks. It's also based on ThreadPool threads, I believe.
- System.Threading.Thread, as shown in your example. Threads provide the most control but are also more expensive than the other methods, so they are ideal for long-running tasks or detail-oriented multithreading.
Overall, my personal preference has been to use Delegate.BeginInvoke()/EndInvoke(); it seems to strike a good balance between control and ease of use.