Converting single-threaded legacy code with global variables to multithreaded code using thread-local storage
I have a code-base of legacy C/C++ code, which contains lots of functions that access global static variables, and are therefore not thread-safe.开发者_运维问答 I'm looking for advice on how to convert this code to make it thread safe. It occurs to me that one way to do it would be to convert the static variables into thread-local variables, or otherwise store them in thread-local storage. This has the advantage that I don't have to rewrite a lot of code that uses the functions to pass extra context to them, just the thread-unsafe functions themselves. But in researching this, I haven't found a lot of advice on whether this is a good or bad idea. Some specific concerns I have are
- will the accessing of TLS-based data be significantly slower?
- am I just continuing to fall into the trap of using global variables, since "global variables are bad", or does TLS counteract the global-variables-are-bad argument?
Any other thoughts would be appreciated too.
First of all, it's best to determine which global variables are accessed in a mutable fashion and which are not. There could be several cases where the variables are not actually modified once set, just that the initialization of those variables will now pose problems for a non-deterministic program. Those variables, where initialization order will matter, I'd place in their own class which could be propagated to each individual thread which needs them.
After that, I'm sure your problems will only compound themselves. I've been where you are and my sympathies.
The problem with using TLS is that you force thread affinity onto your clients, and they have to be aware of this and write their code to handle it accordingly. So if they have a worker thread that does the heavy lifting then they will have to marshal ALL calls to your library onto that worker thread, even for simple getters and setters. Or if some other component calls into their component on a thread pool thread (for instance) then they will have to marshal that call onto their worker thread. It's not the worst thing in the world, but it can be very inconvenient. I definitely prefer the handle/context pattern (I maintain a library that uses TLS and it has been a chore at times dealing with thread affinity).
Well, it'd be pretty easy to start there. You could just have a conditional compile for thread-local. Some compilers e.g. Visual Studio have their own thread_local storage that you can just drop in.
In terms of correctness I see no reason that this wouldn't work, as long as the threading doesn't have other implications that could cause conflicts. You'd need to make sure you caught all the globals and converted them.
The best way to determine if it's significantly slower would be to profile a converted section of the code and see.
That being said I don't know that this is the best solution. Global variables are just that, regardless of what they're called, and they make tracking program state and debugging a total nightmare. I would seriously suggest taking a long look at the code and refactoring one piece at a time into thread-safe code that doesn't rely on globals. You'll have a much easier time debugging and future maintainers (or yourself) will thank you several years from now.
Without understanding the semantics of every variable that you plan to convert, simply automating the conversion from global to TLS is likely to be a huge unproductive timesink. Some of them may need to remain global and be handled in a thread-safe way - others may be quite safe to use as TLS. Still more may be thread-safe as TLS, but result in a bad unscalable design as you increase the number of threads (eg. one socket per client/thread in a previously single-client single-threaded sockets server).
What is your platform/compiler of choice here btw?
Rather than storing all of the variables in thread-local storage, I would think it better to put them into a class; if the original code was mostly C, one may be able to wrap the C code in a class which can then have the "global" variables as fields. Different instances of the class running in different threads would naturally see different versions of those variables.
精彩评论