开发者

how to pass data to running thread

When using pthread, I can pass data at th开发者_JAVA百科read creation time.

What is the proper way of passing new data to an already running thread?

I'm considering making a global variable and make my thread read from that.

Thanks


That will certainly work. Basically, threads are just lightweight processes that share the same memory space. Global variables, being in that memory space, are available to every thread.

The trick is not with the readers so much as the writers. If you have a simple chunk of global memory, like an int, then assigning to that int will probably be safe. Bt consider something a little more complicated, like a struct. Just to be definite, let's say we have

struct S { int a; float b; } s1, s2;

Now s1,s2 are variables of type struct S. We can initialize them

s1 = { 42,  3.14f };

and we can assign them

s2 = s1;

But when we assign them the processor isn't guaranteed to complete the assignment to the whole struct in one step -- we say it's not atomic. So let's now imagine two threads:

thread 1:
   while (true){
      printf("{%d,%f}\n", s2.a, s2.b );
      sleep(1);
   }

thread 2:
   while(true){
      sleep(1);
      s2 = s1;
      s1.a += 1;
      s1.b += 3.14f ;
   }

We can see that we'd expect s2 to have the values {42, 3.14}, {43, 6.28}, {44, 9.42} ....

But what we see printed might be anything like

 {42,3.14}
 {43,3.14}
 {43,6.28}

or

 {43,3.14}
 {44,6.28}

and so on. The problem is that thread 1 may get control and "look at" s2 at any time during that assignment.

The moral is that while global memory is a perfectly workable way to do it, you need to take into account the possibility that your threads will cross over one another. There are several solutions to this, with the basic one being to use semaphores. A semaphore has two operations, confusingly named from Dutch as P and V.

P simply waits until a variable is 0 and the goes on, adding 1 to the variable; V subtracts 1 from the variable. The only thing special is that they do this atomically -- they can't be interrupted.

Now, do you code as

thread 1:
   while (true){
      P();
      printf("{%d,%f}\n", s2.a, s2.b );
      V();
      sleep(1);
   }

thread 2:
   while(true){
      sleep(1);
      P();
      s2 = s1;
      V();
      s1.a += 1;
      s1.b += 3.14f ;
   }

and you're guaranteed that you'll never have thread 2 half-completing an assignment while thread 1 is trying to print.

(Pthreads has semaphores, by the way.)


I have been using the message-passing, producer-consumer queue-based, comms mechanism, as suggested by asveikau, for decades without any problems specifically related to multiThreading. There are some advantages:

1) The 'threadCommsClass' instances passed on the queue can often contain everything required for the thread to do its work - member/s for input data, member/s for output data, methods for the thread to call to do the work, somewhere to put any error/exception messages and a 'returnToSender(this)' event to call so returning everything to the requester by some thread-safe means that the worker thread does not need to know about. The worker thread then runs asynchronously on one set of fully encapsulated data that requires no locking. 'returnToSender(this)' might queue the object onto a another P-C queue, it might PostMessage it to a GUI thread, it might release the object back to a pool or just dispose() it. Whatever it does, the worker thread does not need to know about it.

2) There is no need for the requesting thread to know anything about which thread did the work - all the requestor needs is a queue to push on. In an extreme case, the worker thread on the other end of the queue might serialize the data and communicate it to another machine over a network, only calling returnToSender(this) when a network reply is received - the requestor does not need to know this detail - only that the work has been done.

3) It is usually possible to arrange for the 'threadCommsClass' instances and the queues to outlive both the requester thread and the worker thread. This greatly eases those problems when the requester or worker are terminated and dispose()'d before the other - since they share no data directly, there can be no AV/whatever. This also blows away all those 'I can't stop my work thread because it's stuck on a blocking API' issues - why bother stopping it if it can be just orphaned and left to die with no possibility of writing to something that is freed?

4) A threadpool reduces to a one-line for loop that creates several work threads and passes them the same input queue.

5) Locking is restricted to the queues. The more mutexes, condVars, critical-sections and other synchro locks there are in an app, the more difficult it is to control it all and the greater the chance of of an intermittent deadlock that is a nightmare to debug. With queued messages, (ideally), only the queue class has locks. The queue class must work 100% with mutiple producers/consumers, but that's one class, not an app full of uncooordinated locking, (yech!).

6) A threadCommsClass can be raised anytime, anywhere, in any thread and pushed onto a queue. It's not even necessary for the requester code to do it directly, eg. a call to a logger class method, 'myLogger.logString("Operation completed successfully");' could copy the string into a comms object, queue it up to the thread that performs the log write and return 'immediately'. It is then up to the logger class thread to handle the log data when it dequeues it - it may write it to a log file, it may find after a minute that the log file is unreachable because of a network problem. It may decide that the log file is too big, archive it and start another one. It may write the string to disk and then PostMessage the threadCommsClass instance on to a GUI thread for display in a terminal window, whatever. It doesn't matter to the log requesting thread, which just carries on, as do any other threads that have called for logging, without significant impact on performance.

7) If you do need to kill of a thread waiting on a queue, rather than waiing for the OS to kill it on app close, just queue it a message telling it to teminate.

There are surely disadvantages:

1) Shoving data directly into thread members, signaling it to run and waiting for it to finish is easier to understand and will be faster, assuming that the thread does not have to be created each time.

2) Truly asynchronous operation, where the thread is queued some work and, sometime later, returns it by calling some event handler that has to communicate the results back, is more difficult to handle for developers used to single-threaded code and often requires state-machine type design where context data must be sent in the threadCommsClass so that the correct actions can be taken when the results come back. If there is the occasional case where the requestor just has to wait, it can send an event in the threadCommsClass that gets signaled by the returnToSender method, but this is obviously more complex than simply waiting on some thread handle for completion.

Whatever design is used, forget the simple global variables as other posters have said. There is a case for some global types in thread comms - one I use very often is a thread-safe pool of threadCommsClass instances, (this is just a queue that gets pre-filled with objects). Any thread that wishes to communicate has to get a threadCommsClass instance from the pool, load it up and queue it off. When the comms is done, the last thread to use it releases it back to the pool. This approach prevents runaway new(), and allows me to easily monitor the pool level during testing without any complex memory-managers, (I usually dump the pool level to a status bar every second with a timer). Leaking objects, (level goes down), and double-released objects, (level goes up), are easily detected and so get fixed.

MultiThreading can be safe and deliver scaleable, high-performance apps that are almost a pleasure to maintain/enhance, (almost:), but you have to lay off the simple globals - treat them like Tequila - quick and easy high for now but you just know they'll blow your head off tomorrow.

Good luck!

Martin


Global variables are bad to begin with, and even worse with multi-threaded programming. Instead, the creator of the thread should allocate some sort of context object that's passed to pthread_create, which contains whatever buffers, locks, condition variables, queues, etc. are needed for passing information to and from the thread.


You will need to build this yourself. The most typical approach requires some cooperation from the other thread as it would be a bit of a weird interface to "interrupt" a running thread with some data and code to execute on it... That would also have some of the same trickiness as something like POSIX signals or IRQs, both of which it's easy to shoot yourself in the foot while processing, if you haven't carefully thought it through... (Simple example: You can't call malloc inside a signal handler because you might be interrupted in the middle of malloc, so you might crash while accessing malloc's internal data structures which are only partially updated.)

The typical approach is to have your thread creation routine basically be an event loop. You can build a queue structure and pass that as the argument to the thread creation routine. Then other threads can enqueue things and the thread's event loop will dequeue it and process the data. Note this is cleaner than a global variable (or global queue) because it can scale to have multiple of these queues.

You will need some synchronization on that queue data structure. Entire books could be written about how to implement your queue structure's synchronization, but the most simple thing would have a lock and a semaphore. When modifying the queue, threads take a lock. When waiting for something to be dequeued, consumer threads would wait on a semaphore which is incremented by enqueuers. It's also a good idea to implement some mechanism to shut down the consumer thread.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜