How to choose tasks from a list based on some associated metadata?
I have n tasks in a waiting list. Each task has associated with it an entry that contains some meta information:
Task1 A,B
Task2 A
Task3 B,C
Task4 A,B,C
And an associated hashmap that contains entries like:
A 1
B 2
C 2
This implies that if a task whose meta information contains A is already running, no other task containing A can run at the same time. However, since B has a limit of 2 tasks, either task1 and task3 can run together, or task3 and task4. But task1, task3 and task4 cannot all run together, since the limits of both A and B would be violated, even though the limit of C is not.
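To make the rule concrete, here is a minimal sketch of the check in Python. The names `LIMITS`, `TASKS`, `can_run_together` and `violated_tags` are made up for illustration; the data is just the example above.

```python
from collections import Counter

# Limits and task metadata copied from the example above.
LIMITS = {"A": 1, "B": 2, "C": 2}
TASKS = {
    "Task1": {"A", "B"},
    "Task2": {"A"},
    "Task3": {"B", "C"},
    "Task4": {"A", "B", "C"},
}


def tag_usage(names, tasks=TASKS):
    """Count how many of the named tasks carry each tag."""
    usage = Counter()
    for name in names:
        usage.update(tasks[name])  # adds 1 for every tag of this task
    return usage


def violated_tags(names, tasks=TASKS, limits=LIMITS):
    """Return the sorted tags whose limit would be exceeded."""
    usage = tag_usage(names, tasks)
    return sorted(tag for tag in usage if usage[tag] > limits[tag])


def can_run_together(names, tasks=TASKS, limits=LIMITS):
    """True if running all named tasks at once respects every limit."""
    return not violated_tags(names, tasks, limits)
```

With this, `can_run_together(["Task1", "Task3"])` and `can_run_together(["Task3", "Task4"])` are both true, while `violated_tags(["Task1", "Task3", "Task4"])` reports A and B.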
If I need to select tasks to run in different threads, what logic/algorithm would you suggest? And, when should this logic be invoked? I view the task list as a shared resource which might need to be locked when tasks are selected to run from it. Right now, I think this logic might have to be invoked when a task is added to the list and also, when a running task has completed. But this could block the addition of new elements to the list, unless I make a copy of the list before running the logic.
How would your logic change if I were to give higher priority to tasks that contain more entries, e.g. 'A,B,C' over 'A,B'?
This is kind of a continuation of Choosing a data structure for a variant of producer consumer problem and How to access the underlying queue of a ThreadpoolExecutor in a thread safe way, just in case anyone is wondering about the background of the problem.
Yes, this is nasty. I immediately thought of an array/list of semaphores, initialized from the hashmap, from which any thread attempting to execute a task would have to acquire units as defined by its metadata. About a second later, I realized that such a design would deadlock pretty quickly!
I think that one dedicated producer thread is going to have to iterate a 'readyJobs' list in an attempt to find a task that can execute with the resources currently available. It could do this both when new tasks become available and after a task completes, thereby releasing resources. The producer thread could wait on a single input queue (a thread-safe producer-consumer queue) that receives both new tasks from [wherever] and completed tasks queued back by the work threads (perhaps a callback fired by the work threads pushes the completed task to the input queue). Adding new tasks might be blocked briefly, but only while the input queue is locked by some other task being added.
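Roughly, the single-dispatcher idea could look like the sketch below. It is written in Python for brevity, and everything in it (the `Dispatcher` class, the `start_task` callback, the event tuples) is a made-up illustration rather than a finished design. It assumes every tag on a task has an entry in the limits map. Because only the dispatcher thread touches `ready_jobs` and `in_use`, neither needs a lock; the queue is the only shared structure.

```python
import queue
from collections import Counter


class Dispatcher:
    def __init__(self, limits, start_task):
        self.limits = dict(limits)    # tag -> max concurrent tasks
        self.in_use = Counter()       # tag -> tasks currently running with it
        self.ready_jobs = []          # touched only by the dispatcher thread
        self.inbox = queue.Queue()    # thread-safe input queue
        self.start_task = start_task  # hypothetical callback: hand task to a worker

    # These three methods may be called from any thread.
    def submit(self, name, tags):
        self.inbox.put(("new", name, frozenset(tags)))

    def task_done(self, name, tags):
        self.inbox.put(("done", name, frozenset(tags)))

    def stop(self):
        self.inbox.put(("stop", None, None))

    def _fits(self, tags):
        return all(self.in_use[t] < self.limits[t] for t in tags)

    def _schedule(self):
        """Start every waiting job that fits the free resources."""
        still_waiting = []
        for name, tags in self.ready_jobs:
            if self._fits(tags):
                self.in_use.update(tags)   # reserve one unit per tag
                self.start_task(name, tags)
            else:
                still_waiting.append((name, tags))
        self.ready_jobs = still_waiting

    def run(self):
        """Dispatcher loop: block on the inbox, then rescan the ready list."""
        while True:
            kind, name, tags = self.inbox.get()
            if kind == "stop":
                return
            if kind == "new":
                self.ready_jobs.append((name, tags))
            else:  # "done": release one unit per tag
                self.in_use.subtract(tags)
            self._schedule()
```

In a real program `run` would be the target of the dedicated thread, and `start_task` would pass the job to a worker pool whose completion callback calls `task_done`. Producers only ever block on the brief `inbox.put`, never on the scan of the ready list.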
In the case of assigning 'priorities', you could insert-sort the 'readyJobs' list as you wish, so that higher-priority tasks are checked first to see if they can run with the resources available. If they cannot, then the rest of the list is iterated and a lower-priority job might be able to run.
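As a rough sketch of that, assuming 'priority' simply means the number of metadata entries as in the question, the insert and the scan might look like this in Python. The function names `insert_by_priority` and `pick_runnable` are invented for the example, and `in_use` is a plain dict of tag counts.

```python
def insert_by_priority(ready_jobs, name, tags):
    """Insert (name, tags) so jobs with more tags come first.

    Jobs with the same number of tags keep their arrival order.
    """
    priority = len(tags)
    index = 0
    while index < len(ready_jobs) and len(ready_jobs[index][1]) >= priority:
        index += 1
    ready_jobs.insert(index, (name, tags))


def pick_runnable(ready_jobs, in_use, limits):
    """Walk the list in priority order and start everything that fits.

    Mutates ready_jobs (started jobs are removed) and in_use (units are
    reserved). Returns the names of the started jobs in start order.
    """
    started = []
    waiting = []
    for name, tags in ready_jobs:
        if all(in_use.get(t, 0) < limits[t] for t in tags):
            for t in tags:
                in_use[t] = in_use.get(t, 0) + 1
            started.append(name)
        else:
            waiting.append((name, tags))
    ready_jobs[:] = waiting
    return started
```

With the four example tasks inserted in order and nothing running, the list becomes Task4, Task1, Task3, Task2. The scan starts Task4, skips Task1 because A is taken, and then starts the lower-priority Task3.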
I hope that you do not want to 'preempt' lower-priority tasks so as to release resources early - that would get really, really messy :(
Rgds, Martin