Clarification of Read and Write on a C# Dictionary
In the context of this statement,
A Dictionary can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
What do read and write mean here? My understanding is that a read is an operation which looks up a key and provides a reference to its value, and a write is an operation which adds or removes a key/value pair from the dictionary. However, I can't find anything conclusive regarding this.
So the big question is: while implementing a thread-safe dictionary, would an operation that updates the value for an existing key in the dictionary be considered a read or a write? I plan to have multiple threads accessing unique keys in a dictionary and modifying their values, but the threads will not add or remove keys.
The obvious implication, assuming modifying an existing value is not a write operation on the dictionary, is that my implementation of the thread-safe dictionary can be a lot more efficient, as I would not need to take an exclusive lock every time I update the value of an existing key.
Using ConcurrentDictionary from .NET 4.0 is not an option.
A major point not yet mentioned is that if TValue is a class type, the things held by a Dictionary<TKey,TValue> will be the identities of TValue objects. If one receives a reference from the dictionary, the dictionary will neither know nor care about anything one may do with the object referred to thereby.
One useful little utility class in cases where all of the keys associated with a dictionary will be known in advance of code that needs to use it is:
class MutableValueHolder<T>
{
    public T Value;
}
If one wants to have multi-threaded code count how many times various strings appear in a bunch of files, and one knows in advance all the strings of interest, one may then use something like a Dictionary<string, MutableValueHolder<int>> for that purpose. Once the dictionary is loaded with all the proper strings and a MutableValueHolder<int> instance for each one, then any number of threads may retrieve references to MutableValueHolder<int> objects, and use Threading.Interlocked.Increment or other such methods to modify the Value associated with each one, without having to write to the Dictionary at all.
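A minimal sketch of that counting scenario might look like the following. The WordCountSketch class, its Count method and the parameter names are hypothetical, chosen only for illustration; the sketch assumes every key is added before any worker thread starts, so the workers only ever read the dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class MutableValueHolder<T>
{
    public T Value;
}

// Hypothetical helper, not part of any library.
static class WordCountSketch
{
    public static Dictionary<string, MutableValueHolder<int>> Count(
        string[] knownWords, string[][] inputsPerThread)
    {
        // All keys are added up front; the dictionary is not modified afterwards.
        var counts = new Dictionary<string, MutableValueHolder<int>>();
        foreach (string word in knownWords)
            counts[word] = new MutableValueHolder<int>();

        var workers = new List<Thread>();
        foreach (string[] input in inputsPerThread)
        {
            string[] words = input; // local copy captured by the lambda
            var worker = new Thread(() =>
            {
                foreach (string w in words)
                {
                    MutableValueHolder<int> holder;
                    // Lookup only reads the dictionary.
                    if (counts.TryGetValue(w, out holder))
                        Interlocked.Increment(ref holder.Value); // atomic update of the holder's field
                }
            });
            workers.Add(worker);
            worker.Start();
        }

        foreach (Thread th in workers)
            th.Join();
        return counts;
    }
}
```

Strings that are not among the known keys are simply skipped here; adding them on the fly would be a write to the dictionary and would need a lock.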
Overwriting an existing value should be treated as a write operation.
Anything that can affect the results of another read should be considered a write.
Changing a key is most definitely a write, since it would cause the item to move within the internal hash table that the dictionary uses for its lookups (which are close to O(1) on average, not O(log(n))).
What you might want to do is look at ReaderWriterLock
http://msdn.microsoft.com/en-us/library/system.threading.readerwriterlock.aspx
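As a rough sketch of how ReaderWriterLock could guard a dictionary: the wrapper class and its method names below are hypothetical, and treating an overwrite as a writer operation is a conservative assumption in line with the answers here, not something the documentation spells out.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper; the class and method names are illustrative only.
class ReaderWriterGuardedDictionary<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> inner = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLock rwLock = new ReaderWriterLock();

    public bool TryGetValue(TKey key, out TValue value)
    {
        rwLock.AcquireReaderLock(Timeout.Infinite); // several readers may hold this at once
        try
        {
            return inner.TryGetValue(key, out value);
        }
        finally
        {
            rwLock.ReleaseReaderLock();
        }
    }

    public void Set(TKey key, TValue value)
    {
        rwLock.AcquireWriterLock(Timeout.Infinite); // exclusive: used for adds and overwrites alike
        try
        {
            inner[key] = value;
        }
        finally
        {
            rwLock.ReleaseWriterLock();
        }
    }
}
```

Whether this beats a plain lock depends on how heavily reads outnumber writes, so it is worth measuring before committing to it.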
Updating a value is conceptually a write operation. When updating a value with concurrent access where a read is performed before a write is completed, you read out an old value. When two writes conflict the wrong value may be stored.
Adding a new value could trigger growth of the underlying storage. In this case new memory is allocated, all elements are copied into the new memory, the new element is added, the dictionary object is updated to refer to the new storage, and the old memory is released and becomes available for garbage collection. During this time, more writes could cause a big problem. Two writes at the same time could trigger two instances of this copying. If you follow through the logic, you'll see that an element can get lost: whichever thread updates the reference last wins, and its copy contains the existing items but not the item the other thread was trying to add.
ICollection provides a member (SyncRoot) to synchronize access on, and that reference remains valid across grow/shrink operations.
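The member in question is presumably ICollection.SyncRoot. A small sketch of locking on it follows; the SyncRootSketch class and SetUnderLock method are hypothetical names, and the approach only helps if every piece of code touching the dictionary locks on the same object.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
static class SyncRootSketch
{
    public static void SetUnderLock<TKey, TValue>(
        Dictionary<TKey, TValue> dict, TKey key, TValue value)
    {
        // Dictionary implements ICollection.SyncRoot explicitly, hence the cast.
        object sync = ((ICollection)dict).SyncRoot;
        lock (sync)
        {
            dict[key] = value; // add or overwrite while holding the lock
        }
    }
}
```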
A read operation is anything that gets a key or value from a Dictionary; a write operation is anything that updates or adds a key or a value. So a process updating a key would be considered to be a writer.
A simple way to make a thread-safe dictionary is to create your own implementation of IDictionary that simply locks a mutex and then forwards the call to an implementation:
public class MyThreadSafeDictionary<T, J> : IDictionary<T, J>
{
    // Single lock object shared by every forwarded call.
    private readonly object mutex = new object();
    // The wrapped dictionary that does the real work.
    private readonly IDictionary<T, J> impl;

    public MyThreadSafeDictionary(IDictionary<T, J> impl)
    {
        this.impl = impl;
    }

    public void Add(T key, J value)
    {
        lock (mutex)
        {
            impl.Add(key, value);
        }
    }

    // implement the other methods as for Add
}
You could replace the mutex with a reader-writer lock if you are having some threads only read the dictionary.
Also note that Dictionary objects don't support changing keys; the only safe way to achieve what you want is to remove the existing key/value pair and add a new one with the updated key.
Modifying a value is a write and introduces a race condition.
Let's say the original value of mydict[5] = 42. One thread updates mydict[5] to be 112. Another thread updates mydict[5] to be 837.
What should the value of mydict[5] be at the end? The order of the threads is important in this case, meaning either you need to make sure the order is explicit or that they don't write.
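When the update depends on the current value (a read followed by a write), one way to rule out lost updates is to make the whole read-modify-write step atomic with a lock. The sketch below is only an illustration: the LockedUpdateSketch class, its Run method and the gate object are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical demonstration class.
static class LockedUpdateSketch
{
    public static int Run(int threadCount, int incrementsPerThread)
    {
        var mydict = new Dictionary<int, int> { { 5, 0 } };
        object gate = new object();
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < incrementsPerThread; n++)
                {
                    // The read and the write happen under the same lock,
                    // so no other thread can slip in between them.
                    lock (gate)
                    {
                        mydict[5] = mydict[5] + 1;
                    }
                }
            });
            threads[i].Start();
        }

        foreach (Thread t in threads)
            t.Join();
        return mydict[5];
    }
}
```

A lock like this prevents updates from being lost, but it does not decide which of two unrelated overwrites (112 versus 837) ends up last; that ordering still has to come from the program's own logic.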