How to implement Copy-on-Write?
I want to implement a co开发者_Go百科py-on-write on my custom C++ String class, and I wonder how to.
I tried to implement some options, but they all turned out very inefficient.
In a multi-threaded environemnt (which is most of them nowadays) CoW is frequently a huge performance hit rather than a gain. And with careful use of const references, it's not much of a performance gain even in a single threaded environment.
This old DDJ article explains just how bad CoW can be in a multithreaded environment, even if there's only one thread.
Additionally, as other people have pointed out, CoW strings are really tricky to implement, and it's easy to make mistakes. That coupled with their poor performance in threading situations makes me really question their usefulness in general. This becomes even more true once you start using C++11 move construction and move assignment.
But, to answer your question....
Here are a couple of implementation techniques that may help with performance.
First, store the length in the string itself. The length is accessed quite frequently and eliminating the pointer dereference would probably help. I would, just for consistency put the allocated length there too. This will cost you in terms of your string objects being a bit bigger, but the overhead there in space and copying time is very small, especially since these values will then become easier for the compiler to play interesting optimization tricks with.
This leaves you with a string class that looks like this:
class MyString {
...
private:
class Buf {
...
private:
::std::size_t refct_;
char *data_;
};
::std::size_t len_;
::std::size_t alloclen_;
Buf *data_;
};
Now, there are further optimizations you can perform. The Buf class there looks like it doesn't really contain or do much, and this is true. Additionally, it requires allocating both an instance of Buf and a buffer to hold the characters. This seems rather wasteful. So, we'll turn to a common C implementation technique, stretchy buffers:
class MyString {
...
private:
struct Buf {
::std::size_t refct_;
char data_[1];
};
void resizeBufTo(::std::size_t newsize);
void dereferenceBuf();
::std::size_t len_;
::std::size_t alloclen_;
Buf *data_;
};
void MyString::resizeBufTo(::std::size_t newsize)
{
assert((data_ == 0) || (data_->refct_ == 1));
if (newsize != 0) {
// Yes, I'm using C's allocation functions on purpose.
// C++'s new is a poor match for stretchy buffers.
Buf *newbuf = ::std::realloc(data_, sizeof(*newbuf) + (newsize - 1));
if (newbuf == 0) {
throw ::std::bad_alloc();
} else {
data_ = newbuf_;
}
} else { // newsize is 0
if (data_ != 0) {
::std::free(data_);
data_ = 0;
}
}
alloclen_ = newsize;
}
When you do things this way, you can then treat data_->data_
as if it contained alloclen_
bytes instead of just 1.
Keep in mind that in all of these cases you will have to make sure that you either never ever use this in a multi-threaded environment, or that you make sure that refct_
is a type that you have both an atomic increment, and an atomic decrement and test instruction for.
There is an even more advanced optimization technique that involves using a union to store short strings right inside the bits of data that you would use to describe a longer string. But that's even more complex, and I don't think I will feel inclined to edit this to put a simplified example here later, but you never can tell.
I would suggest that if one wants to implement copy-on-write efficiently (for strings or whatever), one should define a wrapper type which will behave as a mutable string, and which will hold both a nullable reference to a mutable string (no other reference to that item will ever exist) and a nullable reference to an "immutable" string (references to which will never exist outside things that won't try to mutate it). Wrappers will always be created with at least one of those references non-null; once the mutable-item reference is ever set to a non-null value (during or after construction) it will forever refer to the same target. Any time both references are non-null, the immutable-item reference will point to a copy of the item that was made some time after the most recent completed mutation (during a mutation, the immutable-item reference may or may not hold a reference to a pre-mutation value).
To read an object, check whether the "mutable-item" reference is non-null. If so, use it. Otherwise, check whether the "immutable-item" reference is non-null. If so, use it. Otherwise, use the "mutable item" reference (which by now will be non-null).
To mutate an object, check whether the "mutable-item" reference is non-null. If not, copy the target of the "immutable item" reference and CompareExchange a reference to the new object into the "mutable item" reference. Then mutate the target of the "mutable item" reference and invalidate the "immutable item" reference.
To clone an object, if the clone is expected to be cloned again before it is mutated, retrieve the value of the "immutable-item" reference. If it is null, make a copy of the "mutable item" target and CompareExchange a reference to that new object into the immutable-item reference. Then create a new wrapper whose "mutable-item" reference is null, and whose "immutable-item" reference is either the retrieved value (if it wasn't null) or the new item (if it was).
To clone an object, if the clone is expected to be mutated before it is cloned, retrieve the value of the "immutable-item" reference. If null, retrieve the "mutable-item" reference. Copy the target of whichever reference was retrieved and create a new wrapper whose "mutable-item" reference points to the new copy, and whose "immutable-item" reference is null.
The two cloning methods will be semantically identical, but picking the wrong one for a given situation will result in an extra copy operation. If one consistently chooses the correct copy operation, one will get most of the benefit of an "aggressive" copy-on-write approach, but with far less threading overhead. Every data holding object (e.g. string) will either be unshared mutable or shared immutable, and no object will ever switch between those states. Consequently, one could if desired eliminate all "threading/synchronization overhead" (replacing the CompareExchange operations with straight stores) provided that no wrapper object is used in more than one thread simultaneously. Two wrapper objects might hold references to the same immutable data holder, but they could be oblivious to each others' existence.
Note that a few more copy operations may be required when using this approach than when using an "aggressive" approach. For example, if a new wrapper is created with a new string, and that wrapper is mutated, and copied six times, the original wrapper would hold references to the original string holder and an immutable one holding a copy of the data. The six copied wrappers would just hold a reference to the immutable string (two strings total, although if the original string were never mutated after the copy was made, an aggressive implementation could get by with one). If the original wrapper were mutated, along with five of the six copies, then all but one of the references to the immutable string would get invalidated. At that point, if the sixth wrapper copy were mutated, an aggressive copy-on-write implementation might realize that it held the only reference to its string, and thus decide a copy was unnecessary. The implementation I describe, however, would create a new mutable copy and abandon the immutable one. Despite the fact that there are some extra copy operations, however, the reduction in threading overhead should in most cases more than offset the cost. If the majority of logical copies that are produced are never mutated, this approach may be more efficient than always making copies of strings.
There's not much to CoW. Basically, you copy when you want to change it, and let anyone who doesn't want to change it keep the reference to the old instance. You'll need reference counting to keep track of who's still referencing the object, and since you're creating a new copy, you need to decrease the count on the 'old' instance. A shortcut would be to not make a copy when that count is one (you're the only reference).
Other than that, there's not much that can be said, unless there's a specific problem you're facing.
You may want to emulate the 'immutable' string that other languages have (Python, C# as far as I know).
The idea is that each string is immutable, thus any work on a string create a new immutable one... or that's the basic idea, to avoid explosion, you would need not to create another if there is a similar one.
template <class T> struct cow {
typedef boost::shared_ptr<T> ptr_t;
ptr_t _data;
ptr_t get()
{
return boost::atomic_load(&_data);
}
void set(ptr_t const& data)
{
boost::atomic_store(&_data, data);
}
}
精彩评论