Confusion about Copy-On-Write and shared_ptr
I have searched the web and read through the Boost documentation about shared_ptr. There is a response on SO that says that shared_ptr for Copy-On-Write (COW) sucks and that TR1 has removed it from the string libraries. Most advice on SO says to use shared_ptr rather than regular pointers.
The documentation also talks about using std::unique() to make a COW pointer, but I haven't found any examples.
Is the talk about having a smart pointer that performs COW for you, or about having your object use a new shared_ptr to a cloned object and then modifying the cloned object?
Example: Recipes & Ingredients
#include <string>
#include <vector>
#include <boost/shared_ptr.hpp>

// Details omitted; the type must be complete to be used as a data member.
struct Nutrients { };

struct Ingredient
{
    Ingredient(const std::string& new_title = std::string(""))
        : m_title(new_title)
    { }

    std::string m_title;
    Nutrients ing_nutrients;
};

struct Milk : public Ingredient
{
    Milk() : Ingredient("milk") { }
};

struct Cream : public Ingredient
{
    Cream() : Ingredient("cream") { }
};

struct Recipe
{
    std::vector< boost::shared_ptr<Ingredient> > m_ingredients;

    void append_ingredient(boost::shared_ptr<Ingredient> new_ingredient)
    {
        m_ingredients.push_back(new_ingredient);
    }

    void replace_ingredient(const std::string& original_ingredient_title,
                            boost::shared_ptr<Ingredient> new_ingredient)
    {
        // Confusion here
    }
};

int main(void)
{
    // Create an oatmeal recipe that contains milk.
    Recipe oatmeal;
    boost::shared_ptr<Ingredient> p_milk(new Milk);
    oatmeal.append_ingredient(p_milk);

    // Create a mashed potatoes recipe that contains milk.
    Recipe mashed_potatoes;
    mashed_potatoes.append_ingredient(p_milk);

    // Now replace the Milk in the oatmeal with cream.
    // This must not affect the mashed_potatoes recipe.
    boost::shared_ptr<Ingredient> p_cream(new Cream);
    oatmeal.replace_ingredient(p_milk->m_title, p_cream);
    return 0;
}
The confusion is how to replace the 'Milk' in the oatmeal recipe with Cream and not affect the mashed_potatoes recipe.
My algorithm is:
locate pointer to `Milk` ingredient in the vector.
erase it.
append `Cream` ingredient to vector.
How would a COW pointer come into play here?
Note: I am using MS Visual Studio 2010 on Windows NT, Vista and 7.
There are several questions bundled into one here, so bear with me if I don't address them in the order you would expect.
Most advice on SO says to use shared_ptr rather than regular pointers.
Yes and no. A number of SO users, unfortunately, recommend shared_ptr as if it were a silver bullet that solves all memory management issues. It is not. Most advice talks about not using naked pointers, which is substantially different.
The real advice is to use smart managers: whether smart pointers (unique_ptr, scoped_ptr, shared_ptr, auto_ptr), smart containers (ptr_vector, ptr_map) or custom solutions for hard problems (based on Boost.MultiIndex, using intrusive counters, etc.).
You should pick the smart manager to use depending on the need. Most notably, if you do not need to share the ownership of an object, then you should not use a shared_ptr.
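To make "pick the manager by need" concrete, here is a minimal sketch of a recipe that owns its ingredients exclusively. It assumes a C++11 compiler and uses std::unique_ptr. The names OwningRecipe and fill_oatmeal are hypothetical and not part of the question's code.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Ingredient
{
    explicit Ingredient(const std::string& title) : m_title(title) { }
    std::string m_title;
};

// Hypothetical recipe type that is the sole owner of its ingredients.
struct OwningRecipe
{
    std::vector< std::unique_ptr<Ingredient> > m_ingredients;

    // Takes ownership of the ingredient; the caller's pointer becomes null.
    void append_ingredient(std::unique_ptr<Ingredient> new_ingredient)
    {
        m_ingredients.push_back(std::move(new_ingredient));
    }
};

// Hypothetical helper: adds milk to a recipe, transferring ownership.
inline void fill_oatmeal(OwningRecipe& oatmeal)
{
    std::unique_ptr<Ingredient> milk(new Ingredient("milk"));
    oatmeal.append_ingredient(std::move(milk));
}
```

With this design the same Ingredient object cannot end up in two recipes by accident, because handing it over requires an explicit std::move.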
What is COW?
COW (Copy-On-Write) is about sharing data to "save" memory and make copies cheaper... without altering the semantics of the program.
From a user's point of view, whether std::string uses COW or not does not matter. When a string is modified, all other strings are unaffected.
The idea behind COW is that:
- if you are the sole owner of the data, you may modify it
- if you are not, then you shall copy it, and then use the copy instead
It seems similar to shared_ptr, so why not?
It is similar, but both are meant to solve different problems, and as a result they are subtly different.
The trouble is that since shared_ptr is meant to function seamlessly whether or not the ownership is shared, it is difficult for COW to implement the "if sole owner" test. Notably, the interaction of weak_ptr makes it difficult.
It is possible, obviously. The key is not to leak the shared_ptr, at all, and not to use weak_ptr (they are useless for COW anyway).
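As a rough sketch of that idea, the handle below keeps its shared_ptr private and never creates a weak_ptr, so a use count of one can serve as the "sole owner" test. Several things here are assumptions and not from the question:

- std::shared_ptr stands in for boost::shared_ptr.
- The names CowPtr, read, write and shares_with are hypothetical.
- The code is single-threaded.
- T is a copyable value type. A polymorphic hierarchy such as Ingredient would need a clone function instead of the copy constructor.

```cpp
#include <memory>
#include <string>

// Hypothetical copy-on-write handle. The shared_ptr never escapes the class,
// so use_count() reflects only other CowPtr handles.
template <typename T>
class CowPtr
{
public:
    explicit CowPtr(const T& value) : m_data(std::make_shared<T>(value)) { }

    // Read access never copies the payload.
    const T& read() const { return *m_data; }

    // Write access clones the payload first if another handle shares it.
    T& write()
    {
        if (m_data.use_count() > 1)
            m_data = std::make_shared<T>(*m_data);
        return *m_data;
    }

    // True when both handles currently point at the same payload.
    bool shares_with(const CowPtr& other) const
    {
        return m_data == other.m_data;
    }

private:
    std::shared_ptr<T> m_data;
};

// Example instantiation: a copy-on-write string handle.
typedef CowPtr<std::string> CowString;
```

Copying a CowString is cheap because only the pointer is copied. The first write() through a shared handle pays for the clone, and every write() pays for the use-count test.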
Does it matter?
No, not really. It has been shown that COW is not that great anyway. Most of the time it's a micro-optimization... and a micro-pessimization at once. You may spare some memory (though it only works if you don't copy large objects), but you are complicating the algorithm, which may slow down the execution (you are introducing tests).
My advice would be not to use COW. And not to use those shared_ptr either.
Personally, I would either:
- use boost::ptr_vector<Ingredient> rather than std::vector< boost::shared_ptr<Ingredient> > (you do not need sharing)
- create an IngredientFactory that would create (and manage) the ingredients and return an Ingredient const&; the factory should outlive any Recipe.
EDIT: following Xeo's comment, it seems the last item (IngredientFactory) is quite laconic...
In the case of the IngredientFactory, the Recipe object will contain a std::vector<Ingredient const*>. Note the raw pointer:
- Recipe is not responsible for the memory, but is given access to it
- there is an implicit warranty that the object pointed to will remain valid longer than the Recipe object
It is fine to use raw (naked) pointers, as long as you treat them like you would a reference. You just have to beware of potential nullity, and you're offered the ability to reseat them if you so wish -- and you trust the provider to take care of the lifetime / memory management aspects.
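One possible shape for that factory is sketched below. The answer only names IngredientFactory, so its interface here is an assumption: the get member and the map-based storage are hypothetical, and std::unique_ptr (C++11) is used for the owned objects.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Ingredient
{
    explicit Ingredient(const std::string& title) : m_title(title) { }
    std::string m_title;
};

// Hypothetical factory: the sole owner of every Ingredient it hands out.
class IngredientFactory
{
public:
    // Returns the ingredient with this title, creating it on first request.
    Ingredient const& get(const std::string& title)
    {
        std::unique_ptr<Ingredient>& slot = m_ingredients[title];
        if (!slot)
            slot.reset(new Ingredient(title));
        return *slot;
    }

private:
    std::map< std::string, std::unique_ptr<Ingredient> > m_ingredients;
};

// The recipe only observes ingredients; it never deletes them.
struct Recipe
{
    std::vector<Ingredient const*> m_ingredients;

    void append_ingredient(Ingredient const& ingredient)
    {
        m_ingredients.push_back(&ingredient);
    }
};
```

Two recipes that both ask for "milk" end up pointing at the same factory-owned object. Reseating one recipe's pointer to `&factory.get("cream")` leaves the other recipe untouched, provided the factory outlives both.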
You have nothing to worry about. Each Recipe object has its own vector, so modifying one won't affect the other, even though both of them happen to contain pointers to the same objects. The mashed-potatoes recipe would only be affected if you changed the contents of the object that p_milk points at, but you're not doing that. You're modifying the oatmeal.m_ingredients object, which has absolutely no relation to mashed_potatoes.m_ingredients. They're two completely independent vector instances.
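A minimal sketch of replace_ingredient along those lines follows. It differs from the question in three ways, all of them assumptions made for illustration: std::shared_ptr stands in for boost::shared_ptr, the function returns bool where the question returns void, and it reseats the matching element in place instead of erasing it and appending.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Ingredient
{
    explicit Ingredient(const std::string& title) : m_title(title) { }
    std::string m_title;
};

struct Recipe
{
    std::vector< std::shared_ptr<Ingredient> > m_ingredients;

    void append_ingredient(std::shared_ptr<Ingredient> new_ingredient)
    {
        m_ingredients.push_back(new_ingredient);
    }

    // Swaps the first ingredient with a matching title for new_ingredient.
    // Returns false, and changes nothing, when no ingredient matches.
    bool replace_ingredient(const std::string& original_ingredient_title,
                            std::shared_ptr<Ingredient> new_ingredient)
    {
        for (std::size_t i = 0; i < m_ingredients.size(); ++i)
        {
            if (m_ingredients[i]->m_title == original_ingredient_title)
            {
                // Reseats this recipe's pointer only; the old Ingredient
                // object itself is not modified.
                m_ingredients[i] = new_ingredient;
                return true;
            }
        }
        return false;
    }
};
```

Because only the pointer stored in this recipe's own vector changes, any other recipe that still holds a pointer to the milk object keeps seeing milk.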