
std::set::insert, how bad can I hint?

I'm doing lots and lots of inserts of std::pair<int, int> into a std::set, and it's taking longer than I'd like. When I wrote the code I figured I'd look at using the hint iterator form of insert later if it turned out to be a bottleneck; well, now it's profiled and it is a bottleneck. So I want to use the iterator hint.

However, I'm not always going to know a good position to insert my pairs. I typically insert them in batches (a batch in this case is on the order of 0.01% of the total input size, duplicates included) of increasing set-order, but when a batch is inserted, I do not know where the next one should start. How is the hint used? Does insert do something like a binary search from the suggested position? How bad would it be to use a bad hint, typically?


I suggest just reading what the compiler reads: the header file for #include <set>. On my system (GNU libstdc++ 4.5.1) I can read the following self-explanatory text:

  /**
   *  @brief Attempts to insert an element into the %set.
   *  @param  position  An iterator that serves as a hint as to where the
   *                    element should be inserted.
   *  @param  x  Element to be inserted.
   *  @return  An iterator that points to the element with key of @a x (may
   *           or may not be the element passed in).
   *
   *  This function is not concerned about whether the insertion took place,
   *  and thus does not return a boolean like the single-argument insert()
   *  does.  Note that the first parameter is only a hint and can
   *  potentially improve the performance of the insertion process.  A bad
   *  hint would cause no gains in efficiency.
   *
   *  For more on @a hinting, see:
   *  http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt07ch17.html
   *  
   *  Insertion requires logarithmic time (if the hint is not taken).
   */
  iterator
  insert(iterator __position, const value_type& __x)
  { return _M_t._M_insert_unique_(__position, __x); }

Takeaway:

  1. A bad hint yields no gain in efficiency.
  2. Insertion is O(log n) when the hint is not taken.
  3. You can read even more about insertion hints in the GNU libstdc++ manual.


If you check the file bits/stl_tree.h (in GNU libstdc++), you'll find that the _M_insert_unique_ member function (the overload taking a hint argument) looks at the hint and at the node immediately to its left or right, and if the new value does not fit there it defaults to calling the ordinary insert routine.

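It calls key_compare at least once (if the set is not empty) and at most three times before falling back. Going from one node to the next or previous is cheap: the tree is a red-black tree whose nodes carry parent pointers, so stepping an iterator follows a few pointers (amortized constant time) rather than searching from the root.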

So, how bad a bad hint is depends on the comparison routine, and on whether your std::set's allocator packs nodes close in memory.
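To make a hint check of this kind concrete, here is a minimal sketch written against the public std::set interface. It is not the libstdc++ source: hint_is_useful and the Pair/PairSet typedefs are hypothetical names, and the exact order of comparisons in a given library version may differ. It only shows that deciding whether a hint is usable takes a handful of comparisons and one iterator step.

```cpp
#include <set>
#include <utility>

typedef std::pair<int, int> Pair;
typedef std::set<Pair> PairSet;

// Hypothetical helper: returns true when `x` belongs directly next to `hint`
// (or is equivalent to *hint), so that no descent from the root is needed.
bool hint_is_useful(const PairSet& s, PairSet::const_iterator hint, const Pair& x)
{
    if (hint == s.end()) {
        // Appending: useful only if x is greater than the current maximum.
        return s.empty() || *s.rbegin() < x;
    }
    if (x < *hint) {
        // x sorts before the hint: check the node on the left.
        if (hint == s.begin()) return true;
        PairSet::const_iterator before = hint;
        --before;
        return *before < x;
    }
    if (*hint < x) {
        // x sorts after the hint: check the node on the right.
        PairSet::const_iterator after = hint;
        ++after;
        return after == s.end() || x < *after;
    }
    return true; // x is equivalent to *hint; there is nothing to insert.
}
```

When this returns false, a real implementation would simply run the ordinary O(log n) insert, so the cost of a bad hint is the comparisons above plus that insert.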


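A hint is good if it is the right hint: the position adjacent to where the new element belongs. The standard promises amortized constant time when the element is inserted immediately before the hint (C++11 and later; C++03 said immediately after). This works if you insert objects sequentially, for example.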

If the hint is not correct, it has no effect and you get a non-hinted O(log n) insert, plus the few comparisons spent checking the hint.
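For the batch scenario in the question, one possible approach is to pay for a single ordinary insert at the start of each batch and then hint every following element with the position just past the previous one. This is a sketch under assumptions: insert_sorted_batch and the typedefs are hypothetical names, the batch is assumed to be sorted in set order already, and how much it helps depends on how many existing elements fall between consecutive batch items.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

typedef std::pair<int, int> Pair;
typedef std::set<Pair> PairSet;

// Hypothetical helper: inserts a batch that is already sorted in set order
// (duplicates allowed) into `s`, hinting each insert after the first.
void insert_sorted_batch(PairSet& s, const std::vector<Pair>& batch)
{
    if (batch.empty()) return;

    // The start of the batch is unknown, so use an ordinary O(log n) insert.
    PairSet::iterator it = s.insert(batch.front()).first;

    for (std::size_t i = 1; i < batch.size(); ++i) {
        assert(!(batch[i] < batch[i - 1])); // batch must be non-decreasing

        // Hint with the position just past the previous element; the new
        // element goes right before it when nothing else lies in between.
        PairSet::iterator hint = it;
        ++hint;
        it = s.insert(hint, batch[i]); // returns the new or existing element
    }
}
```

If an element of the set happens to lie between two consecutive batch items, that one insert falls back to the ordinary path and the loop carries on from the returned iterator.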


If you're building the set all at once before you use it, you can use a vector instead and sort it before you use it. You can use the binary_search, lower_bound, upper_bound, and equal_range algorithms on a sorted vector for fast lookups. You can also use merge or inplace_merge to combine sorted vectors, and set_difference, set_intersection, and set_union to do other common set operations.
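A minimal sketch of that sorted-vector approach, assuming all pairs can be collected before any lookup is needed. make_set_like and contains are hypothetical helper names; the algorithms they call (std::sort, std::unique, std::binary_search) are the standard ones from <algorithm>.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

typedef std::pair<int, int> Pair;

// Hypothetical helper: sort and drop duplicates so the vector holds the
// same elements, in the same order, that a std::set would.
void make_set_like(std::vector<Pair>& v)
{
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
}

// Hypothetical helper: O(log n) membership test on the sorted vector.
bool contains(const std::vector<Pair>& v, const Pair& x)
{
    return std::binary_search(v.begin(), v.end(), x);
}
```

Usage would be to push_back every pair as it arrives, call make_set_like once at the end, and use contains (or lower_bound and friends) afterwards. The contiguous storage avoids one allocation per element, which is often where a node-based set spends much of its insert time.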
