Which is the fastest STL container for find?

2023-03-27 00:30 问答作者：

Alright as a preface I have a need to cache a relatively small subset of rarely modified data to avoid querying the database as frequently for performance reasons. This data is heavily used in a read-only sense as it is referenced often by a much larger set of data in other tables.

I've written a class which will have the ability to store basically the entirety of the two tables in question in memory while listening for commit changes in conjunction with a thread safe callback mechanism for updating the cached objects.

My current implementation has two std::vectors one for the elements of each table. The class provides both access to the entirety of each vector as well as convenience methods for searching for a specific element of table data via std::find, std::find_if, etc.

Does anyone know if using std::list, std::set, or std::map over std::vector for searching would be preferable? Most of the time that is what will be requested of these containers after populating once from the database when a new connection is made.

I'm also open to using C++0x features supported by VS开发者_如何学Python2010 or Boost.

For searching a particular value, with std::set and std::map it takes O(log N) time, while with the other two it takes O(N) time; So, std::set or std::map are probably better. Since you have access to C++0x, you could also use std::unordered_set or std::unordered_map which take constant time on average.

For find_if, there's little difference between them, because it takes an arbitrary predicate and containers cannot optimize arbitrarily, of course.

However if you will be calling find_if frequently with a certain predicate, you can optimize yourself: use a std::map or std::set with a custom comparator or special keys and use find instead.

A sorted vector using std::lower_bound can be just as fast as std::set if you're not updating very often; they're both O(log n). It's worth trying both to see which is better for your own situation.

Since from your (extended) requirements you need to search on multiple fields, I would point you to Boost.MultiIndex.

This Boost library lets you build one container (with only one exemplary of each element it contains) and index it over an arbitrary number of indices. It also lets you precise which indices to use.

To determine the kind of index to use, you'll need extensive benchmarks. 500 is a relatively low number of entries, so constant factors won't play nicely. Furthermore, there can be a noticeable difference between single-thread and multi-thread usage (most hash-table implementations can collapse on MT usage because they do not use linear-rehashing, and thus a single thread ends up rehashing the table, blocking all others).

I would recommend a sorted index (skip-list like, if possible) to accomodate range requests (all names beginning by Abc ?) if the performance difference is either unnoticeable or simply does not matter.

If you only want to search for distinct values, one specific column in the table, then std::hash is fastest.

If you want to be able to search using several different predicates, you will need some kind of index structure. It can be implemented by extending your current vector based approach with several hash tables or maps, one for each field to search for, where the value is either an index into the vector, or a direct pointer to the element in the vector.

Going further, if you want to be able to search for ranges, such as all occasions having a date in July you need an ordered data structure, where you can extract a range.

Not an answer per se, but be sure to use a typedef to refer to the container type you do use, something like typedef std::vector< itemtype > data_table_cache; Then use your typedef type everywhere.

继续阅读：find optimization stl

Which is the fastest STL container for find?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？