Set find member vs. using find on list
Since the items in a Standard Library set container are sorted, will using the find member on the set, in general, perform faster than using the find algorithm on the same items in a sorted lis开发者_StackOverflowt?
Since the list is linear and the set is often implemented using a sorted tree, it seems as though the set-find should be faster.
With a linked list, even a sorted one, finding an element is O(n)
. A set can be searched in O(log n)
. Therefore yes, finding an element in a set is asymptotically faster.
A sorted array/vector can be searched in O(log n)
by using binary search. Unfortunately, since a linked list doesn't support random access, the same method can't be used to search a sorted linked list in O(log n)
.
It's actually in the standard: std::set::find()
has complexity O(log n)
, where n
is the number of elements in the set. std::find()
on the other hand is linear in the length of the search range.
If your generic search range happens to be sorted and has random access (e.g. a sorted vector), then you can use std::lower_bound()
to find an element (or rather a position) efficiently.
Note that std::set
comes with its own member-lower_bound()
, which works the same way. Having an insertion position may be useful even in a set
, because insert()
with a correct hint has complexity O(1)
.
You can generally expect a find
operation to be faster on a Set
than on a List
, since lists are linear access (O(n)), while sets may have near-constant access for HashSets (O(1)), or logarithmic access for TreeSets (O(log n)).
set::find
has a complexity of O(log(n)), while std::find
has a complexity of O(n). This means that std::set::find()
is asymptotically faster than std::find(std::list)
, but that doesn't mean it is faster for any particular data set, or for any particular search.
I found this article helpful on the topic. http://lafstern.org/matt/col1.pdf
You could reconsider your requirements for just a "list" vs. a "set". According to that article, if your program consists primarily of a bunch of insertions at the start, and then after that, only comparisons to what you have stored, then you are better off with adding everything to a vector, using std::sort (vector.begin(), vector.end()) once, and then using lower_bound. In my particular application, I load from a text file a list of names when the program starts up, and then during program execution I determine if a user is in that list. If they are, I do something, otherwise, I do nothing. In other words, I had a single discrete insertion phase, then I sorted, then after that I used std::binary_search (vector.begin(), vector.end(), std::string username) to determine whether the user is in the list.
精彩评论