Searching a C++ Vector<custom_class> for the first/last occurence of a value
I'm trying to work out the b开发者_如何学Goest method to search a vector of type "Tracklet" (a class I have built myself) to find the first and last occurrence of a given value for one of its variables. For example, I have the following classes (simplified for this example):
class Tracklet {
TimePoint *start;
TimePoint *end;
int angle;
public:
Tracklet(CvPoint*, CvPoint*, int, int);
}
class TimePoint {
int x, y, t;
public:
TimePoint(int, int, int);
TimePoint(CvPoint*, int);
// Relevant getters and setters exist here
};
I have a vector "vector<Tracklet> tracklets
" and I need to search for any tracklets with a given value of "t" for the end timepoint. The vector is ordered in terms of end time (i.e. tracklet.end->t
).
I'm happy to code up a search algorithm, but am unsure of which route to take with it. I'm not sure binary search would be suitable, as I seem to remember it won't necessarily find the first. I was thinking of a method where I use binary search to find an index of an element with the correct time, then iterate back to find the first and forward to find the last. I'm sure there's a better way than that, since it wastes binary searches O(log n) by iterating.
Hopefully that makes sense: I struggled to explain it a bit! Cheers!
If the vector is sorted and contains the value, std::lower_bound
will give you an iterator to the first element with a given value and std::upper_bound
will give you an iterator to one element past the last one containing the value. Compare the value with the returned element to see if it existed in the vector. Both these functions use binary search, so time is O(logN).
To compare on tracklet.end->t
, use:
bool compareTracklets(const Tracklet &tr1, const Tracklet &tr2) {
return (tr1.end->t < tr2.end->t);
}
and pass compareTracklets as the fourth argument to lower_bound
or upper_bound
I'd just use find
and find_end
, and then do something more complicated only if testing showed it to be too slow.
If you're really concerned about lookup performance, you might consider a different data structure, like a map
with timestamp as the key and a vector
or list
of elements as the value.
A binary search seems like your best option here, as long as your vector remains sorted. It's essentially identical, performance-wise, to performing a lookup in a binary tree-structure.
dirkgently referred to a sweet optimization comparative. But I would in fact not use a std::vector
for this.
Usually, when deciding to use a STL container, I don't really consider the performance aspect, but I do consider its interface regarding the type of operation I wish to use.
std::set<T>::find
std::set<T>::lower_bound
std::set<T>::upper_bound
std::set<T>::equal_range
Really, if you want an ordered sequence, outside of a key/value setup, std::set
is just easier to use than any other.
- You don't have to worry about inserting at a 'bad' position
- You don't have problems of iterators invalidation when adding / removing an element
- You have built-in methods for searching
Of course, you also want your Comparison Predicate to really shine (hopes the compiler inlines the operator() implementation), in every case.
But really, if you are not convinced, try a build with a std::vector
and manual insertion / searching (using the <algorithm>
header) and try another build using std::set
.
Compare the size of the implementations (number of lines of code), compare the number of bugs, compare the speed, and then decide.
Most often, the 'optimization' you aim for is actually a pessimization, and in those rares times it's not, it's just so complicated that it's not worth it.
Optimization:
- Don't
- Expert only: Don't, we mean it
The vector is ordered in terms of time
The start time or the end time?
What is wrong with a naive O(n) search? Remember you are only searching and not sorting. You could use a sorted container as well (if that doesn't go against the basic design).
精彩评论