Find the number in an array which occurred most number of times
Given an integer array, i need to find which number occurred most number of times. I have written algorithm as below.
Use a map to store number and number of times it occurred.
map<int, int>
Key: represents the number
value: represents number of times key occurred.- Scan input array and update the map with number and number of occurances.
- Iterate through map from the begining to end. Find the key for which maximum value is present. This key becomes the number which occurred most number of times.
I implemented the algorithm as below.
#include <iostream>
#include <map>
using namespace std;
int main()
{
int a[10] = {1,2,3,2,1,3,2,4,1,1}; //Input array: hardcoded for testing
map<int, int> m;
for(int i=0;i<10;i++)
{
m[a[i]]++; //Increment the value of key for counting occurances
}
int mostNumTimes = 0;
int number = -999; //-999 represents invalid number
map<int,int>::iterator it = m.begin();
for( ;it != m.end(); it++) //Find the number which occurred
{ //most number of times
if(it->second > mostNumTimes)
{
mostNumTimes = it->second;
number = it->first;
}
}
if(number != -999) //Print number and number of times it occurred
{
cout<<"Number: "<<number<<endl;
cout<<"Number of times occured: "<<mostNumTimes<<endl;
}
else
{
cout<<"Input array is empty"<<endl;
}
return 0;
}
Output:
Number: 1
Number of times occured: 4
Question: Are there any optimal ways of d开发者_开发百科oing it ? In the end, i am iterating through entire map as I couldn't find any member function for finding key whose value is highest in map. can i avoid iterating all keys ?
Note: I am not considering if multiple numbers occuring same number of times. I am finding first number with most occurances.
You could maintain a current max (count and int value) as you iterate the values. On each increment in the map, you could update the values so you don't have to iterate at the end.
map<int, int> m;
int currentMax = -999;
int maxCount = 0;
for(int i=0;i<10;i++)
{
int updated = m[a[i]]++; //Increment the value of key for counting occurances
updated++; // due to post increment
if (maxCount < updated) {
maxCount = updated;
currentMax = i;
}
}
Because this is an entertaining exercise, it seems we're all assuming that map iteration is cheap. While iterating a map is also O(N), it's much more expensive than iterating over a vector or array. So what's cheaper, iterating a possibly reduced size map, or doing a really basic if check that will trigger two assignments at some percentage? Both your solution and this one are O(N) assuming you change to use an unordered map. Right now you're not, so each insert is log(n) and actually I think switching to an unordered map will be your biggest gain at this point.
Your algorithm is pretty good. It's actually O(N Log N), because of the N std::map (a tree based map) insertions you're doing (Log N each). This dominates the time complexity of the algorithm, as the second phase is linear.
An improvement would be to use a hash map, giving you a linear algorithm overall.
Sort the array so you have...
{1,1,1,1,2,2,2,3,3,4,4}
Then have a currentValue variable and when the value doesn't match set it, when it does, increment the count... i.e. (pseudo code)
currentValue = 0;
currentCount = 0;
maxValue = 0;
maxCount = 0;
for(int value in array) {
if(currentValue == value) {
currentCount++;
} else {
// is this greater than max count
if(currentCount > maxCount) {
maxCount = currentCount;
maxValue = currentValue;
}
// reset values
currentValue = value;
currentCount = 0;
}
}
Now you have the value which occurred most in maxValue and the number of times it occurred in maxCount.
First, you should get rid of the invalid number -999. Just ask for map.empty() before continuing.
Then, I think it is invalid to increment elements in the map that do not exist before. I assume that a new member is created with unitialized (random) value, as there is no default constructor for int.
You can do something else:
map<int, int>::iterator it = m.find(i);
if (it != m.end())
m.second++;
if (m.second > mostTimes) {
// reset mostTimes and maxNumber = m.first here
}
} else {
m[i] = 1;
}
This operation is O(n) and therefore has the same time complexity class as iterating through the map again to find the max element (where in the worst case, all numbers in input are different and the map would have the same amount of members n than the input array). The difference though is that mostTimes and maxNumbers may get overwritten a lot of times and it may happen that they do not fit into CPU registers and a lot of RAM accesses happen. So probably doing the iteration afterwards is faster in practice.
linear iteration is required (as people have already mentioned), but given order is not important, you could fold the array? Doing so saves you potentially two map updates for elements that are the same, i.e.
int i = 0;
int j = sizeof(a)/sizeof(int);
for(;i < j;i++, j--)
{
if (a[i] == a[j])
{
update<map_t, 2>(m, a[i]);
}
else
{
update<map_t, 1>(m, a[i]);
update<map_t, 1>(m, a[j]);
}
}
// if array size is odd...
if (i == j)
update<map_t, 1>(m, a[i]);
here update is a simple function because I'm too lazy to type the same code...
template <typename M, int DEF>
void update(M& m, int v)
{
typename M::iterator it = m.find(v);
if (it != m.end())
it->second += DEF;
else
{
m.insert(pair<int, int>(v, DEF));
}
}
everything else remains the same, i.e. your code is good, and only slight improvements are possible...
精彩评论