storing random values in arrays considering duplicates
This a new version of this post in order to isolate the programming question from the probability question.
I want to store a number of e.g. 25 randomly generated numbers between 1 and 365 in an array. But i need to keep track of duplicates. Here's how i was thinking of doing it:
create 4 arrays: a master array, an array for 2 duplicates, an array for 3 duplicates and one for more than 3 duplicates
开发者_Go百科add each generated number one by one into the master array. But before doing so, loop through the array to see if it's in there already. If so, add it to the second array, but before doing so repeat the above process and so on
at the end of the process i could count the non-null values in each array to know how many unique numbers i have, how many came up twice etc
it doesn't seem to be a very efficient algorithm. Any suggestions to improve it?
Can my suggested approach be considered to be Big O(n) i.e. linear?
Why are you using arrays? A HashMap or other map structure would seem to make more sense. Here's how I would do it.
- Instantiate a new, empty hashmap from birthdays to integers
- Generate a random birthday.
- Check if the birthday is in the hashmap. If it isn't, add it with a value of '1'. It it is, increment the value at that birthday.
Now you can get the number of unique dates generated by the number of keys in the hashmap, and any information about the number of duplicates from the values.
Here is how I think the problem can be solved.
- Have two arrays: a master one containing the birthdays and one containing the number of times it has repeated.
- Generate a random birthday.
- Loop through the main array and see if it is already there.
- If it is not there, add it to the main array and in the same index position, store 1 in the other array. ( If you add the birthday to master[3], store 1 in number[3])
- If it is already there, add one to the corresponding index in the other array.
- Now your master array will contain the generated birthdays and the corresponding indexes in the other array will have the number of times it is repeated. (master[0] has a date and number[0] has the number of times the date is repeated).
Hope this helps.
in ruby:
birthdays = {}.tap{|h| h.default = 0}
25.times { birthdays[rand(364)+1] += 1 }
puts birthdays.keys # all the uniq birthdays
puts birthdays.inspect # all birthdays with counts
idea is to use hash or fixed size array where keys correspond to birthdays and values are how many times that birthday was selected
problem is that you will lose order of birthdays as they are just keys in hash/array but if original order is not important you can always get random sequence of birthdays by doing
birthdays.keys.shuffle
Depending on how many times you will need to generate the random list you might consider creating a list of all the numbers and then using a shuffle sort to permute them, you can then pull consecutive blocks off the array which will be both in random order and unique. So in the worst case the performance would be O(size all numbers)+O(size of numbers needed), and best case is O(size of numbers needed). An example:
private int[] numbers;
private int numberIndex=Integer.MAX_VALUE;
private void initNumbers(){
int[] numbers = new int[365];
for(int i=0;i<365;i++){
numbers[i] = i+1;
}
shuffleSort(numbers);
numberIndex = 0;
}
public int[] getRandom(int howMany){
if(numberIndex > numbers.length-howMany){
initNumbers();
}
int[] ret = new int[howMany];
for(int i=0;i<howMany;i++){
ret[i] = numbers[numberIndex++];
}
return ret;
}
private void shuffleSort(int[] numbers){
Random r = new Random();
int tmp=0, swap=0;
for(int i=numbers.length-1;i>0;i--){
swap = r.nextInt(i);//random number between 0 and i
tmp = numbers[i];
numbers[i] = numbers[swap];
numbers[swap] = tmp;
}
}
}
精彩评论