Efficient algorithm to find first available name
I have an array containing names of items. I want to give the user the option to create items without specifying their name, so my program will have to supply a unique default name, like "Item 1".
The challenge is that the name has to be unique so i have to check all the array for that default name, and if there is an item with the same name i have to change the name to be "Item 2" and so on until i find an available name.
The obvious solution will be something like that:
String name = "Item ";
for (int i = 0; !isAvailable(name + i) ; i++);
My problem with that algorithm is that it runs at O(N^2).
I wonder if there is a known (or new) more efficient algorithm to solve this case.
In other words my question is开发者_高级运维 this: Is there any algorithm that finds the first greater-than-zero number that dose NOT exist in a given array, and runs at less that N^2?
You can certainly do it in O(N) time, N being the number of items in your array:
- One of "Item 1", "Item 2", ... "Item N+1" must be free, so create an array of N+1 flags.
- Traverse the items, and for each name if it is of form "Item k" with 0 < k <= N+1, set that flag.
- Scan the flag array for the first clear flag.
Additional memory requirement is N+1 bits, which certainly beats any data structure that actually stores all N names.
Yes, there is.
First sort the array. Then run through it and return the first element whose value is not equal to its index (plus 1). The sort is O(n log n), the final step is O(n), so the entire thing is O(n log n).
If you put all items into a hash table, you can do it in O(n) at the cost of some space and an additional O(1) step at creation of new items. Since each element needs to be visited, O(n) is clearly optimal.
I'd be interested to see if there's an O(n) way to do this, without using any "persistent" data structure like the hash table. (And assuming unbounded integers, otherwise a bucket sort could be used as an O(n) sorting algorithm).
Why not just keep track of the maximum number to date, and when you need a new one, increment it?
Insert all of the existing names into a hash table. Repeat your loop, but make isAvailable check the hash table. Assuming a decent hash, it's O(nh) where h is the cost of evaluating the hash.
You could try to do the following:
first:
- loop through the list, and get all numbered items, this is complexity N
- for every numbered item, put the item in a tree (in C++: std::map), this is complexity log(N)
So now you have built up a map with the used numbers, with complexity "N x log(N)"
Next, iterate to the tree and as soon you see a 'hole', use the number. Worst case is complexity N.
So in total, the complexity is N x log(N) + N, or simplified: N log(N).
If there will be only one user accessing this array, why not use the number of nanoseconds? This, of course, assumes that the user is much slower than a computer, which seems to be a safe assumption.
That way, you have a O(1) cost for determining a unique name.
A logarithmic-time approach, assuming that you never leave "holes" by deleting items:
// Inverse binary search for the last one.
int upperBound = 1;
while(isInUse(upperBound)) upperBound *= 2;
// Standard binary search for the end once we have a ballpark.
int lowerBound = upperBound / 2;
while(lowerBound < upperBound - 1)
{
int midpoint = (lowerBound + upperBound) / 2;
if (isInUse(midpoint))
lowerBound = midpoint;
else
upperBound = midpoint;
}
return upperBound;
If item numbers can be freed by deleting, nothing short of linear search will work unless you also keep a "free list" and pick from that.
精彩评论