
Finding a nonexisting key in a std::map

Is there a way to find a nonexisting key in a map?

I am using std::map<int,myclass>, and I want to automatically generate a key for new items. Items may be deleted from the map in different order from their insertion.

The myclass items may or may not be identical, so they cannot serve as keys by themselves.

During the run time of the program there is no limit to the number of items that are generated and deleted, so I cannot use a counter as a key.

An alternative data structure that has the same functionality and performance will do.

Edit

I am trying to build a container for my items, such that I can delete/modify items according to their keys and iterate over the items. The key value itself means nothing to me; however, other objects will store those keys for their internal usage.

The reason I cannot use an incremental counter is that during the life-span of the program there may be more than 2^32 (or theoretically 2^64) items, yet item 0 may theoretically still exist even after all other items are deleted.

It would be nice to ask std::map for the lowest-value unused key, so I can use it for new items, instead of using a vector or some other external storage for unused keys.


I'd suggest a combination of counter and queue. When you delete an item from the map, add its key to the queue. The queue then keeps track of the keys that have been deleted from the map so that they can be used again. To get a new key, you first check if the queue is empty. If it isn't, pop the top index off and use it, otherwise use the counter to get the next available key.
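
A minimal sketch of the counter-plus-queue idea, assuming 32-bit keys. The class name KeyRecycler and its member names are hypothetical, not from the question. The counter only grows when the queue is empty, so it can only overflow if more than 2^32 keys are alive at the same time.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Hypothetical helper: hands out keys, reusing released ones first.
class KeyRecycler {
public:
    std::uint32_t acquire() {
        if (!free_.empty()) {
            // Reuse the key that was released earliest.
            std::uint32_t key = free_.front();
            free_.pop();
            return key;
        }
        return next_++;  // nothing to recycle: take a fresh key from the counter
    }

    void release(std::uint32_t key) {
        assert(key < next_);  // only keys handed out earlier may be returned
        free_.push(key);
    }

private:
    std::uint32_t next_ = 0;          // next never-used key
    std::queue<std::uint32_t> free_;  // keys deleted from the map
};
```

Call acquire() before inserting into the map and release(key) right after erasing that key. The queue costs one entry per deleted key that has not been reused yet.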


Let me see if I understand. What you want to do is

look for a key. If not present, insert an element.

Items may be deleted.

Keep a counter and a vector. The vector holds the ids of the deleted items. When you are about to insert a new element, look for a key in the vector: if the vector is not empty, remove a key from it and use that; if it is empty, take one from the counter (counter++). However, if you never remove items from the map, you are just stuck with a counter.

Alternative: how about using the memory address of the element as a key?


I would say that in the general case, where the key can be any type allowed by map, this is not possible. Even telling whether an unused key exists requires some knowledge about the type.

For int keys, you can store a std::set of contiguous segments of unused keys (since the segments do not overlap, their natural ordering works: simply compare their starting points). When a new key is needed, take the first segment, cut off its first index, and put the rest back in the set (if the rest is not empty). When a key is released, check whether it has neighbouring segments in the set (a set lookup is O(log n)) and merge with them if so; otherwise simply put the segment [n,n] into the set.

This way you get the same order of time complexity and memory consumption as the map itself, regardless of the request history (because the number of segments cannot exceed map.size()+1).

Something like this:

#include <limits>
#include <set>
#include <stdexcept>
#include <utility>

class TKeyManager
{
public:
    TKeyManager()
    {
        // Initially every int is free: one segment covering the whole range.
        FreeKeys.insert(
          std::make_pair(
            std::numeric_limits<int>::min(),
            std::numeric_limits<int>::max()));
    }
    int AllocateKey()
    {
        if(FreeKeys.empty())
            throw std::runtime_error("no free keys left");
        const std::pair<int,int> freeSegment=*FreeKeys.begin();
        // Remove the old segment, then put back what is left of it.
        FreeKeys.erase(FreeKeys.begin());
        if(freeSegment.second>freeSegment.first)
            FreeKeys.insert(std::make_pair(freeSegment.first+1,freeSegment.second));
        return freeSegment.first;
    }
    void ReleaseKey(int key)
    {
        // Set elements are immutable, so merging means erase + reinsert.
        // The key must currently be allocated (not inside any free segment).
        int first=key;
        int last=key;
        // First free segment that starts after key (end() if there is none).
        std::set<std::pair<int,int>>::iterator right=
            FreeKeys.lower_bound(std::make_pair(key,key));
        if(right!=FreeKeys.begin())
        {//try to merge with left neighbour
            std::set<std::pair<int,int>>::iterator left=right;
            --left;
            if(left->second+1==key)
            {
                first=left->first;
                FreeKeys.erase(left);
            }
        }
        if(right!=FreeKeys.end() && right->first==key+1)
        {//try to merge with right neighbour
            last=right->second;
            FreeKeys.erase(right);
        }
        FreeKeys.insert(std::make_pair(first,last));
    }
private:
    std::set<std::pair<int,int>> FreeKeys;
};


Is there a way to find a nonexisting key in a map?

I'm not sure what you mean here. How can you find something that doesn't exist? Do you mean, is there a way to tell if a map does not contain a key?

If that's what you mean, you simply use the find function, and if the key doesn't exist it will return an iterator pointing to end().

if (my_map.find(555) == my_map.end()) { /* do something */ }

You go on to say...

I am using std::map, and I want to automatically generate a key for new items. Items may be deleted from the map in different order from their insertion. The myclass items may, or may not be identical, so they can not serve as a key by themself.

It's a bit unclear to me what you're trying to accomplish here. It seems your problem is that you want to store instances of myclass in a map, but since you may have duplicate values of myclass, you need some way to generate a unique key. Rather than doing that, why not just use std::multiset<myclass> and store the duplicates? When you look up a particular value of myclass, equal_range returns a pair of iterators delimiting all the instances of myclass that have that value. You'll just need to implement a comparison functor for myclass.
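
A small sketch of the multiset suggestion. Item, ItemLess, ItemSet and count_equal are hypothetical stand-ins, since the question does not show what myclass looks like; the assumption here is that items compare by a single integer field.

```cpp
#include <cstddef>
#include <iterator>
#include <set>
#include <string>

// Hypothetical stand-in for the asker's myclass.
struct Item {
    std::string name;
    int value;
};

// Comparison functor: orders items by value only, so two items with the
// same value count as duplicates regardless of their names.
struct ItemLess {
    bool operator()(const Item& a, const Item& b) const {
        return a.value < b.value;
    }
};

using ItemSet = std::multiset<Item, ItemLess>;

// Counts the stored items that compare equal to probe.
std::size_t count_equal(const ItemSet& items, const Item& probe) {
    // equal_range returns [first, second) covering all equivalent elements.
    auto range = items.equal_range(probe);
    return static_cast<std::size_t>(std::distance(range.first, range.second));
}
```

Note that this gives no stable handle to one particular instance, which the edit to the question says other objects need.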


Could you please clarify why you can not use a simple incremental counter as auto-generated key? (increment on insert)? It seems that there's no problem doing that.


  • Suppose you have decided how to generate non-counter-based keys and found that generating them in bulk is much more efficient than generating them one by one.
  • With this generator proven to be "infinite" and "stateful" (that is your requirement), you can create a second, fixed-size container holding, say, 1000 unused keys.
  • Supply your new map entries with keys from this container, and return keys to it for recycling.
  • Set some low "threshold" so that when the key container runs low, you refill it in bulk from the "infinite" generator.

The actual posted problem still exists: "how to make an efficient generator based on a non-counter". You may want to have a second look at the "infinity" requirement and check whether, say, a 64-bit or 128-bit counter can still satisfy your algorithms for some limited period of time, like 1000 years.
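
A rough sketch of the pool-with-threshold idea from the list above. KeyPool and its members are hypothetical names, and the generator here is only a plain 64-bit counter standing in for whatever "infinite" generator you end up with.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pool: keeps a small stock of unused keys, refilled in bulk.
class KeyPool {
public:
    KeyPool(std::size_t batch, std::size_t low_water)
        : batch_(batch), low_water_(low_water) {
        assert(batch > 0);  // a refill must produce at least one key
    }

    std::uint64_t take() {
        // Refill in bulk once the stock reaches the low threshold.
        if (stock_.size() <= low_water_) refill();
        std::uint64_t key = stock_.back();
        stock_.pop_back();
        return key;
    }

    // Return a key for recycling after its map entry is erased.
    void give_back(std::uint64_t key) { stock_.push_back(key); }

    std::size_t stock() const { return stock_.size(); }

private:
    // Stand-in generator (assumption): a 64-bit counter. A real
    // non-counter generator would replace this function.
    void refill() {
        for (std::size_t i = 0; i < batch_; ++i) stock_.push_back(next_++);
    }

    std::size_t batch_;
    std::size_t low_water_;
    std::uint64_t next_ = 0;
    std::vector<std::uint64_t> stock_;
};
```

Because returned keys go back into the same stock, the pool can grow past the batch size when many items are deleted; that is the same memory cost as the queue-based answers.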


Use uint64_t as the key type for the sequence or, if you think even that will not be enough, a 128-bit key:

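#include <cstdint>

struct sequence_key_t {
    uint64_t upper = 0;
    uint64_t lower = 0;
    // Pre-increment; carries from lower into upper on wrap-around.
    sequence_key_t& operator++() { if (++lower == 0) ++upper; return *this; }
    // Orders by upper first, then lower, as std::map requires.
    bool operator<(const sequence_key_t& o) const {
        return upper != o.upper ? upper < o.upper : lower < o.lower;
    }
};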

Like:

sequence_key_t global_counter;

std::map<sequence_key_t,myclass> my_map;

my_map.insert(std::make_pair(++global_counter, myclass()));

and you will not have any problems.


Like others, I am having difficulty figuring out exactly what you want. It sounds like you want to create an item if it is not found. std::map::operator[] ( const key_type& x ) will do this for you.

std::map<int, myclass> Map;
myclass instance1, instance2;

Map[5] = instance1; // key 5 is created if it does not exist yet
Map[6] = instance2;

Is this what you are thinking of?


Going along with other answers, I'd suggest a simple counter for generating the ids. If you're worried about being perfectly correct, you could use an arbitrary precision integer for the counter, rather than a built in type. Or something like the following, which will iterate through all possible strings.

void string_increment(std::string& counter)
{
    bool carry=true;
    for (size_t i=0;i<counter.size();++i)
    {
        unsigned char original=static_cast<unsigned char>(counter[i]);
        if (carry)
        {
            ++counter[i];
        }
        if (original>static_cast<unsigned char>(counter[i]))
        {
            carry=true;
        }
        else
        {
            carry=false;
        }
    }
    if (carry)
    {
        counter.push_back(0);
    }
}

e.g. so that you have:

std::string counter; // empty string
string_increment(counter); // now counter=="\x00"
string_increment(counter); // now counter=="\x01"
...
string_increment(counter); // now counter=="\xFF"
string_increment(counter); // now counter=="\x00\x00"
string_increment(counter); // now counter=="\x01\x00"
...
string_increment(counter); // now counter=="\xFF\x00"
string_increment(counter); // now counter=="\x00\x01"
string_increment(counter); // now counter=="\x01\x01"
...
string_increment(counter); // now counter=="\xFF\xFF"
string_increment(counter); // now counter=="\x00\x00\x00"
string_increment(counter); // now counter=="\x01\x00\x00"
// etc..


Another option, if the working set actually in the map is small enough, would be to use an incrementing key and then re-generate the keys when the counter is about to wrap. This solution would only require temporary extra storage. The map's performance would be unchanged, and the key generation would just be an if and an increment.

The number of items in the current working set would really determine if this approach is viable or not.
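
A sketch of what the re-generation step could look like, assuming 32-bit keys. The function name renumber is hypothetical. Since the question says other objects store the keys, the sketch returns an old-to-new mapping that the caller would have to propagate to them.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Rebuilds items with keys 0..size-1, keeping the relative order, and
// resets counter to the first unused key. Returns the old->new mapping.
template <typename V>
std::map<std::uint32_t, std::uint32_t>
renumber(std::map<std::uint32_t, V>& items, std::uint32_t& counter) {
    std::map<std::uint32_t, V> compact;  // the temporary extra storage
    std::map<std::uint32_t, std::uint32_t> old_to_new;
    std::uint32_t next = 0;
    for (auto& entry : items) {
        old_to_new[entry.first] = next;
        // Keys arrive in increasing order, so hinting at end() is cheap.
        compact.emplace_hint(compact.end(), next, std::move(entry.second));
        ++next;
    }
    items.swap(compact);
    counter = next;
    return old_to_new;
}
```

The caller would check the counter against its maximum value before each insert and call renumber only when it is about to wrap. The pass is linear in the number of live items.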


I liked Jon Benedicto's and Tom's answers very much. To be fair, the other answers that only used counters may have been the starting point.

Problem with only using counters

  • You always have to increment higher and higher; never trying to fill the empty gaps.
  • Once you run out of numbers and wrap around, you have to do log(n) iterations to find unused keys.

Problem with the queue for holding used keys

  • It is easy to imagine lots and lots of used keys being stored in this queue.

My Improvement to queues!

Rather than storing single unused keys in the queue, we store ranges of unused keys.

Interface

using Key = wchar_t; //In my case

struct Range
{
 Key first;
 Key last;

 size_t size() const {  return last - first + 1; } // const: called on set elements
};

bool operator< (const Range&,const Range&);
bool operator< (const Range&,Key);
bool operator< (Key,const Range&);

struct KeyQueue__
{
 public:
    virtual void addKey(Key)=0;
    virtual Key getUniqueKey()=0;
    virtual bool shouldMorph()=0;

 protected:
    Key counter = 0;
    friend class Morph;
};

struct KeyQueue : KeyQueue__
{
 public: 
    void addKey(Key)override;
    Key getUniqueKey()override;
    bool shouldMorph()override;

 private:
    std::vector<Key> pool;
    friend class Morph;
};

struct RangeKeyQueue : KeyQueue__
{
 public: 
    void addKey(Key)override;
    Key getUniqueKey()override;
    bool shouldMorph()override;

 private:
    boost::container::flat_set<Range,std::less<>> pool;
    friend class Morph;
};

void morph(KeyQueue__*);

struct Morph
{
 static void morph(const KeyQueue &from,RangeKeyQueue &to);
 static void morph(const RangeKeyQueue &from,KeyQueue &to);
};

Implementation

Note: a key being added is assumed not to be in the queue already.

// Assumes that Range is valid. first <= last
// Assumes that Ranges do not overlap
bool operator< (const Range &l,const Range &r)
{
 return l.first < r.first;
} 

// Assumes that Range is valid. first <= last
bool operator< (const Range &l,Key r)
{
 int diff_1 = l.first - r;
 int diff_2 = l.last - r;
 return diff_1 < -1 && diff_2 < -1;
} 

// Assumes that Range is valid. first <= last
bool operator< (Key l,const Range &r)
{ 
 int diff = l - r.first;
 return diff < -1;
} 

void KeyQueue::addKey(Key key)
{
 if(counter - 1 == key) counter = key;
 else pool.push_back(key);   
}

Key KeyQueue::getUniqueKey()
{
    if(pool.empty()) return counter++;

    else
    {
     Key key = pool.back();
     pool.pop_back();
     return key;
    }
}

bool KeyQueue::shouldMorph()
{
 return pool.size() > 10;
}


void RangeKeyQueue::addKey(Key key)
{
    if(counter - 1 == key) counter = key;

    else
    {
        auto elem = pool.find(key);
        
        if(elem == pool.end()) pool.insert({key,key}); 
        
        else // Expand existing range
        {
         Range &range = (Range&)*elem;

         // Note at this point, key is 1 value less or greater than range
         if(range.first > key) range.first = key;
         else range.last = key;
        }  
    }
}

Key RangeKeyQueue::getUniqueKey()
{
    if(pool.empty()) return counter++;

    else
    {
     Range &range = (Range&)*pool.begin();
     Key key = range.first++;

     if(range.first > range.last) // exhausted all keys in range
        pool.erase(pool.begin());

     return key;
    }
}

bool RangeKeyQueue::shouldMorph()
{
 return pool.size() == 0 || pool.size() == 1 && pool.begin()->size() < 4;  
}


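// Switches between the two queue representations.
// Caveat: obj is passed by value, so assigning to it below does not update
// the caller's pointer, and the old queue is never deleted. Taking the
// parameter as KeyQueue__*& (in the declaration as well) and deleting the
// old object would be needed for the swap to take effect.
void morph(KeyQueue__ *obj)
{
    if(KeyQueue *queue = dynamic_cast<KeyQueue*>(obj))
    {
     RangeKeyQueue *new_queue = new RangeKeyQueue();
     Morph::morph(*queue,*new_queue);
     obj = new_queue;
    }
    else if(RangeKeyQueue *queue = dynamic_cast<RangeKeyQueue*>(obj))
    {
     KeyQueue *new_queue = new KeyQueue();
     Morph::morph(*queue,*new_queue);
     obj = new_queue;
    }
}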

void Morph::morph(const KeyQueue &from,RangeKeyQueue &to)
{
    to.counter = from.counter;
    for(Key key : from.pool) to.addKey(key);
}

void Morph::morph(const RangeKeyQueue &from,KeyQueue &to)
{
    to.counter = from.counter;

    for(Range range : from.pool)
    while(range.first <= range.last) 
       to.addKey(range.first++);
}

Usage:

int main()
{
    std::vector<Key> keys;
    KeyQueue__ *keyQueue = new KeyQueue();

    srand(time(NULL));
    bool insertKey = true;

    for(int i=0; i < 1000; ++i)
    {
        if(insertKey || keys.empty()) // never pick from an empty vector (rand() % 0)
        {
         Key key = keyQueue->getUniqueKey();
         keys.push_back(key); 
        }
        else
        {
            int index = rand() % keys.size();
            Key key = keys[index];
            keys.erase(keys.begin()+index);
            keyQueue->addKey(key);
        }  

        if(keyQueue->shouldMorph())
        {
         morph(keyQueue); 
        } 

        insertKey = rand() % 3; // more chances of insert
    }
}