开发者

Removing from a HashTable C++

How do u remove from an array based hashtable? I need to prepare to remove several symbols from my table. If i dump what I want to delete in an character array of a fixed size, how would find the ones i would "potentially" delete?

bool hashmap::get(char const * const symbol, stock& s) const
{
    int hashVal = this->hashStr( symbol );
    int initialHash = -1;

    while ( hashTable[hashVal].m_symbol != NULL )
    {   // try to find a match for the stock associated with the symbol.

        if ( initialHash == -1 )
        {
             initialHash = hashVal;
        }
        if ( strcmp( hashTable[hashVal].m_symbol, symbol ) == 0 )
        {
            s = &hashTable[hashVal];
            return true;
        }
        ++hashVal %= maxSize;
    }
    if ( hashTable[hashVal].m_symbol == NULL || hashVal == initialHash )
    {
         return false;
    }
    else return true;
}

bool hashmap::put(const stock& s, int& usedIndex, int& hashIndex, int& symbolHash)
{
    hashIndex = this->hashStr( s.m_symbol ); // Get remainder, Insert at that index.
    symbolHash = (int&)s.m_symbol;
    usedIndex = hashIndex;

    while ( hashTable[hashIndex].m_symbol != NULL )
    {
        ++usedIndex %= maxSize; // if necessary wrap index around
        if ( hashTable[usedIndex].m_symbol == NULL )
        {
            hashTable[usedIndex] = s;
            return true;
        }
        else if ( strcmp( hashTable[usedIndex].m_symbol , s.m_symbol ) == 0 )
        {
            return false; // prevent duplicate entry
        }
    }
    hashTable[hashIndex] = s; // insert if no collision 
    return true;
}

bool hashmap::remove(char const * const symbol)
{
    int hashVal = this->hashStr( symbol );
    int initialHash = -1;

    while ( hashTable[hashVal].m_symbol != NULL  )
    {
        if ( initialHash == -1 )
        {
            initialHash = hashVal;
        }
        if ( strcmp( hashTable[hashVal].m_symbol, symbol ) == 0 )
        {
            hashTable[hashVal].m_symbol = NULL;
            return true;
        }
        ++hashVal %= maxSize; // wrap around if needed
    }   // go to the next cell if not found

    if ( hashVal != initialHash && hashTable[hashVal].m_symbol != NULL )
    {
        return true;
    }
    return false;
}

int hashmap::hashStr(char const * const str)
{   
    size_t length = strlen( str );
    int hash = 0;
    for ( unsigned i = 0; i &l开发者_运维技巧t; length; i++ )
    {
         hash = 31 * hash + str[i];
    }
    return hash % maxSize;
 }


For the type of hash collision you're using, deletion is problematic at best. Then again, this style of hashing doesn't do anything else very well either -- it leads to significant clustering even at relatively low loading ratios.

If you want to support deletion, you probably want to use chaining to resolve collisions. In this case, removal is simple and straightforward. You hash to find the right linked list, search for the item in the linked list, and (if you find it) remove it from the linked list.

Edit: Removal is more difficult than may be immediately obvious. For example, assume you want to remove something that hashed to position N in the table, but you find it at position N+3. There could, however, be something at position N+4 that (just for example) may have originally hashed to position N-1. So, if you insist on trying to delete from a table like this, you have to walk from wherever this hashed to all the way to the next empty spot in the table, and re-hash each of those item to figure out where it originally came from, and assure that if you remove something, every one of those chains remains intact from its original hash point to wherever it ended up.

It is possible to make this work -- but it's not easy. Given how poorly linear probing works in general, it's just not worth it!


People say linear probing is terrible, but it's not.

If you want a fast hash table, linear probing may be for you. For example, if you keep the load factor below, say 2/3, then you have to examine 3.5 elements on average for a successful search. If your equals comparison operation can be done without chasing another pointer, then that can be fast, because those 3.5 elements cost you only one cache miss. (So if your key is an integer, or a short std::string it can be fast).

Compare to a chained table, which has two cache misses before you even compare your lookup against the first stored key.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜