Implementation of Map<string, string> in C
For some reason, I have to implement this myself , and can not use libs. To make it map fast, firstly, I map the key to an integer, and use that integer as an internal key. Then开发者_如何学JAVA I implement the Map, which gives me the mapping function. However, when I use the string key to compute the internal integer key, sometimes I get the same integer from different string. How can I solve the problem?
You cannot avoid this. There are more possible strings than integers, therefore hash collisions are imminent. Read up on hashmaps - it's a data structure that explicitly takes collisions into account and works around them.
A map data structure and "collision" cannot be separated. the way you started your implementation seems fine, here's how you should handle collisions :
Adding a new entry into the map
- calculate
hashcode
forkey
- compute
index
from hashcode (more or lessindex
=hashcode value
%size of keyset
) - if
keyset[index]
is not null- if keyset[index] != key (ie. for strings, use strcmp) increment
index
modulussize of keyset
, then goto 3
- if keyset[index] != key (ie. for strings, use strcmp) increment
- put
value
intoentryset[index]
Getting a value from the map
- calculate
hashcode
forkey
- compute
index
from hashcode (more or lessindex
=hashcode value
%size of keyset
) - if
keyset[index]
is not null- if keyset[index] != key (ie. for strings, use strcmp) increment
index
modulussize of keyset
, then goto 3
- if keyset[index] != key (ie. for strings, use strcmp) increment
- if
keyset[index]
is null return null - return
entryset[index]
Deleting an entry from the map
- calculate
hashcode
forkey
- compute
index
from hashcode (more or lessindex
=hashcode value
%size of keyset
) - if
keyset[index]
is not null- if keyset[index] != key (ie. for strings, use strcmp) increment
index
modulussize of keyset
, then goto 3
- if keyset[index] != key (ie. for strings, use strcmp) increment
- set
keyset[index]
andentryset[index]
to null
As you can see, you can put step 1 to 3 into a function int findIndexFromKey(Map *map, char *key);
and most of the work is done
** EDIT **
Of course, you also have to check if your map is not full before (or while) adding a new entry, otherwise you'll just loop undefinitely.
This is what is known as a collisions, but the simplest is to make each bucket in your Hashmap a list of items with the same hash. Then on a get all you have to do is iterate through the list until you find the item you are looking for.
精彩评论