开发者

Building symbol table using hash table

I'm trying to build a symbol table using hash table. The general idea is

int alpha;
2 int beta;
3 alpha = 0; // the alpha declared in line 1
4 beta = 0; // the beta declared in line 2
5 gamma = 0; // Error! gamma hasn't been declared.
6 {
7 int beta; // This beta shadows the one declared in line 2.
8 int gamma;
9 alpha = 0; // the alpha declared in line 1
10 beta = 0; // the beta declared in line 7
11 gamma = 0; // the gamma declared in line 8
12 }

and so on. You only c开发者_如何学JAVAan use vector, list, stack, and queue library and try to make it as fast as it can. My idea was on each scope, I declare hashtable in the list and save all the information to that table, and whenever I have new scope, I push_back new hashtable into the list. But it seems like this approach is very slow when the program is looking for on item that is out of scope far far away since you have to look for each scope to find the item.

Do you guys have any idea to implement this scope that is way faster than this? It should be way faster cause my implementation is slower than the "slow version they provided"

Thanks alot!


Have just one hash table for everything, and a stack of context objects. Push a new object on the stack whenever you see a '{'. When you shadow a variable, remember the old symbol in your context object and just overwrite it in the hash table. When you see a '}' and pop the context, restore the symbols you have remembered there.


When I did this, I used the hash table for symbols, not for entities. For each symbol, I had a list of entities. Each entity was in two lists, one from the symbol, and the second from the scope. The list from the symbol was managed like a stack: when I accessed a symbol, the entity in scope was always the first in the list. When I left a scope, I walked its list of entities, and removed them from the list of symbols.

This was some time ago, before the STL (and even before C++); I implemented everything by hand, and used invasive chaining for the lists, with the algorithm so designed that I could remove an element from a list without knowing where the list itself was. (This is relatively easy with hand written double linked lists.) Today, with the STL (and with the importance of locality of access), I'd probably simply put a pointer to the head of the list in each entity. With the STL, and considerations of locality, I'd probably use std::vector for the list for each entity (the symbol table thus being a map of std::string to std::vector). Symbol table lookup, once the entry is found, is simply calling back() on the vector (after checking that it isn't empty, if you don't purge the entry when the vector becomes empty). When you create new entity, in a new scope, you call push_back on the vector, and save the address of the vector in an entry for the scope (a std::stack of std::vector<std::vector<Entity>*>); when leaving scope, you iterate over the vector at the top of this stack, calling pop_back on all of its entries, then pop it off the stack. (And obviously, when entering a new scope, you push a new empty vector onto the scope stack.)

Maintaining the double list means that for all operations, you know exactly which element to access, without having to iterate over anything; there's never any access to an entity to see if it's the one you're looking for. In my day, it was fairly straightforward, but a lot of code, requiring considerable care, and my implemention probably wouldn't be that fast today, because of poor locality. Today, with the STL, you need a lot less code, and by using std::vector for all of the lists, you get slightly better locality; you still have to be careful, so that you don't end up saving iterators which will be invalidated. (But implemented as described above, you shouldn't need to; all accesses are always to the last element of the vector, so saving the address of the vector itself is sufficient. Provided, of course, that your hash implementation doesn't move the vectors.)


This is an OK idea. In most languages, scopes are rarely nested very deeply, perhaps an average of 3 or 4 nested scopes. So I wouldn't worry about things that are "far, far away", as these will be pathological cases.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜