Why use integers for tokens?
Is there any good reason to use numbers to identify tokens nowadays? I am following Crafting a Compiler.
The code the author presents is here:
public class Token {
    public final static int ID = 0, FLTDCL = 1, INTDCL = 2, PRINT = 3,
            ASSIGN = 4, PLUS = 5, MINUS = 6, EOF = 7, INUM = 8, FNUM = 9;
    public final static String[] token2str = new String[] { "id", "fltdcl",
            "intdcl", "print", "assign", "plus", "minus", "$", "inum", "fnum" };
    public final int type;
    public final String val;

    public Token(int type) {
        this(type, "");
    }

    public Token(int type, String val) {
        this.type = type;
        this.val = val;
    }

    public String toString() {
        return "Token type\t" + token2str[type] + "\tval\t" + val;
    }
}
Instead of using the ugly arrays, wouldn't it be smarter to modify the constructors to accept strings for the type variable instead of integers? Then we could get rid of
    public final static int ID = 0, FLTDCL = 1, INTDCL = 2, PRINT = 3,
            ASSIGN = 4, PLUS = 5, MINUS = 6, EOF = 7, INUM = 8, FNUM = 9;
Or are the integers needed later, so that using strings instead would be worse?
Using integers has several benefits:
- It's faster, since comparing two integers takes (in your average compiled language) only a few instructions, while comparing two strings takes O(n) time in the worst case, where n is the length of the shorter string. Compilers need this extra bit of speed.
- In C and C++ (and in Java before version 7), you can switch on an int but not on a string.
- Mistyping a token name will be a compile-time error instead of a hard-to-debug runtime error.
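To make the last two points concrete, here is a minimal sketch of what a parser might do with the integer constants. The class name TokenKinds and the helper category() are made up for illustration and are not from the book; only the constant names and values are copied from the question.

```java
// Hypothetical demo class; constant names and values mirror the Token class in the question.
final class TokenKinds {
    static final int ID = 0, FLTDCL = 1, INTDCL = 2, PRINT = 3,
            ASSIGN = 4, PLUS = 5, MINUS = 6, EOF = 7, INUM = 8, FNUM = 9;

    // Illustrative helper (an assumption, not the author's code):
    // classify a token type with a switch over the int constants.
    static String category(int type) {
        switch (type) {
            case FLTDCL:
            case INTDCL:
                return "declaration";
            case PLUS:
            case MINUS:
                return "operator";
            case INUM:
            case FNUM:
                return "number";
            case EOF:
                return "end of input";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(category(PLUS)); // prints "operator"
        System.out.println(category(ID));   // prints "other"
        // category(PLSU);  // a mistyped constant does not compile: "cannot find symbol"
        // With string types, a typo such as "plsu".equals(type) compiles fine
        // and simply never matches at run time.
    }
}
```

Each case label here is a compile-time constant, so a misspelled name is rejected by the compiler rather than surfacing later as a token that is silently never recognized.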