
Portable, user-defined character classification in C89 using a lookup table: would you do this?

static const int class[UCHAR_MAX] =
{ [(unsigned char)'a'] = LOWER, /* macro value classifying the character */
  [(unsigned char)'b'] = LOWER,
  /* ... */
};

This is just an idea. Is it a bad one?


Designated initializers are in C99, not C89. They also exist as a GCC extension for C89, but will not be portable.

Other than that, the use of lookup tables is a common way to handle classification of a small number of objects quickly.

Edit: One correction though: the size of the array should be UCHAR_MAX+1, because the valid indices run from 0 through UCHAR_MAX inclusive.
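If strict C89 portability is the goal, one option is to fill the table at run time instead of using designated initializers. The following is only a minimal sketch under assumptions: the names char_class, class_table, mark, init_class_table and classify are made up for illustration, and UNCLASSIFIED is deliberately 0 so that the zero-initialized static array starts out fully unclassified. Spelling the characters out in string literals avoids assuming that letters are contiguous in the execution character set.

```c
#include <limits.h>

/* Hypothetical class values; UNCLASSIFIED must be 0 to match the
   implicit zero-initialization of static storage. */
enum char_class { UNCLASSIFIED = 0, DIGIT, UPPER, LOWER };

/* One entry per possible unsigned char value: indices 0..UCHAR_MAX. */
static unsigned char class_table[UCHAR_MAX + 1];
static int class_table_ready = 0;

/* Assign class c to every character in the string chars. */
static void mark(const char *chars, enum char_class c)
{
    for (; *chars != '\0'; ++chars)
        class_table[(unsigned char)*chars] = (unsigned char)c;
}

static void init_class_table(void)
{
    mark("0123456789", DIGIT);
    mark("ABCDEFGHIJKLMNOPQRSTUVWXYZ", UPPER);
    mark("abcdefghijklmnopqrstuvwxyz", LOWER);
    class_table_ready = 1;
}

/* Look up the class of ch, building the table on first use. */
static enum char_class classify(int ch)
{
    if (!class_table_ready)
        init_class_table();
    return (enum char_class)class_table[(unsigned char)ch];
}
```

With this sketch, classify('a') yields LOWER and classify(' ') yields UNCLASSIFIED. The lazy initialization is not thread-safe; calling init_class_table() once at program start sidesteps that.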


BTW, GCC's designated initializer extensions allow for

static const int class[] = {
    [0 ... UCHAR_MAX] = UNCLASSIFIED,
    [(unsigned)'0' ... (unsigned)'9'] = DIGIT,
    [(unsigned)'A' ... (unsigned)'Z'] = UPPER,
    [(unsigned)'a' ... (unsigned)'z'] = LOWER,
 };

initializers applying to ranges of indices, with later initializations overriding earlier ones.

This is very non-standard, though: range designators are in neither C89/C90 nor C99.


Unfortunately, that is not portable in C89/90.

$ gcc -std=c89 -pedantic test.c -o test
test.c:4: warning: ISO C90 forbids specifying subobject to initialize
test.c:5: warning: ISO C90 forbids specifying subobject to initialize


Aside from using int rather than unsigned char for the element type (and thereby wasting 768 bytes, assuming a 4-byte int and 256 entries), I consider this a very good idea and implementation. Keep in mind that it depends on C99 features, so it won't work with old C89/C90 compilers.
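As a rough illustration of the size point, here is a C99 sketch of the same table with unsigned char elements. The enum values and the name class_small are assumptions for the example, and only a few entries are filled in; elements without a designator are implicitly zero, which is why UNCLASSIFIED is 0 here.

```c
#include <limits.h>

/* Hypothetical class values; UNCLASSIFIED is 0 so that entries
   without an explicit designator default to it. */
enum { UNCLASSIFIED = 0, DIGIT, UPPER, LOWER };

/* C99 designated initializers; one byte per entry instead of sizeof(int). */
static const unsigned char class_small[UCHAR_MAX + 1] = {
    [(unsigned char)'0'] = DIGIT,
    [(unsigned char)'A'] = UPPER,
    [(unsigned char)'a'] = LOWER,
    [(unsigned char)'b'] = LOWER
};
```

Here sizeof class_small is UCHAR_MAX + 1 bytes, which is 256 on platforms with 8-bit chars, compared with 1024 bytes for the int version when int is 4 bytes wide.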

On the other hand, simple conditionals should be about as fast and much smaller in code size, but they can only represent certain natural classes (such as contiguous ranges) efficiently.

#define is_ascii_letter(x) (((unsigned)(x)|32)-97<26)
#define is_digit(x) ((unsigned)(x)-'0'<10)

etc.
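The macros above rely on unsigned wraparound: subtracting the lower bound turns any value below the range into a very large unsigned number, so a single comparison covers both ends. A small sketch of the same trick written as functions follows. The helper names in_range, is_ascii_upper and is_ascii_hex_digit are hypothetical, and the hex-digit check assumes ASCII, where setting bit 5 (| 32) maps 'A'..'F' onto 'a'..'f'.

```c
/* Hypothetical helper: nonzero when lo <= x <= hi, using one comparison.
   Values below lo wrap around to large unsigned numbers and fail the test. */
static int in_range(int x, int lo, int hi)
{
    return (unsigned)x - (unsigned)lo <= (unsigned)hi - (unsigned)lo;
}

/* Assumes 'A'..'Z' are contiguous, as they are in ASCII. */
static int is_ascii_upper(int x)
{
    return in_range(x, 'A', 'Z');
}

/* Assumes ASCII: x | 32 folds upper-case letters onto lower-case ones. */
static int is_ascii_hex_digit(int x)
{
    return in_range(x, '0', '9') || in_range(x | 32, 'a', 'f');
}
```

For example, in_range('/', '0', '9') is false because '/' sits just below '0' in ASCII and the subtraction wraps around, while is_ascii_hex_digit('F') is true.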
