Code review: reverse search for the </body> tag in a non-null-terminated char string [closed]
Closed 11 years ago.
src is a non-null-terminated char string whose length is data_len.
I want to start from the end of this array and find the first occurrence of the HTML </body> tag, that is, the last one in the buffer.
find_pos should hold the position of the </body> tag within src.
Does the code below look correct to you?
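    #include <stddef.h>
    #include <strings.h>   /* strncasecmp (POSIX) */

    char *strrcasestr_len(const char *hay, size_t haylen, const char *ndl, size_t ndllen)
    {
        size_t i;

        /* A needle longer than the haystack cannot match; checking first
           also avoids size_t wraparound in haylen - ndllen. */
        if (ndllen > haylen)
            return NULL;

        /* Try every start position, from the last possible one down to 0. */
        for (i = haylen - ndllen + 1; i-- > 0; ) {
            if (!strncasecmp(&hay[i], ndl, ndllen))
                return (char *)&hay[i];
        }
        return NULL;
    }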
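A backward search of this kind has to decide what happens when the needle is longer than the haystack: haylen - ndllen is a size_t subtraction, so it wraps around to a huge value rather than going negative. The sketch below isolates that guard. last_start_index is a hypothetical helper introduced only for illustration, not part of the code under review.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the highest index at which a needle of
 * length ndllen could start inside a buffer of length haylen, or
 * SIZE_MAX when the needle cannot fit at all. */
static size_t last_start_index(size_t haylen, size_t ndllen)
{
    if (ndllen > haylen)
        return SIZE_MAX;        /* guard: no valid start position */
    return haylen - ndllen;     /* safe: cannot wrap here */
}
```

Using SIZE_MAX as the "does not fit" marker is just one possible convention; returning early from the search function, before any subtraction happens, works equally well.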
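This should do it, and very quickly. It only inspects every seventh byte, because any occurrence of the 7-character tag must cover exactly one of those positions.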
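    #include <cstddef>
    #include <strings.h>   // strncasecmp (POSIX); strnicmp is the Windows-only equivalent

    char const* find_body_closing_tag( char const* const src, size_t const data_len )
    {
        static char table[256];   // 1-based position of each character in "</body>", 0 if absent
        static bool inited;       // note: this lazy initialization is not thread-safe
        if (!inited) {
            table['<'] = 1;
            table['/'] = 2;
            table['b'] = table['B'] = 3;
            table['o'] = table['O'] = 4;
            table['d'] = table['D'] = 5;
            table['y'] = table['Y'] = 6;
            table['>'] = 7;
            inited = true;
        }
        if (data_len < 7) return 0;   // too short to contain the tag
        for( size_t i = data_len - 7; ; i -= 7 ) {
            char const* const p = src + i;
            // Cast to unsigned char so bytes >= 0x80 cannot produce a negative index.
            size_t const offset = table[(unsigned char)*p];
            // i >= offset-1 keeps the candidate start from falling before src.
            if (offset && i >= offset - 1 &&
                0 == strncasecmp(p - (offset-1), "</body>", 7)) return p - (offset-1);
            if (i < 7) break;         // stop before i would wrap around
        }
        return 0;
    }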
Another very fast approach would be using SIMD to test 16 consecutive characters against '>' at once (and this is what strrchr or memrchr ought to be doing).
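The "find '>' first, then verify" idea can be sketched in plain C as follows. This is a scalar illustration only, not a SIMD implementation: last_byte is a hypothetical stand-in for memrchr (a GNU extension), and find_last_body_close is likewise a name made up for this example. A vectorized memrchr, where available, could presumably replace last_byte without changing the rest.

```c
#include <stddef.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical helper: scalar stand-in for memrchr. Returns a pointer to
 * the last occurrence of byte c in buf[0..len), or NULL if there is none. */
static const char *last_byte(const char *buf, size_t len, char c)
{
    while (len > 0) {
        if (buf[--len] == c)
            return buf + len;
    }
    return NULL;
}

/* Hypothetical function: finds the last "</body>" (case-insensitive) in
 * src[0..data_len) by locating '>' candidates from the end and checking
 * the six bytes in front of each one. No null terminator is required. */
static const char *find_last_body_close(const char *src, size_t data_len)
{
    static const char tag[] = "</body>";
    const size_t taglen = sizeof tag - 1;   /* 7 */
    size_t len = data_len;

    for (;;) {
        const char *gt = last_byte(src, len, '>');
        if (gt == NULL)
            return NULL;                      /* no more candidates */
        size_t end = (size_t)(gt - src) + 1;  /* one past this '>' */
        if (end >= taglen &&
            strncasecmp(src + end - taglen, tag, taglen) == 0)
            return src + end - taglen;        /* start of the tag */
        len = end - 1;                        /* keep looking before this '>' */
    }
}
```

Every comparison stays inside src[0..data_len), so the buffer never needs a terminator. How this compares in speed with the stride-7 table approach depends on how many '>' characters the document contains near its end.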