Searching for 2 consecutive hex values in a char array of a file
I've read a file into an array of characters using fread. Now I want to search that array for two consecutive hex values, namely FF followed by D9 (it's a JPEG marker signifying end of file). Here is the code I use to do that:
char* searchBuffer(char* b) {
    char* p1 = b;
    char* p2 = ++b;
    int count = 0;
    while (*p1 != (unsigned char)0xFF && *p2 != (unsigned char)0xD9) {
        p1++;
        p2++;
        count++;
    }
    count = count;
    return p1;
}
Now I know this code works if I search for hex values that don't include 0xFF (e.g. 4E followed by 46), but every time I try searching for 0xFF it fails. When I don't cast the hex values to unsigned char, the program doesn't enter the while loop; when I do, the program goes through all the chars in the array and doesn't stop until I get an out-of-bounds error. I'm stumped, please help.
Ignore count, it's just a variable that helps me debug.
Thanks in advance.
Why not use memchr() to find potential matches?
Also, make sure you're dealing with promotions of potentially signed types (char may or may not be signed). Note that while 0xff and 0xd9 have the high bit set when looked at as 8-bit values, they are non-negative integer constants, so there is no 'sign extension' that occurs for them:
char* searchBuffer(char* b) {
    unsigned char* p1 = (unsigned char*) b;
    int count = 0;
    for (;;) {
        /* find the next 0xff char */
        /* note - this highlights that we really should know the size */
        /* of the buffer we're searching, in case we don't find a match */
        /* at the moment we're making it up to be some large number */
        p1 = memchr(p1, 0xff, UINT_MAX);
        if (!p1) {
            /* no 0xff within the made-up limit, return NULL */
            break;
        }
        if (*(p1 + 1) == 0xd9) {
            /* found the 0xff 0xd9 sequence */
            break;
        }
        p1 += 1;
    }
    return (char *) p1;
}
Also, note that you really should be passing in some notion of the size of the buffer being searched, in case the target isn't found.
Here's a version that takes a buffer size parameter:
char* searchBuffer(char* b, size_t siz) {
    unsigned char* p1 = (unsigned char*) b;
    unsigned char* end = p1 + siz;
    for (;;) {
        /* find the next 0xff char */
        p1 = memchr(p1, 0xff, end - p1);
        if (!p1) {
            /* sequence not found, return NULL */
            break;
        }
        if (((p1 + 1) != end) && (*(p1 + 1) == 0xd9)) {
            /* found the 0xff 0xd9 sequence */
            break;
        }
        p1 += 1;
    }
    return (char *) p1;
}
You are falling foul of integer promotions. Both operands of != (and similar operators) are promoted to int before the comparison. A signed char holding the byte FF has the value -1, and it is still -1 after promotion, whereas (unsigned char)0xFF promotes to 255. So this:
    *p1 != (unsigned char)0xFF
is equivalent to:
    (int)*p1 != (int)(unsigned char)0xFF
On your platform, char is evidently signed, in which case *p1 can never take on the value 255, so the comparison is always true.
So try casting *p1 as follows:
    (unsigned char)*p1 != 0xFF
Alternatively, you could have the function take unsigned char arguments instead of char, and avoid all the casting.
[Note that on top of all of this, your loop logic is incorrect, as pointed out in various comments.]
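To see the promotion at work, here is a small sketch. The function names naive_is_ff and cast_is_ff are made up for illustration and are not part of the original code. The first function gives different answers depending on whether char is signed on your platform, which is exactly the problem; the second one should behave the same everywhere.

```c
/* Compares a plain char against 0xFF the way the original loop does.
   The promotion to int is written out explicitly: where char is signed,
   the byte FF becomes -1, which never equals 255. */
int naive_is_ff(char c) {
    int promoted = c;
    return promoted == 0xFF;
}

/* Converts to unsigned char first, so the byte FF becomes 255
   regardless of whether plain char is signed. */
int cast_is_ff(char c) {
    int promoted = (unsigned char)c;
    return promoted == 0xFF;
}
```

On a platform where char is signed, naive_is_ff((char)0xFF) returns 0 while cast_is_ff((char)0xFF) returns 1. For bytes below 0x80, such as 4E, both functions agree, which is why the search appeared to work for those values.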
4E promotes to a positive int, but where char is signed, *p1 holds a negative value for the byte FF. It stays negative (-1) after promotion to int, so it never compares equal to 0xFF (255).
You need to make p1 point to unsigned char.
You can write the code a lot shorter as:
char* searchBuffer(const char* b) {
    while (*b != '\xff' || *(b+1) != '\xd9') b++;
    return (char*)b; /* cast needed: b is const char*, the return type is not */
}
Also note the function will cause a segmentation fault (or worse, return invalid results) if b does not, in fact, contain the bytes FFD9.
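Use void *memmem(const void *haystack, size_t haystacklen, const void *needle, size_t needlelen); which is easy to use. It is declared in string.h, but it is a GNU/BSD extension rather than standard C, so with glibc you need to define _GNU_SOURCE before including the header.
#define _GNU_SOURCE /* needed for memmem() with glibc */
#include <string.h>

char* searchBuffer(char* b, int len)
{
    unsigned char needle[2] = {0xFF, 0xD9};
    char * c;
    /* returns NULL if the FF D9 sequence is not in the first len bytes */
    c = memmem(b, len, needle, sizeof(needle));
    return c;
}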
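If memmem() isn't available on your platform, a fallback can be built from the standard memchr() and memcmp(). The following is only a sketch under that assumption: find_bytes is a hypothetical name, not a library function, and it mirrors memmem()'s argument order for convenience.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the first occurrence of
   needle in haystack, or NULL if it does not occur. */
void *find_bytes(const void *haystack, size_t haystack_len,
                 const void *needle, size_t needle_len) {
    const unsigned char *h = haystack;
    const unsigned char *n = needle;

    if (needle_len == 0) return (void *)haystack;
    if (haystack_len < needle_len) return NULL;

    /* number of positions at which a full match could still start */
    size_t remaining = haystack_len - needle_len + 1;
    while (remaining > 0) {
        /* jump to the next candidate for the needle's first byte */
        const unsigned char *hit = memchr(h, n[0], remaining);
        if (hit == NULL) return NULL;
        if (memcmp(hit, n, needle_len) == 0) return (void *)hit;
        /* skip past this candidate and keep searching */
        remaining -= (size_t)(hit - h) + 1;
        h = hit + 1;
    }
    return NULL;
}
```

Because the comparison goes through unsigned char and memcmp(), the signedness of plain char never comes into play. Calling find_bytes(b, len, needle, sizeof(needle)) with the FF D9 needle above should behave like the memmem() call.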