开发者

Can I please get some feedback for this strcmp() function I implemented in C?

I'm learning C.

I find I learn programming well when I try things and received feedback from established programmers in the language.

I decided to write my own strcmp() function, just because I thought I could :)

int strcompare(char *a, char *b) {
    while (*a == *b && *a != '\0') {
        a++;
        b++;
    }
    return *a - *b;
}

I was trying to get it to work by increment开发者_开发百科ing the pointer in the condition of the while but couldn't figure out how to do the return. I was going for the C style code, of doing as much as possible on one line :)

Can I please get some feedback from established C programmers? Can this code be improved? Do I have any bad habits?

Thanks.


If you want to do everything in the while statement, you could write

while (*a != '\0' && *a++ == *b++) {}

I'm not personally a huge fan of this style of programming - readers need to mentally "unpack" the order of operations anyway, when trying to understand it (and work out if the code is buggy or not). Memory bugs are particularly insidious in C, where overwriting memory one byte beyond or before where you should can cause all sorts of inexplicable crashes or bugs much later on, away from the original cause.

Modern styles of C programming emphasize correctness, consistency, and discipline more than terseness. The terse expression features, like pre- and post-increment operations, were originally a way of getting the compiler to generate better machine code, but optimizers can easily do that themselves these days.

As @sbi writes, I'd prefer const char * arguments instead of plain char * arguments.


  1. The function doesn't change the content of a and b. it should probably announce that by taking pointers to const strings.
  2. Most C styles are much terser than many other languages' styles, but don't try to be too clever. (In your code, with several conditions ANDed in the loop conditions, I don't think there's way to put incrementing in there, so this isn't even a question of style, but of correctness.)


I don't know since when putting as much as possible is considered as C-style... I rather associate (obfuscated) Perl with that..

Please DO NOT do this. The best thing to do is one command per line. You will understand why when you try to debug your code :)

To your implementation: Seems quite fine to me, but I would put in the condition that *b is not '\0' either, because you can't know that a is always bigger than b... Otherwise you risk reading in unallocated memory...


You may find this interesting, from eglibc-2.11.1. It's not far different to your own implementation.

/* Compare S1 and S2, returning less than, equal to or
   greater than zero if S1 is lexicographically less than,
   equal to or greater than S2.  */
int
strcmp (p1, p2)
     const char *p1;
     const char *p2;
{
  register const unsigned char *s1 = (const unsigned char *) p1;
  register const unsigned char *s2 = (const unsigned char *) p2;
  unsigned reg_char c1, c2;

  do
    {
      c1 = (unsigned char) *s1++;
      c2 = (unsigned char) *s2++;
      if (c1 == '\0')
    return c1 - c2;
    }
  while (c1 == c2);

  return c1 - c2;
}


A very subtle bug: strcmp compares bytes interpreted as unsigned char, but your function is interpreting them as char (which is signed on most implementations). This will cause non-ascii characters to sort before ascii instead of after.


This function will fail if limits of (insigned) char is equal to or greater than limits of int because of integer overflow.

For example if you compile it on DSP which have 16 bit char with limits 0...65536 and 16 bit int with limits of -32768...32767, then if you try to compare strings like "/uA640" and "A" the result will be negative, which is not true.

This is exotic and weird problem but it appear when you write universal implementation.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜