开发者

How to calculate the length of a string in C efficiently?

How to calculate the length of a string in C efficiently (in time)?

Right now I'm doing:

int calculate_length(char *string) {
    int length = 0;
    while (string[length] != '\0') {
        length++;
    }
    return length;
}

But it's very slow compared to strlen() for example, is there any other way to do it?

Thanks.

E开发者_开发技巧DIT: I'm working in a freestanding environment, I'm not allowed to use any external lib including "string.h".


From the FreeBSD source code:

size_t
strlen(const char *str)
{
    const char *s;
    for (s = str; *s; ++s);
    return(s - str);
}

Compared to your code, this probably maps very nicely to an assembler instruction, which can explain a big performance difference.


strlen(). Odds are, if somebody had found a better, faster generic method, strlen would have been replaced with that.


Take a look at the source code of strlen in the standard libc. Functions in standard libraries are generally highly optimized. Check it out here (coded in assembly) - this is from the GNU libc.

size_t
DEFUN(strlen, (str), CONST char *str)
{
  int cnt;

  asm("cld\n"                   /* Search forward.  */
      /* Some old versions of gas need `repne' instead of `repnz'.  */
      "repnz\n"                 /* Look for a zero byte.  */
      "scasb" /* %0, %1, %3 */ :
      "=c" (cnt) : "D" (str), "0" (-1), "a" (0));

  return -2 - cnt;
}


Take a look at GNU C library's strlen() source.

It uses a number of non-obvious tricks to gain speed without dropping to assembly, including:

  • getting to a character that's properly aligned
  • reading those aligned parts of the string into an int (or some larger datatype) to read several chars at a time
  • using bit twiddling tricks to check if one of the chars embedded in that block of chars is zero

etc.


The easiest way is to call strlen(). Seriously. It's already optimized by your compiler and/or library vendors, to be as fast as possible for your architecture.

One common optimization is to remove the need to increase a counter, and compute the length from the pointer:

size_t my_strlen(const char *s)
{
  const char *anchor = s;

  while(*s)
   s++;

  return s - anchor;
}


C strings are intrinsically inefficient, there are two reasons for using the ASCIZ convention:

  • The standard C library uses it
  • The compiler uses it for literal string constants

The first of these is academic in this instance since you are not using the standard library, the second is easily overcome by creating functions or macros that provide conversions from C strings to a more efficient convention such as Pascal strings. The point is you need not be a slave to the C convention if you are not using the C library.


Yet another way to speed up char counting is to use vectorization!

Here's an example of how to do this with respect to UTF8-encoded strings:

Even faster UTF-8 character counting,

http://www.daemonology.net/blog/2008-06-05-faster-utf8-strlen.html


On i386 processors, libc often use an ultra-optimized version of strlen, often written in assembly language. The paper "String Length" explains how they work.

Here is one optimized version for OpenBSD. (They also have a portable version.) Here is the version for the GNU libc.


Some of the above answers are very good, and this is my take. There is a keyword known as "register"

#include <stdio.h>
size_t strlenNew(char *s);

int main(int argc, char* argv[])
{
    printf("Size of \"Hello World\" is ::\t%d",strlenNew("Hello World"));
    return 0;
}

size_t strlenNew(char *s)
{
    register int i=0;
    while(s[i]!='\0') i++;
    return i;
}

Read here: http://gustedt.wordpress.com/2010/08/17/a-common-misconsception-the-register-keyword/ and http://msdn.microsoft.com/en-us/library/482s4fy9(v=vs.80).aspx

From the first link:

This can be particularly useful for array variables. An array variable is easily confounded with a pointer variable. Unless it is followed by a [expr] or with a sizeof it evaluates to the address of the first element. If you declare the array register all these uses are forbidden; we only access individual elements or ask for the total size. Such an register-array then may be much easier used as if it just were a set of variable by the optimizer. No aliasing (accessing the same variable through different pointers) may occur.

Thus, sometimes, there may be performance fluctuations. Personally, this is one of my fav implementations, but Sudhanshu and Andomar also provide a good implementation :)


I had the same problem, and I resolved it. The key is the 2nd condition of the for loop:

int longitud(char cad[]){

    int i, cont;

    cont = 0;

    for(i = 0; i < 30 && cad[i] != '\0'; i++){
        if(cad[i] != '\0'){
            if(cad[i] != ' '){
                cont++;
            }
        }
    }
    cont--;
    return cont;
}


Basic C program to calculate string length.

#include <stdio.h>

/**
* Method to calculate string length.
* Returns -1 in case of null pointer, else return string length.
**/
int length(char *str) {

    int i = -1;
    // Check for NULL pointer, then return i = -1;
    if(str == NULL) return i;

    // Iterate till the empty character.
    while (str[++i] != '\0');
    return i;  // Return string length.
}

int main (int argc, char **argv) {

    int len = 0;
    char abc[] = "hello";
    len = length(abc);
    printf("%d", len);  
    return 0;
}

NOTE: For better way, we should always pass the array size to function to avoid the case of out of bound access. For example the prototype of method should be:

/**
* @desc calculate the length of str.
* @param1 *str pointer to base address of char array.
* @param2 size = capacity of str to hold characters.
* @return int -1 in case of NULL, else return string length.
**/
int length (char *str, int size);


I am not quite sure of what you want to do.

You want to re-write strlen to make your code compatible with standard c-Library, or you want to manage strings.

In the first case, I think you'd better directly use the standard libraries.

The other case is interesting : you should take a look at the c++ string class, that have an implementation of the traits strategy (allowing quick manipulations of very big strings).


I did not find better :

inline size_t mystrlen(char *_)

  { return ((_ == NULL) ? (_[0] != '\0')) ? 0 : (1 + mystrlen(_ + 1)); }


int max;
max = sizeof(str);
return (--max);
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜