
Return length of string containing any characters but the given charset

I need to write a function that takes two char * arguments, one containing a string and the other a set of characters, and returns the length of the initial part of the string that does NOT contain any of those characters.

Example:

LenContainsAnyBut("abc", "def"); // returns 3
LenContainsAnyBut("abc", "b"); // returns 1
LenContainsAnyBut("x", "xyz"); // returns 0
LenContainsAnyBut("", "xyz"); // returns 0

Here's my implementation:

unsigned int LenContainsAnyBut(const char *s, const char *search_chars) {
    unsigned int len = 0;

    while (*(s + len) != '\0') {
        for (const char *search_char = search_chars; *search_char != '\0'; ++search_char) {
            if (*search_char == *(s + len)) {
                return len;
            }
        }

        ++len;
    }

    return len;
}

Anything to improve? I would prefer the "array notation", i.e. s[0] instead of *(s + 0), but it is not allowed in this assignment.

EDIT

Sorry, I somehow managed to totally screw up my code when posting it >.<.


If you're working with 8-bit chars, you can avoid the nested loops. First make sure s and sc (search_chars) are of type unsigned char * (not plain char *!) then:

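unsigned char set[32] = "";  /* 256-bit set: one bit per possible char value */
size_t l=0;
/* Set the bit for every character in sc. */
for (; *sc; sc++) set[*sc/8] |= 1U<<*sc%8;
/* Count leading characters of s whose bit is clear; stop at the first one in the set. */
for (; *s && !(set[*s/8]>>*s%8 & 1); s++) l++;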


One thing to change: len is always 0 in your code. You should increment len after the for loop, inside the while loop.

Another small mistake is that you're missing a 't' in the declaration of len (unsigned in should be unsigned int).

I also believe that in the for loop you are changing the pointer itself, so only the first character of s gets tested. When the other characters of s are tested, *search_chars will always be equal to '\0'. Try using an integer index like len in the for loop as well.


The code as posted doesn't even compile, and with the obvious fixes, it will enter an infinite loop.

That being said, I'd write this function using strchr().
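A minimal sketch of what the strchr() approach might look like. The name LenContainsAnyBut_strchr is made up for illustration, and the code sticks to pointer notation because the assignment forbids s[i]:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; intended to follow the same contract as
   LenContainsAnyBut in the question. */
unsigned int LenContainsAnyBut_strchr(const char *s, const char *search_chars)
{
    unsigned int len = 0;

    /* Test for the end of s first: strchr() also matches the terminating
       '\0' of search_chars, so the order of the two conditions matters. */
    while (*(s + len) != '\0' && strchr(search_chars, *(s + len)) == NULL) {
        ++len;
    }

    return len;
}
```

The inner loop from the question is replaced by the strchr() call, so this is still O(n*m) in the worst case; it is just shorter and harder to get wrong.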


If you want to improve your run time for long strings and/or long exclusion sets, you could take advantage of the ability to use characters as array indexes, and create an array to represent the set of characters that are allowed/disallowed in your strings.

Create an array of length 256 and initialize every element to 1, except element 0: I think you have to assume that '\0' is always excluded, because there is no way to represent it in the exclusion string, which is itself a C string. Then loop through the exclusion string, cast each character to unsigned char (characters are signed on some systems, but they need to be unsigned for this to work), and set the element indexed by that character to 0.

At the end of this you have a lookup table that tells you very quickly whether a character ends the counted part of your string, and the run time is O(n+m) rather than O(n*m).
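The lookup-table idea could be sketched roughly like this. The function name LenContainsAnyBut_table and the array name allowed are made up for illustration, and the table is sized UCHAR_MAX + 1 (256 on platforms with 8-bit chars) rather than a hard-coded 256:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical name; intended to follow the same contract as
   LenContainsAnyBut in the question. */
size_t LenContainsAnyBut_table(const char *s, const char *search_chars)
{
    /* allowed[c] is 1 if character c may be counted, 0 if it stops the count. */
    unsigned char allowed[UCHAR_MAX + 1];
    size_t i;
    size_t len = 0;

    for (i = 0; i <= UCHAR_MAX; ++i) {
        *(allowed + i) = 1;
    }
    *allowed = 0; /* '\0' always stops the count */

    /* O(m): clear the entry of every excluded character. */
    for (; *search_chars != '\0'; ++search_chars) {
        *(allowed + (unsigned char)*search_chars) = 0;
    }

    /* O(n): a single table lookup per character of s. */
    while (*(allowed + (unsigned char)*(s + len))) {
        ++len;
    }

    return len;
}
```

Because entry 0 is cleared up front, the final loop needs no separate test for the end of s. For short inputs, filling the table costs more than the nested loops save, so this only pays off for long strings or long exclusion sets.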


I'm using this homework section to improve my limited ability to program in C, so apologies if the question has an obvious answer, but where in the code is the value of len incremented as characters in s are 'tested and passed'?


I would simply implement it as:

#include <string.h>

unsigned int LenContainsAnyBut(const char *s, const char *search_chars)
{
    return strcspn(s, search_chars);
}

...but hey, that's just me ;)
