
One-pass: replace runs of whitespace with a single whitespace character and remove leading and trailing whitespace

void RemoveSpace(char *String)
{
    int i=0,y=0;
    int leading=0;

    for(i=0,y=0;String[i]!='\0';i++,y++)
    {
        String[y]=String[i];    // let us copy the current character.

        if(isspace(String[i]))  // Is the current character a space?
        {
            if(isspace(String[i+1])||String[i+1]=='\0'||leading!=1) // leading space
                y--;
        }
        else
            leading=1;
    }
    String[y]='\0';
}

Does this do the trick of removing leading and trailing whitespace and replacing runs of whitespace with a single character? I tested it with an empty string, a string of only whitespace, leading whitespace, and trailing whitespace.

Do you think this is an efficient one-pass solution?


Does this do the trick of removing leading and trailing white spaces and replacing multiple white spaces with single ones?

The best way to answer that question is to test it.

void Test(const char *input, const char *expected_output) {
    char buffer[80];
    strcpy(buffer, input);
    RemoveSpace(buffer);
    assert(strcmp(buffer, expected_output) == 0);
}

int main() {
    Test("   Leading spaces removed.", "Leading spaces removed.");
    Test("Trailing spaces removed.  ", "Trailing spaces removed.");
    Test("Inner spaces      trimmed.", "Inner spaces trimmed.");
    Test(" A   little of  everything.  ", "A little of everything.");
    Test(" \tTabs \t\tare  \t spaces, too.", "Tabs are spaces, too.");
    return 0;
}

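The code in the OP does not pass the last test, so the answer is no. It keeps the last whitespace character of each run instead of writing a space, so the run " \t\t" after "Tabs" is reduced to a single tab rather than a single space.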
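Because assert stops at the first failure, it can be handy to have a check that reports the mismatch and carries on. The following is only a sketch under some assumptions: Check and RemoveSpaceOriginal are names made up here, and RemoveSpaceOriginal is the function from the question copied under a different name so the block compiles on its own.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* The question's function, copied verbatim under a hypothetical name. */
void RemoveSpaceOriginal(char *String)
{
    int i=0,y=0;
    int leading=0;

    for(i=0,y=0;String[i]!='\0';i++,y++)
    {
        String[y]=String[i];
        if(isspace(String[i]))
        {
            if(isspace(String[i+1])||String[i+1]=='\0'||leading!=1)
                y--;
        }
        else
            leading=1;
    }
    String[y]='\0';
}

/* Hypothetical helper: runs one case through fn and reports a mismatch
   instead of aborting. Returns 1 on success, 0 on failure. Inputs are
   assumed to fit in an 80-byte buffer, as in Test(). */
int Check(void (*fn)(char *), const char *input, const char *expected)
{
    char buffer[80];

    if (strlen(input) >= sizeof buffer)
        return 0;                      /* too long for this sketch */
    strcpy(buffer, input);
    fn(buffer);
    if (strcmp(buffer, expected) != 0) {
        printf("FAIL: \"%s\" -> \"%s\" (expected \"%s\")\n",
               input, buffer, expected);
        return 0;
    }
    return 1;
}
```

Calling Check(RemoveSpaceOriginal, input, expected) for each case in main and summing the results gives a count of passing cases, and the FAIL lines show what the function actually produced.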

Do you think this is an efficient one pass solution?

It is a one-pass solution. If you're trying to squeeze every ounce of efficiency out, then you want to minimize the number of operations and conditional branches.

When working with strings in C, it's often most idiomatic to use pointers rather than indexing. Depending on the compiler and the target platform, using pointers may be more or less efficient than indexing, but both are very low cost. Since this is already a single, linear pass, the best thing to do is to write it as clearly as you can, using idiomatic code patterns.

Here's my solution:

#include <assert.h>
#include <ctype.h>
#include <string.h>

void RemoveSpace(char *string) {
    char *target = string;
    char *last = target;
    int skipping_spaces = 1;

    for (const char *source = string; *source != '\0'; ++source) {
        if (isspace((unsigned char)*source)) {  /* cast: isspace() needs an unsigned char value or EOF */
            if (!skipping_spaces) {
                *target++ = *source;
                skipping_spaces = 1;
            }
        } else {
            *target++ = *source;
            last = target;
            skipping_spaces = 0;
        }
    }
    *last = '\0';
}

It's essentially a little state machine, which means that, at each step, we decide what to do based on the current input character and the current state. For this problem, our state is simply whether or not we're currently skipping spaces (and a bookmark keeping track of the last legal point to end the string).
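One detail of the function above is that it copies the first whitespace character of each run, so a run that begins with a tab leaves a tab in the output. If every run should become a plain space, one possible variant defers the separator until the next word starts. This is a sketch rather than a definitive implementation; the name CollapseWhitespace and the pending_space flag are made up for illustration.

```c
#include <ctype.h>

/* Hypothetical variant of the state machine: every run of whitespace
   between words becomes exactly one ' ', whatever characters it held. */
void CollapseWhitespace(char *string) {
    char *target = string;
    int pending_space = 0;   /* a whitespace run was seen after a word */

    for (const char *source = string; *source != '\0'; ++source) {
        /* Cast: isspace() is only defined for unsigned char values and EOF. */
        if (isspace((unsigned char)*source)) {
            pending_space = (target != string);   /* ignore leading runs */
        } else {
            if (pending_space) {
                *target++ = ' ';                  /* emit the deferred separator */
                pending_space = 0;
            }
            *target++ = *source;
        }
    }
    /* A trailing run is never written, so no bookmark is needed. */
    *target = '\0';
}
```

Because the separator is written only when a following word exists, the state shrinks to a single flag and the "last legal end" bookmark is no longer needed.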


The following code should do it (it treats only ' ' as whitespace):

#include <stddef.h>

void rem_space(char *str)
{
    size_t i = 0;   /* read index */
    size_t j = 0;   /* write index, never ahead of i */

    if(str == NULL) return;     /* check before touching the string */
    while(str[i] != '\0'){
        while(str[i] == ' ') i++;               /* skip a run of spaces */
        if(str[i] == '\0') break;               /* the run was trailing */
        if(j > 0) str[j++] = ' ';               /* one separator between words */
        while(str[i] != ' ' && str[i] != '\0')  /* copy the word */
            str[j++] = str[i++];
    }
    str[j] = '\0';
}


This is based on your code. It assumes that isspace() is efficient (you might prefer not to keep the check as a separate function, since it is called so frequently) and that the string passed in is valid (the code should really be more defensive).

=======================

void RemoveSpace(char *String)
{

    int i=0, j=0;

    int inWhite=0;

    char c = String[i++];
    while(c)
    {
        if (isspace((unsigned char)c))
        {
            inWhite= 1;
        }
        else
        {
            // whitespace preceded this character, and we are not at the start
            if (inWhite && j > 0)
            {
                String[j++] = ' ';
            }
            String[j++] = c;
            inWhite = 0;
        }

        c = String[i++];
    }
    String[j]='\0';
}

Not tested; please test it yourself.


First off, it is obviously one-pass. However, it has a problem when the input has more than one leading space. For example:

Input: "   text" Output: " text"

Fortunately, it's easy to fix it. You just need an extra loop:

void RemoveSpace(char *string)
{
        int i = 0, y = 0;
        while(isspace(string[i]))         // Discard leading spaces.
                i++;
        for(y = 0; string[i]!='\0'; i++)
        {
                string[y] = string[i];    // let us copy the current character.

                if(!isspace(string[i]) || (!isspace(string[i+1]) && string[i+1] != '\0'))
                        y++;              // This character shall not be covered.
        }
        string[y] = '\0';
}

I also made some cosmetic changes to your code; they do not affect its behavior.


In Python 3.x:

def remove_space(line):
    # line: a line of text. Splitting on " " leaves empty strings between
    # consecutive spaces; drop them and re-join with single spaces.
    return " ".join(filter(lambda a: a != "", line.strip().split(" ")))
