开发者

Reading Strings with Undefined Length in C

first(as always) I want to apologize about my english, it may not be clear enough.

I'm not that good at C programming, and I was asked to read a "string" input with undefined length.

This is my solution

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *newChar();
char *addChar(char *, char);
char *readLine(void);

int main() {
  char *palabra;
  palabra = newChar();

  palabra = readLine();
  printf("palabra=%s\n", palabra);

  return 0;
}

char *newChar() {
  char *list = (char *) malloc(0 * sizeof (char));
  *list = '\0';
  return list;
}

char *addChar(char *lst, char num) {
  int largo = strlen(lst) + 1;
  real开发者_JS百科loc(&lst, largo * sizeof (char));
  *(lst + (largo - 1)) = num;
  *(lst + largo) = '\0';
  return lst;
}

char *readLine() {
  char c;
  char *palabra = newChar();

  c = getchar();
  while (c != '\n') {
    if (c != '\n') {
      palabra = addChar(palabra, c);
    }
    c = getchar();
  }
  return palabra;
}

Please, I'd appreciate that you help me by telling me if it's a good idea or giving me some other idea(and also telling me if it's a "correct" use for pointers).

Thanks in advance


EDIT: Well, thanks for you answers,they were very useful. Now I post edited(and I hope better) code, maybe could be useful for someone new to C(like me) and be feedbacked again.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


void reChar(char **, int *);
void readLine(char **, int *);

int main() {
    char *palabra = NULL;
    int largo = 0;

    reChar(&palabra, &largo);
    readLine(&palabra, &largo);
    printf("palabra=%s\n", palabra, largo);

    system("pause");
    return 0;
}

void reChar(char **lst, int *largo) {
    (*largo) += 4;
    char *temp = (char*) realloc(*lst, (*largo) * sizeof (char));

    if (temp != NULL) {
        *lst = temp;
    } else {
        free(*lst);
        puts("error (re)allocating memory");
        exit(1);
    }
}

void readLine(char **lst, int *largo) {
    int c;
    int pos = 0;

    c = getchar();
    while (c != '\n' && c != EOF) {
        if ((pos + 1) % 4 == 0) {
            reChar(lst, largo);
        }
        (*lst)[pos] =(char) c;
        pos++;
        c = getchar();
    }
    (*lst)[pos] = '\0';
}

PS:

  • It seem enough to slow increase size of "palabra".

  • I'm not sure if capture getchar() into a int and then cast it into a char is the correct way to hadle EOF pitfall


  1. Look up the definition of POSIX getline().

  2. Remember that you need to capture the return value from realloc(); it is not guaranteed that the new memory block starts at the same position as the old one.

  3. Know that malloc(0) may return a null pointer, or it may return a non-null pointer that is unusable (because it points to zero bytes of memory).

  4. You may not write '*list = '\0'; when list points to zero bytes of allocated memory; you don't have permission to write there. If you get a NULL back, you are likely to get a core dump. In any case, you are invoking undefined behaviour, which is 'A Bad Idea™'. (Thanks)

  5. The palabra = newChar(); in main() leaks memory - assuming that you fix the other problems already discussed.

  6. The code in readLine() doesn't consider the possibility of getting EOF before getting a newline; that is bad and will result in a core dump when memory allocation (finally) fails.

  7. Your code will exhibit poor performance because it allocates one character at a time. Typically, you should allocate considerably more than one extra character at a time; starting with an initial allocation of perhaps 4 bytes and doubling the allocation each time you need more space might be better. Keep the initial allocation small so that the reallocation code is properly tested.

  8. The return value from getchar() is an int, not a char. On most machines, it can return 256 different positive character values (even if char is a signed type) and a separate value, EOF, that is distinct from all the char values. (The standard allows it to return more than 256 different characters if the machine has bytes that are bigger than 8 bits each.) (Thanks) The C99 standard §7.19.7.1 says of fgetc():

    If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).

    (Emphasis added.) It defines getchar() in terms of getc(), and it defines getc() in terms of fgetc().

  9. (Borrowed: Thanks). The first argument to realloc() is the pointer to the start of the currently allocated memory, not a pointer to the pointer to the start of the currently allocated memory. If you didn't get a compilation warning from it, you are not compiling with enough warnings set on your compiler. You should turn up the warnings to the maximum. You should heed the compiler warnings - they are normally indicative of bugs in your code, especially while you are still learning the language.

  10. It is often easier to keep the string without a null terminator until you know you have reached the end of the line (or end of input). When there are no more characters to be read (for the time being), then append the null so that the string is properly terminated before it is returned. These functions do not need the string properly terminate while they are reading, as long as you keep track of where you are in the string. Do make sure you have enough room at all times to add the NUL '\0' to the end of the string, though.

See Kernighan & Pike 'The Practice of Programming' for a lot of relevant discussions. I also think Maguire 'Writing Solid Code' has relevant advice to offer, for all it is somewhat dated. However, you should be aware that there are those who excoriate the book. Consequently, I recommend TPOP over WSC (but Amazon has WSC available from $0.01 + p&p, whereas TPOP starts at $20.00 + p&p -- this may be the market speaking).


TPOP was previously at http://plan9.bell-labs.com/cm/cs/tpop and http://cm.bell-labs.com/cm/cs/tpop but both are now (2015-08-10) broken. See also Wikipedia on TPOP.


  • You are always allocating one byte less than you are using. For example in the beginning you allocate space for zero characters and then try to set the (non-existing) first character to '\0'.

  • realloc doesn't take a pointer to a pointer as first parameter. It's supposed to be used like this:

    lst = realloc(lst, largo * sizeof (char));
    
  • If you want to handle out-of-memory conditions you would have to check if malloc() or realloc() returned NULL.

  • It would be more efficient to allocate a bigger buffer in the beginning and increase it in bigger steps instead of reallocating every added character separately.


The first argument to the call to realloc in

realloc(&lst, largo * sizeof (char));

should be lst and not &lst

Also the pointer returned by realloc need not always be same as its first argument. If no free memory adjacent to the existing memory is found, a completely different chunk is allocated and its address is returned.

char *new_lst = realloc(lst, largo * sizeof (char));
if(new_lst != NULL) {
  lst = new_lst;
}


Apart from the errors in your code, I think it is better to create a variable-length string in C. Once you have that, you can write a getLine() function. This variable-length string includes the concept of capacity, so its size increases in blocks of powers of 2, instead one by one.

#include <string.h>
#include <stdio.h>

typedef struct _mystring {
    char * native;
    size_t size;
    size_t capacity;
} String;

size_t String__len(String this)
{
    return this.size;
}

String String__create(char native[], size_t capacity) {
  String this;

  this.size = strlen( native );
  if ( capacity < ( this.size + 1 ) )
        this.capacity = this.size + 1;
  else  this.capacity = capacity;

  this.native = (char *) malloc( capacity * sizeof( char ) );
  strcpy( this.native, native );

  return this;
}

String * String__set(String *this, char native[]) {
    this->size = strlen( native );

    if ( this->size >= this->capacity ) {
        do {
            this->capacity <<= 1;
        } while( this->size > this->capacity );

        this->native = realloc( this->native, this->capacity );
    }

    strcpy( this->native, native );

    return this;
}

String * String__add(String *this, char ch) {
    ++( this->size );

    if ( this->size >= this->capacity ) {
        do {
            this->capacity <<= 1;
        } while( this->size > this->capacity );

        this->native = realloc( this->native, this->capacity );
    }

    char * zeroPos = this->native + ( this->size -1 );
    *( zeroPos++ ) = ch;
    *zeroPos = 0;

    return this;
}

void String__delete(String *this)
{
    free( this->native );
}

Once you have this implementation done, which is useful for this problem and many others, you can create the getLine function:

String String__getLine()
{
    int ch;
    String this = String__create( "", 16 );

    do {
        ch = fgetc( stdin );
        String__add( &this, ch );
    } while( ch != EOF
          && ch != '\n' );

    size_t len = String__len( this );
    this.size = len -1;
    *( this.native + this.size ) = 0;

    return this;
}

Now you can just use it:

int main()
{
    printf( "Enter string: " );
    String str = String__getLine();
    printf( "You entered: '%s'\n", str.native );
    String__delete( &str );

    return EXIT_SUCCESS;
}


Here a working example for realloc and fgets. Its C89, no POSIX needed. You can set the parameter with your own preallocated memory or NULL. A terminating "free" is always needed.

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

char *getstringStdin(char *s)
{
  char buffer[9];
  s=realloc(s,1);
  *s=0;
  while( fgets(buffer,9,stdin) )
  {
    s=realloc(s,strlen(s)+1+strlen(buffer));
    strcat(s,buffer);
    if( strchr(s,'\n') )
    {
      *strchr(s,'\n')=0;
      break;
    }
  }
  return s;
}

main()
{
  char *s;
  while( *(s=getstringStdin(0)) ) /* a single Enter breaks */
  {
    puts(s);
    free(s);
  }
  free(s);
  puts("end");
  return 0;
}
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜