Scanning a file and allocating correct space to hold the file

I am currently using fscanf to read space-delimited words. I declare a char[] with a fixed size to hold each extracted word. How would I create a char[] with exactly enough space to hold the characters of a given word? Thanks.

Edit: If I do a strdup on a char[1000] and the char[1000] actually only holds 3 characters, will the strdup reserve space on the heap for 1000 bytes or for 4 (the 3 characters plus the terminating char)?


Here is a solution involving only two allocations and no realloc:

  1. Determine the size of the file by seeking to the end and using ftell.
  2. Allocate a block of memory this size and read the whole file into it using fread.
  3. Count the number of words in this block.
  4. Allocate an array of char * able to hold pointers to this many words.
  5. Loop through the block of text again, assigning to each pointer the address of the beginning of a word, and replacing the word delimiter at the end of the word with 0 (the null character).
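The steps above can be sketched roughly as follows. This is only one way to do it, under some assumptions that are not in the answer itself: the function names `read_whole_file` and `split_words` are made up for illustration, words are taken to be separated by any whitespace (as with fscanf's `%s`), and the block gets one extra byte so that the last word is terminated even when the file does not end in a delimiter.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Steps 1-2: find the file size with fseek/ftell and read everything into
   one heap block. Hypothetical helper; returns NULL on any failure.
   One extra byte is allocated so the block always ends in '\0'. */
static char *read_whole_file(const char *path, size_t *len_out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long size = ftell(fp);
    if (size < 0) { fclose(fp); return NULL; }
    rewind(fp);

    char *buf = malloc((size_t)size + 1);
    if (buf == NULL) { fclose(fp); return NULL; }
    size_t got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);

    buf[got] = '\0';
    *len_out = got;
    return buf;
}

/* Steps 3-5: count the words, allocate the pointer array, then walk the
   block again, recording where each word starts and overwriting every
   whitespace delimiter with '\0'. The returned pointers point INTO buf. */
static char **split_words(char *buf, size_t len, size_t *count_out)
{
    size_t count = 0;
    int in_word = 0;
    for (size_t i = 0; i < len; i++) {
        if (isspace((unsigned char)buf[i]))
            in_word = 0;
        else if (!in_word) { in_word = 1; count++; }
    }

    /* Allocate at least one slot so an empty file is not reported as failure. */
    char **words = malloc((count ? count : 1) * sizeof *words);
    if (words == NULL)
        return NULL;

    size_t w = 0;
    in_word = 0;
    for (size_t i = 0; i < len; i++) {
        if (isspace((unsigned char)buf[i])) {
            buf[i] = '\0';
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words[w++] = &buf[i];
        }
    }
    *count_out = count;
    return words;
}
```

A caller would do something like `buf = read_whole_file(path, &len); words = split_words(buf, len, &n);` and then use `words[0]` through `words[n - 1]` as ordinary C strings. Cleanup is just `free(words)` and `free(buf)`; the individual words must not be freed, since they are not separate allocations.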

Also, a slightly philosophical matter: If you think this approach of inserting string terminators in-place and breaking up one gigantic string to use it as many small strings is ugly, hackish, etc. then you should probably forget about programming in C and use Python or some other higher-level language. The ability to do radically-more-efficient data manipulation operations like this while minimizing the potential points of failure is pretty much the only reason anyone should be using C for this kind of computation. If you want to go and allocate each word separately, you're just making life a living hell for yourself by doing it in C; other languages will happily hide this inefficiency (and abundance of possible failure points) behind friendly string operators.


There's no one-and-only way. The idea is to just allocate a string large enough to hold the largest possible string. After you've read it, you can then allocate a buffer of exactly the right size and copy it if needed.
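As a rough sketch of the "copy it into a buffer of exactly the right size" step: the helper below (the name `copy_exact` is just for illustration) sizes the allocation from strlen, not from the size of the array the word was read into. POSIX strdup behaves the same way, so calling strdup on a char[1000] that holds a 3-character word allocates 4 bytes, not 1000.

```c
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of word that is exactly strlen(word) + 1 bytes long,
   or NULL if the allocation fails. The caller frees the result. */
static char *copy_exact(const char *word)
{
    size_t n = strlen(word) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, word, n);
    return copy;
}
```

The large scratch buffer can then be reused for the next word, while each stored copy takes only as much heap space as the word needs.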

In addition, you can also specify a width in your fscanf format string to limit the number of characters read, to ensure your buffer will never overflow.

But if you allocated a buffer of, say, 250 characters, it's hard to imagine a single word not fitting in that buffer.
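For the width limit, something along these lines should work. The names `WORD_MAX` and `next_word` are assumptions made for this example. Note that the width in the format string counts characters only, so it has to be one less than the buffer size to leave room for the terminating '\0', and it has to be kept in sync with the buffer by hand.

```c
#include <stdio.h>

#define WORD_MAX 249   /* longest word stored per call; buffer is WORD_MAX + 1 */

/* Read the next whitespace-delimited word into buf, never writing more
   than WORD_MAX characters plus the terminator.
   Returns 1 if a word was read, 0 at end of file or on a read error. */
static int next_word(FILE *fp, char buf[WORD_MAX + 1])
{
    /* The literal 249 must match WORD_MAX. */
    return fscanf(fp, "%249s", buf) == 1;
}
```

A word longer than the limit is not lost: fscanf stops after 249 characters and the rest comes back on the next call as if it were a separate word, so the buffer cannot overflow, but an overlong word ends up split.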


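/* size_of_string is the word's length, e.g. from strlen() */
char *ptr = malloc(size_of_string + 1);   /* +1 for the terminating '\0' */
if (ptr == NULL) {
    /* handle allocation failure */
}

/* once the word has been copied into ptr: */
char first = ptr[0];
/* etc. */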