Finding line size of each row in a text file
How can you count the number of characters or numbers in each line? Is there something like an EOF that's more like an End of Line?
You can iterate through each character in the line and keep incrementing a counter until the end-of-line ('\n') is encountered. Make sure to open the file in text mode ("r") and not binary mode ("rb"). Otherwise the stream won't automatically convert different platforms' line ending sequences into '\n' characters.
Here is an example:
int charcount( FILE *const fin )
{
    int c, count;

    count = 0;
    for( ;; )
    {
        c = fgetc( fin );
        if( c == EOF || c == '\n' )
            break;
        ++count;
    }

    return count;
}
Here's an example program to test the above function:
#include <stdio.h>

int main( int argc, char **argv )
{
    FILE *fin;

    fin = fopen( "test.txt", "r" );
    if( fin == NULL )
        return 1;

    printf( "Character count: %d.\n", charcount( fin ) );

    fclose( fin );
    return 0;
}
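The function above measures only the line at the current position, and a return value of 0 can mean either an empty line or end-of-file. To walk every line you need to tell those two cases apart. Here is a minimal sketch of one way to do that; next_line_length and print_line_lengths are names I made up for illustration, they are not standard functions:

```c
#include <assert.h>
#include <stdio.h>

/* Reads one line from fin and stores its length (without the '\n') in *len.
   Returns 1 if a line was read, 0 if the stream was already at end-of-file. */
int next_line_length( FILE *fin, int *len )
{
    int c, count = 0;

    assert( fin != NULL && len != NULL );

    c = fgetc( fin );
    if( c == EOF )
        return 0;                   /* nothing left to read */

    while( c != EOF && c != '\n' )
    {
        ++count;
        c = fgetc( fin );
    }

    *len = count;
    return 1;
}

/* Prints the length of every remaining line in fin and
   returns the number of lines seen. */
int print_line_lengths( FILE *fin )
{
    int len, lines = 0;

    while( next_line_length( fin, &len ) )
        printf( "Line %d: %d characters\n", ++lines, len );

    return lines;
}
```

Calling print_line_lengths( fin ) in place of the single printf in the test program should report one length per line, including a final line that has no trailing '\n'.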
Regarding reading a file line by line, look at fgets.
char *fgets(char *restrict s, int n, FILE *restrict stream);
The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes are read, or a <newline> is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte.
The only problem here may be if you can't guarantee a maximum line size in your file. If that is the case, you can iterate over characters until you see a line feed.
Regarding end of line:
Short answer: \n is the newline character (also called a line feed).
Long answer, from Wikipedia:
Systems based on ASCII or a compatible character set use either LF (Line feed, 0x0A, 10 in decimal) or CR (Carriage return, 0x0D, 13 in decimal) individually, or CR followed by LF (CR+LF, 0x0D 0x0A); see below for the historical reason for the CR+LF convention. These characters are based on printer commands: The line feed indicated that one line of paper should feed out of the printer, and a carriage return indicated that the printer carriage should return to the beginning of the current line.
* LF: Multics, Unix and Unix-like systems (GNU/Linux, AIX, Xenix, Mac OS X, FreeBSD, etc.), BeOS, Amiga, RISC OS, and others
* CR+LF: DEC RT-11 and most other early non-Unix, non-IBM OSes, CP/M, MP/M, DOS, OS/2, Microsoft Windows, Symbian OS
* CR: Commodore 8-bit machines, Apple II family, Mac OS up to version 9 and OS-9
But since you are not likely to be working with a representation that uses carriage return only, looking for a line feed should be fine.
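One case where the carriage return still shows up: if a file written with CR+LF endings is read in binary mode, or on a system whose text mode uses LF only, each line keeps a trailing '\r' in front of the '\n'. A small sketch for measuring a line that has already been read into a string, ignoring whichever terminator it carries; length_without_terminator is a hypothetical helper name:

```c
#include <assert.h>
#include <string.h>

/* Returns the length of line, not counting a trailing "\n", "\r\n" or "\r". */
size_t length_without_terminator( const char *line )
{
    size_t n;

    assert( line != NULL );

    n = strlen( line );
    if( n > 0 && line[n - 1] == '\n' )
        --n;                        /* drop the line feed */
    if( n > 0 && line[n - 1] == '\r' )
        --n;                        /* drop the carriage return, if any */

    return n;
}
```

This assumes the string holds at most one line, for example a buffer filled by a single fgets() call.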
If you open a file in text mode, i.e., without a b in the second argument to fopen(), you can read characters one-by-one until you hit a '\n' to determine the line size. The underlying system should take care of translating the end of line terminators to just one character, '\n'. The last line of a text file, on some systems, may not end with a '\n', so that is a special case.
Pseudocode:
count := 0
c := next()
while c != EOF and c != '\n'
    count := count + 1
    c := next()
The above counts the number of characters in a given line. next() is a function that returns the next character from your file.
Alternatively, you can use fgets() with a buffer:
char buf[SIZE];
count = 0;
while (fgets(buf, sizeof buf, fp) != NULL) {
    /* see if the string represented by buf has a '\n' in it,
       if yes, add the index of that '\n' to count, and that's
       the number of characters on that line, which you can
       return to the caller. If not, add sizeof buf - 1 to count */
}
/* If count is non-zero here, the last line ended without a newline */
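To make the outline above concrete, here is one possible way to fill in the loop body. This is only a sketch: fgets_line_length is a made-up name, the 16-byte buffer is an arbitrary choice to show that long lines are handled in pieces, and returning -1 at end-of-file is my own convention. It uses strlen(), so it assumes the file contains no embedded null bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns the length of the next line in fp (without the '\n'),
   or -1 if the stream is already at end-of-file. */
long fgets_line_length( FILE *fp )
{
    char buf[16];               /* deliberately small; longer lines take several reads */
    long count = 0;
    int got_data = 0;

    assert( fp != NULL );

    while( fgets( buf, sizeof buf, fp ) != NULL )
    {
        size_t n = strlen( buf );

        got_data = 1;
        if( n > 0 && buf[n - 1] == '\n' )
            return count + (long)( n - 1 );   /* newline found: line complete */
        count += (long)n;                     /* partial line, keep reading */
    }

    /* Getting here with data means the last line had no '\n'. */
    return got_data ? count : -1;
}
```

Calling it repeatedly until it returns -1 yields the length of each line in turn.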
The original question was how to get the number of characters in "each line" (a given line? or the current line?), while the answers have mostly given solutions for determining the length of the first line in a file. Some of them can easily be applied to determine the length of the current line, without guessing a maximum buffer length beforehand.
However, what one often needs in practice is the maximum length of any line in a file. Then one can reserve a buffer and use fgets to read the file line by line and use some nice functions (strtok, strtod etc.) to parse lines. In practice, you can use any of the previous solutions to determine length of one line, and just scan through all lines and take the maximum.
An easy script that reads the file character by character:
max = 0; i = 0;
do {
    if ((c = fgetc(f)) != EOF && c != '\n')
        i++;
    else {
        if (i > max) max = i;
        i = 0;
    }
} while (c != EOF);
return max;
Note: In practice, it would suffice to have an upper bound for the maximum length. A dirty solution would be to use the file size as an upper bound for the maximum length of lines.
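A rough sketch of that file-size approach, using fseek() and ftell(). The name file_size_upper_bound is hypothetical. Note the assumptions: the C standard does not promise that SEEK_END works on every binary stream, nor that ftell() on a text stream is a byte count, although both behave as expected on common desktop and server platforms.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the size of the file behind f in bytes, or -1 on failure.
   The original file position is restored before returning. */
long file_size_upper_bound( FILE *f )
{
    long pos, size;

    assert( f != NULL );

    pos = ftell( f );                       /* remember where we were */
    if( pos < 0 || fseek( f, 0L, SEEK_END ) != 0 )
        return -1;

    size = ftell( f );                      /* offset of the end = size */

    if( fseek( f, pos, SEEK_SET ) != 0 )    /* go back */
        return -1;

    return size;
}
```

No line can be longer than the whole file, so a buffer of size + 1 bytes (one extra for the terminating null byte) should be enough for fgets() to hold any single line.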
\n is the newline character in C. In other languages, such as C#, you may use something like C#'s Environment.NewLine to overcome platform difficulties.
If you already know that your string is one line (let's call it line), use strlen(line) to get the number of characters in it. Subtract 1 if it ends with the '\n'.
If the string has newline characters in it, you'll need to split it around the newline characters and then call strlen() on each substring.
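For the multi-line case you don't actually have to copy out substrings: strcspn() returns the number of characters before the next '\n' (or before the end of the string), which is exactly the line length. A minimal sketch, with print_string_line_lengths as a made-up name:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Prints the length of each '\n'-separated line in text and
   returns the number of lines found. */
int print_string_line_lengths( const char *text )
{
    int lines = 0;

    assert( text != NULL );

    while( *text != '\0' )
    {
        size_t len = strcspn( text, "\n" );   /* chars before '\n' or end */

        printf( "Line %d: %lu characters\n", ++lines, (unsigned long)len );

        text += len;
        if( *text == '\n' )
            ++text;                           /* step over the newline */
    }

    return lines;
}
```

Unlike strtok(), this leaves the input string unmodified and also reports empty lines.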
Here is a simple algorithm.
You require:
- File stream (FILE)
- Line number whose size you want (int)
Returns:
- Total characters in the given line
Function:
#include <stdio.h>

/* Returns the number of characters on line Ofline (counting from 1),
   not including the '\n', or -1 if the file has fewer lines.
   The stream is rewound to the start before returning. */
int getLengthOfLine(FILE* df, int Ofline){
    int cchar;      /* int, not char, so that EOF can be told apart from data */
    int line = 1;   /* number of the line currently being read */
    int length = 0; /* characters seen so far on the requested line */
    int found = 0;  /* set once the requested line has been reached */

    while ((cchar = fgetc(df)) != EOF)
    {
        if (line == Ofline){
            found = 1;
        }
        if (cchar == '\n'){
            if (line == Ofline){
                break;      /* end of the requested line */
            }
            line++;
        }
        else if (line == Ofline){
            length++;
        }
    }
    fseek(df, 0L, SEEK_SET);
    return found ? length : -1;
}
Usage:
FILE* fp = fopen("path-to-file", "r");
if (fp != NULL){
    printf(" %d", getLengthOfLine(fp, 5));
    fclose(fp);
}