
Random bytes with fread

The names of my variables are not important! This code will be deleted when it works!

Alright, so I'm using fread from stdio.h to read a text file. The problem is that I keep reading random bytes that, as far as I know, don't exist in the text file. I'm assuming they're part of the file's format, but I just wanna make sure it's not my code.

#include "stdafx.h"
#ifdef WIN32
    #include <io.h>
#else
    #include <sys/io.h>
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>

#include "n_script_timer.h"
//using namespace std;

#ifdef _INC_WCHAR
    typedef wchar_t CHR;
#else
    typedef char CHR;
#endif
int _tmain(int argc, CHR* argv[])
{
    #ifndef _DEBUG
        if(argc == 1)
        {
            printf("You must drag a file onto this program to run it.");
            scanf("%*c");
            return 0;
        }
        CHR* fname = argv[1];
    #else
        #ifdef _INC_WCHAR
            const CHR fname[16] = L"f:\\deleteme.bin";
        #else
            const CHR fname[16] = "f:\\deleteme.bin";
        #endif
    #endif

    FILE* inFile;
    long len;
    struct Script_Timer a;
    //static const int bsize = 4096*6;
    static const int bsize = 84;
    typedef CHR chhh[bsize];
    int alen;
    printf("#Opening File '%s' ...\n",fname);
    #ifdef _INC_WCHAR
        if((inFile = _wfopen(fname,L"rb")) == NULL)
    #else
        if((inFile = fopen(fname,"r")) == NULL)
    #endif
    {
        printf("Error opening file '%s' ",fname);
        return 0;
    }
    fseek(inFile,SEEK_SET,0);
    #ifdef _WIN32
        len = _filelength( inFile->_file );
    #else
        len = _filelength(inFile->_fileno);
    #endif
    printf("  !FileLength: %d\n",len);
    printf("#Creating Buffers...\n");
    if(((float)len/(float)bsize) > (len/bsize))
    {
        alen = (len/bsize) + 1;
    }
    else alen = (len/bsize);
    #ifdef WIN32
        //chhh *cha = new chhh[alen];
        chhh cha[alen];
    #else
        chhh cha[alen];
    #endif
    printf("#Reading File...\n");
    Start_ST(&a);
    int i = 0;
    for(i=0;i<alen;++i)
    {
        fread(&cha[i],sizeof(CHR),bsize,inFile);
        printf("[%i]%s",i,cha[i]);
    }
    End_ST(&a);
    fclose(inFile);
    printf("Characters per millisecond: %f \n",((float)len/a.milliseconds));
    printf("Characters per second: %f \n",((float)len/a.milliseconds) * 1000);
    scanf("%*c");
    return 0;
}


A couple of weird things here:

int i = 0;
for(i=0;i<alen;++i)
{
   fread(&cha[i],sizeof(CHR),bsize,inFile);
   printf("[%i]%s",i,cha[i]);
}
  1. You don't null terminate the buffer before printing it (as RageZ pointed out).

  2. You increment i on each loop repetition, but every time you read 84 chars (bsize) into &cha[i]. I think this should mean you're only seeing every 84th character.

Also, if I were you I'd be checking the return value of fread every time. It's not guaranteed to always return the number of bytes you're expecting.


EDIT: The size of the block you're reading is fine. I got confused for a minute by the typedef. Every time you increment i by 1, it advances the pointer by 84*sizeof(CHR), as you intended. Still, you can't guarantee that it read the number of bytes that you think it did. If it came up short then you'll be left with junk in the buffer: say it read 60 chars, that leaves 24 junk chars before the insertion point for the next read.
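To make the short-read point concrete, here is a minimal sketch, assuming plain char data. The helper name read_chunk is hypothetical and not part of the original code. It reserves one element for the terminator and places the '\0' right after the bytes fread actually delivered, so a short read leaves no junk for printf to walk into.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post): reads at most cap - 1
   bytes from f into buf, terminates the buffer right after the bytes that
   were actually read, and returns that count. */
size_t read_chunk(FILE *f, char *buf, size_t cap)
{
    size_t got;

    if (cap == 0)
        return 0;
    got = fread(buf, 1, cap - 1, f);  /* may be fewer than requested */
    buf[got] = '\0';                  /* terminate at the real end of data */
    return got;
}
```

With the buffers from the question, each chunk would need bsize + 1 elements for this to read bsize characters at a time. A return value of 0 means end of file or an error, which feof and ferror can tell apart.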


typedef CHR chhh[bsize];

but

fread(&cha[i], sizeof(CHR), bsize, inFile);

In C and C++, a string needs an extra element for the '\0' at the end, so a buffer that fread fills with bsize characters has no room left for the terminator.


The cha buffer should be filled with null bytes (0) before reading, or you are going to get some garbage.

printf("[%i]%s",i,cha[i]);

printf with %s keeps writing to the screen until it meets a null character ('\0'), so in the best case you'll get some garbage, and in the worst case an access violation, because you're reading memory that you don't own.
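As an alternative to zero-filling, the precision form of %s can cap how many characters are printed. The sketch below assumes char data and a known count of valid bytes (for example, the value fread returned). The helper name format_chunk is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post): formats "[index]"
   followed by at most n bytes of buf into out. Because of the "%.*s"
   precision, buf does not need a terminating '\0'. */
int format_chunk(char *out, size_t out_cap, int index, const char *buf, size_t n)
{
    return snprintf(out, out_cap, "[%d]%.*s", index, (int)n, buf);
}
```

The same format string works with printf directly. Note that output still stops early if the data itself contains a 0 byte.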

Note: I advise you to give meaningful names to your variables, typedefs, etc.; chhh is not really nice. In a few months it would be a pain to modify such code, even for you!


Note, your alen calculation is going to be wrong if you're using the wchar_t code path because bsize is the element count for the array, not its size in bytes.
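One way to keep the units straight is to do the chunk count entirely in bytes with integer arithmetic, which also avoids the float comparison. This is only a sketch. The helper name chunks_needed and its parameters are hypothetical, and it assumes the file length is reported in bytes.

```c
#include <stddef.h>

/* Hypothetical helper (not from the original post): number of chunks needed
   to hold file_bytes bytes when each chunk stores elems_per_chunk elements
   of elem_size bytes each. Rounds up without using floating point. */
size_t chunks_needed(size_t file_bytes, size_t elem_size, size_t elems_per_chunk)
{
    size_t chunk_bytes = elem_size * elems_per_chunk;

    if (chunk_bytes == 0)
        return 0;
    return file_bytes / chunk_bytes + (file_bytes % chunk_bytes != 0);
}
```

In the question's terms, the call would look like chunks_needed(len, sizeof(CHR), bsize), so the wchar_t path counts 84 * sizeof(wchar_t) bytes per chunk instead of 84.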

I would suggest changing your variable names to accurately describe what they mean; you'll find it much easier to spot errors if you do.


You may also have buffer overrun errors.

int i = 0;
for(i=0;i<alen;++i)
{
fread(&cha[i],sizeof(CHR),bsize,inFile);
printf("[%i]%s",i,cha[i]);
}

In the above loop, you are reading a quantity of bsize at each position in the cha array. Unless bsize is one, you will have buffer overflow problems and the data in the array will not match the data in the file.

With Unicode, I don't think you can use binary I/O. Because Unicode uses more than one byte for representing characters, you run into byte ordering issues (Big Endian vs. Little Endian). If your machine architecture has the same endianness as the Unicode specification, you will have no problems. But if the program is run on a different architecture...
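For what it's worth, the byte-order concern can be handled explicitly if the raw bytes are read first and each code unit is assembled by hand. The sketch below assumes the file is UTF-16, which the original post does not state, and both helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original post): builds one UTF-16 code
   unit from two bytes. Little-endian if little_endian is nonzero,
   big-endian otherwise. The result is the same on any host architecture. */
uint16_t utf16_unit(const unsigned char *p, int little_endian)
{
    if (little_endian)
        return (uint16_t)(p[0] | (p[1] << 8));
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical helper: inspects a leading byte order mark, assuming UTF-16.
   Returns 1 for little-endian, 0 for big-endian, -1 if no BOM is present. */
int utf16_bom_order(const unsigned char *p, size_t n)
{
    if (n < 2)
        return -1;
    if (p[0] == 0xFF && p[1] == 0xFE)
        return 1;
    if (p[0] == 0xFE && p[1] == 0xFF)
        return 0;
    return -1;
}
```

Because the bytes are combined with shifts rather than reinterpreted in memory, the host's own endianness does not affect the result.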

As others have stated, don't develop specific code to handle the switching between Unicode and ASCII (8-bit). Look in the compiler manual and use methods that will operate on either Unicode or ASCII, depending on the compiler switch. Only write new code when the compiler or OS doesn't have the functionality you need. In this case, you need an fread that will operate on either; but definitely not fread.
