XOR on a very big file

I would like to XOR a very big file (~50 GB).

More precisely, I would like to do so by XORing each block of 32 bytes of a plaintext file (because of lack of memory) with the key 3847611839 and create (block after block) a new cipher file.

Thank you for any help!!


This sounded like fun, and it doesn't sound like a homework assignment.

I don't have a previously XOR-encrypted file to try it on, but if you convert a file there and back, there's no diff.

At least that's what I tried. Enjoy! :) This XORs every 4 bytes with 0xE555E5BF, which I presume is what you wanted.

Here's bloxor.c

// bloxor.c - by Peter Boström 2009, public domain, use as you see fit. :)

#include <stdio.h>

unsigned int xormask = 0xE555E5BF; //3847611839 in hex.

int main(int argc, char *argv[])
{
    printf("%x\n", xormask);
    if(argc < 3)
    {
        printf("usage: bloxor 'file' 'outfile'\n");
        return -1;
    }

    FILE *in = fopen(argv[1], "rb");
    if(in == NULL)
    {
        printf("Cannot open: %s", argv[2]);
        return -1;
    }

    FILE *out = fopen(argv[2], "wb");

    if(out == NULL)
    {
        fclose(in);
        printf("unable to open '%s' for writing.",argv[2]);
        return -1;
    }
    unsigned int buffer[256]; // 1024 bytes per block (presuming that's a good size, I dunno...); an unsigned int array keeps the XOR loop simple

    size_t count;

    while((count = fread(buffer, 1, sizeof buffer, in)) > 0)
    {
        int i;
        int end = count/4;
        if(count % 4)
            ++end;

        for(i = 0; i < end; ++i)
        {
            buffer[i] ^= xormask;
        }
        if(fwrite(buffer, 1, count, out) != count)
        {
            fclose(in);
            fclose(out);

            printf("cannot write, disk full?\n");

            return -1;
        }
    }

    fclose(in);
    fclose(out);

    return 0;
}


As starblue mentioned in a comment, "Be aware that this is at best obfuscation, not encryption". And it's probably not even obfuscation.

One property of XOR is that (Y xor 0) == Y. What this means for your algorithm is that for anyplace in your very big file where there are runs of zeros (which seems pretty likely given the size of the file), your key will show up in the cipher file. Plain as day.
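
A tiny demonstration of that property, assuming the same 4-byte mask as bloxor.c above:

#include <stdio.h>

int main(void)
{
    unsigned int key = 0xE555E5BF;  /* the mask used in bloxor.c */
    unsigned int zeros = 0;         /* a 4-byte run of zeros in the plaintext */

    /* "Encrypting" a zero block just copies the key into the cipher file. */
    printf("%08x\n", zeros ^ key);  /* prints e555e5bf */
    return 0;
}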

Another nice feature of XOR-encrypted data is that if someone has both the plaintext and the ciphertext, XORing the two together produces an output in which the key used to perform the cipher is repeated over and over. If the person knows the two files are a plaintext/ciphertext pair, they've learned the key, which is bad if the key is used for more than one encryption. Even if the attacker isn't sure whether the plaintext and ciphertext are related, they have a pretty good idea after this, since the key shows up as a repeated pattern in the output. None of this is a problem with a one-time pad, because each bit of the key is used only once, so no one learns anything new from this attack.
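
A rough sketch of that attack, again assuming the 0xE555E5BF mask (the plaintext words here are just made-up values):

#include <stdio.h>

int main(void)
{
    unsigned int key = 0xE555E5BF;                                   /* same mask as above */
    unsigned int plain[3] = { 0x48656C6C, 0x6F2C2077, 0x6F726C64 };  /* known plaintext */
    unsigned int cipher[3];
    int i;

    for (i = 0; i < 3; ++i)
        cipher[i] = plain[i] ^ key;     /* what the attacker intercepted */

    /* XORing each plaintext/ciphertext pair prints the key three times. */
    for (i = 0; i < 3; ++i)
        printf("%08x\n", plain[i] ^ cipher[i]);

    return 0;
}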

A lot of people make the mistake of assuming that because a one-time pad is provably unbreakable, an XOR encryption might be OK 'if done well', since the fundamental operation performed is the same. The difference is that a one-time pad uses each random bit of the key exactly once. So, among other things, if the plaintext has a run of zeros, nothing is learned about the key, unlike with a simple fixed-key XOR cipher.

As Bruce Schneier said: "There are two kinds of cryptography in this world: cryptography that will stop your kid sister from reading your files, and cryptography that will stop major governments from reading your files."

An XOR cipher is barely kid sister proof - if even that.


You need to craft a solution around a streaming architecture: read the input file as a stream, modify it, and write the result to the output file.

This way, you don't have to read the whole file at once.


If your question is how to do it without using extra space on the disk, I would just read in the chunks in multiples of 32 bytes (as big as you can), work with the chunk in memory, then write it out again. You should be able to use the ftell and fseek functions to do that (assuming your long type is large enough, of course).
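
A rough, untested sketch of that in-place approach, reusing the mask from above. For a real 50 GB file you'd want fseeko/ftello (or your platform's 64-bit equivalents), since plain fseek takes a long:

#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc < 2)
    {
        printf("usage: inplacexor 'file'\n");
        return -1;
    }

    FILE *f = fopen(argv[1], "rb+");       /* read and write, no truncation */
    if (f == NULL)
    {
        printf("Cannot open: %s\n", argv[1]);
        return -1;
    }

    unsigned int mask = 0xE555E5BF;        /* 3847611839 */
    unsigned int buffer[8];                /* 32-byte blocks, as in the question */
    size_t count;
    long pos = 0;

    while ((count = fread(buffer, 1, sizeof buffer, f)) > 0)
    {
        size_t i, words = (count + 3) / 4; /* round up for a partial last word */
        for (i = 0; i < words; ++i)
            buffer[i] ^= mask;

        fseek(f, pos, SEEK_SET);           /* a seek is required between the read and the write */
        if (fwrite(buffer, 1, count, f) != count)
        {
            printf("cannot write, disk full?\n");
            fclose(f);
            return -1;
        }
        fseek(f, 0, SEEK_CUR);             /* ...and again before the next read */
        pos += (long)count;
    }

    fclose(f);
    return 0;
}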

It may be faster to memory-map the file if you can spare that much out of your address space (and your OS supports it) but I'd try the easiest solution first.
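
If you do go the memory-mapped route, a POSIX-only, untested sketch might look like this (a 50 GB mapping needs a 64-bit process; a careful version would map the file in smaller windows):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    if (argc < 2)
    {
        printf("usage: mmapxor 'file'\n");
        return -1;
    }

    int fd = open(argv[1], O_RDWR);
    if (fd < 0)
    {
        perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) < 0)
    {
        perror("fstat");
        close(fd);
        return -1;
    }

    unsigned char *p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
    {
        perror("mmap");
        close(fd);
        return -1;
    }

    /* XOR byte by byte so an odd-sized file still works; note that bloxor.c
       applies the mask in native word order, so pick the byte order you need. */
    unsigned char mask[4] = { 0xE5, 0x55, 0xE5, 0xBF };
    off_t i;
    for (i = 0; i < st.st_size; ++i)
        p[i] ^= mask[i % 4];

    munmap(p, st.st_size);
    close(fd);
    return 0;
}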

Of course, if space isn't a problem, just read the chunks in and write them to a new file, something like the following (pseudo-code):

open infile
open outfile
while not end of infile:
    read chunk from file
    change chunk
    write chunk to outfile
close outfile
close infile
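
In C that's essentially bloxor.c above, shrunk to the 32-byte blocks the question asked for; an untested sketch (the file names are hypothetical):

#include <stdio.h>

int main(void)
{
    unsigned int mask = 0xE555E5BF;        /* 3847611839 */
    unsigned int chunk[8];                 /* one 32-byte block */
    size_t count, i;

    /* hypothetical file names, for illustration only */
    FILE *in  = fopen("plain.bin", "rb");
    FILE *out = fopen("cipher.bin", "wb");
    if (in == NULL || out == NULL)
        return -1;

    while ((count = fread(chunk, 1, sizeof chunk, in)) > 0)
    {
        for (i = 0; i < (count + 3) / 4; ++i)   /* change chunk */
            chunk[i] ^= mask;
        if (fwrite(chunk, 1, count, out) != count)
            break;                              /* write error */
    }

    fclose(out);
    fclose(in);
    return 0;
}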

This sort of read/process/write is pretty basic stuff. If you have more complicated requirements, you should update your question with them.
