Application-specific data and how to handle it?
I am curious as to how applications generate their own data that is used with the application itself. For example, if you take any kind of PC game's save file or some sort of program that generates binary data like Photoshop's PSD files or .torrent files for BitTorrent applications, I'd assume they are all specific to the corresponding application and that the authors of that application programmed the way this data was created. My first question is: is that true? I am 99% positive that it is binary data because when opening a PSD file or a .torrent file in Notepad++, it's easy to see that it's nothing that can be read by a human...
My second question is: if I wanted to make an application that generates its own data in binary format (no plain-text or anything that's easily manipulated), how would I go about handling this data? I can vaguely picture generating this data and saving it to a file in binary format, but I am really stuck on how I'd handle this data when it's needed by the application again. Since this type of data is not plain text and can't be treated as a string or anything like that, how is it that applications create and handle/parse their own binary data (or any binary data in general)?
I can obviously see that when you open a PSD file, Photoshop opens and it displays whatever the PSD file contained. But how do many applications handle these formats? I am just not seeing how to parse this specific data (or binary data in general) and programmatically do what you want to with it.
Well, as a simple example, let's take bitmaps.
Bitmaps have a standard file structure, which is defined by the info header and file header.
In the Wikipedia article (link: http://en.wikipedia.org/wiki/BMP_file_format) you'll see that the info header has a well-defined format, as does the file header.
Each of these is written out as binary as is, and read back in as binary as is. Then the actual bitmap image data is written out as binary.
Other applications may choose a custom plain-text format instead, in which case the file must be written in a consistent manner and should carry some form of version information, so that newer releases can add features to the file without breaking older readers.
Look up serialization, though; it's a rather broad topic and there are lots of approaches to it.
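As a rough sketch of the versioned plain-text option, here is one way it could look. The "version N" first line, the key=value layout and the function names writeSettings/readSettings are all made up for illustration; they are not taken from any real application.

```cpp
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Write settings as "key=value" lines, preceded by a version line.
void writeSettings(std::ostream &out, const std::map<std::string, std::string> &settings)
{
    out << "version 2\n";
    for (const auto &entry : settings)
        out << entry.first << '=' << entry.second << '\n';
}

// Read settings back. Returns false if the version line is missing
// or names a version this reader does not understand.
bool readSettings(std::istream &in, std::map<std::string, std::string> &settings)
{
    std::string word;
    int version = 0;
    if (!(in >> word >> version) || word != "version" || version < 1 || version > 2)
        return false;
    std::string line;
    std::getline(in, line); // consume the rest of the version line
    while (std::getline(in, line))
    {
        std::string::size_type eq = line.find('=');
        if (eq == std::string::npos)
            continue; // skip lines without a key=value pair
        settings[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return true;
}
```

Because the reader checks the version before anything else, a file written by a newer release is rejected cleanly instead of being half-understood.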
Edit: Here is a code sample (not optimal) for reading in bitmaps (or writing them, with the right modifications):
#include <cstdint>
#include <cstdlib>
#include <fstream>
#include <string>
#include <vector>

// Tell the compiler to pack on a 2-byte boundary.
// Necessary so if you write to file, it only writes 14 bytes and not 16.
#pragma pack(push, 2)
struct BMFH // file header (14 bytes)
{
    std::uint16_t bfType; // "BM" signature
    std::uint32_t bfSize;
    std::uint16_t bfReserved0;
    std::uint16_t bfReserved1;
    std::uint32_t bfOffBits; // offset of the pixel data from the start of the file
};
#pragma pack(pop)
struct BMIH // info header (40 bytes)
{
    std::uint32_t biSize;
    std::int32_t biWidth;
    std::int32_t biHeight;
    std::uint16_t biPlanes;
    std::uint16_t biBitCount;
    std::uint32_t biCompression;
    std::uint32_t biImageSize;
    std::int32_t biXPelsPerMeter;
    std::int32_t biYPelsPerMeter;
    std::uint32_t biClrUsed;
    std::uint32_t biClrImportant;
};
BMFH fileheader;
BMIH infoheader;
std::fstream file(filename.c_str(), std::ios::in | std::ios::binary);
// Read in the file header, then the info header
file.read((char *) &fileheader, sizeof(fileheader));
file.read((char *) &infoheader, sizeof(infoheader));
// Calculate size of image (each row is padded to a multiple of 4 bytes)
int rowBytes = (infoheader.biWidth * infoheader.biBitCount + 31) / 32 * 4;
int bytes = rowBytes * std::abs(infoheader.biHeight);
// Read in the image to a buffer
std::vector<unsigned char> data(bytes);
file.seekg(fileheader.bfOffBits);
file.read((char *) data.data(), bytes);
file.close();
That code is actually a drastic simplification and completely ignores all sorts of issues, such as what happens if the file headers or data are corrupt, if the file is incomplete, etc. But it's just meant as a proof of concept. The #pragma pack directives force the file header to be laid out without padding; they originated in Visual Studio, though GCC and Clang accept them too. The fixed-width integer types are used because the size of short and long differs between compilers.
When we write this out to a file, we don't format each field as text ("Okay, now write out this integer"). Instead, we copy the in-memory bytes of each structure straight into the file. For example, code that you might (but shouldn't, because it depends on the compiler's struct layout and the machine's byte order) use to write it would look like:
// Assume for argument's sake these data structures came pre-filled
BMFH fileheader;
BMIH infoheader;
unsigned char *data;
int rowBytes = (infoheader.biWidth * infoheader.biBitCount + 31) / 32 * 4;
int bytes = rowBytes * std::abs(infoheader.biHeight);
std::fstream file("MyImage.bitmap", std::ios::out | std::ios::binary);
file.write((char *) &fileheader, sizeof(BMFH));
file.write((char *) &infoheader, sizeof(BMIH));
file.write((char *) data, sizeof(unsigned char) * bytes);
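A more portable alternative to dumping whole structs is to write each field out byte by byte in a fixed order. The sketch below builds the 14-byte bitmap file header that way, assuming little-endian fields as BMP uses. The helper names (putU16, putU32, getU16, getU32, encodeFileHeader) are invented for this example and do not come from any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append a 16-bit value to the buffer, least significant byte first.
void putU16(std::vector<unsigned char> &out, std::uint16_t v)
{
    out.push_back(static_cast<unsigned char>(v & 0xFF));
    out.push_back(static_cast<unsigned char>((v >> 8) & 0xFF));
}

// Append a 32-bit value to the buffer, least significant byte first.
void putU32(std::vector<unsigned char> &out, std::uint32_t v)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFF));
}

// Read a 16-bit little-endian value starting at offset pos.
std::uint16_t getU16(const std::vector<unsigned char> &in, std::size_t pos)
{
    return static_cast<std::uint16_t>(in[pos] | (in[pos + 1] << 8));
}

// Read a 32-bit little-endian value starting at offset pos.
std::uint32_t getU32(const std::vector<unsigned char> &in, std::size_t pos)
{
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
    return v;
}

// Build the 14-byte file header without relying on struct layout.
std::vector<unsigned char> encodeFileHeader(std::uint32_t fileSize, std::uint32_t pixelOffset)
{
    std::vector<unsigned char> out;
    putU16(out, 0x4D42); // stored as the bytes 'B', 'M'
    putU32(out, fileSize);
    putU16(out, 0); // reserved
    putU16(out, 0); // reserved
    putU32(out, pixelOffset);
    return out;
}
```

The resulting buffer can then be handed to file.write in one call, and the same bytes come out regardless of compiler padding or the machine's native byte order.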
Read up on binary serialization on MSDN. The .NET Framework goes a long way toward helping with this.
Yes, many applications use some sort of application-specific binary format that cannot be easily manipulated. To create your own binary format, there are some options:
- Using a binary serialization technique.
- Using I/O classes to read and write the bytes manually, in effect creating a random-access file.
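To make the second option concrete, here is a minimal sketch of a hand-rolled binary format for a game save. Everything about the format is an assumption made up for illustration: the "SAVE" magic number, the version field, the SaveGame fields and the 1024-byte name limit.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical save-game record; the fields are illustrative only.
struct SaveGame
{
    std::uint32_t level;
    std::string playerName;
};

// Write a 32-bit value as four bytes, least significant first.
void writeU32(std::ostream &out, std::uint32_t v)
{
    for (int i = 0; i < 4; ++i)
        out.put(static_cast<char>((v >> (8 * i)) & 0xFF));
}

// Read four bytes back into a 32-bit value; returns false on a short read.
bool readU32(std::istream &in, std::uint32_t &v)
{
    unsigned char b[4];
    if (!in.read(reinterpret_cast<char *>(b), 4))
        return false;
    v = b[0] | (b[1] << 8) | (b[2] << 16) | (static_cast<std::uint32_t>(b[3]) << 24);
    return true;
}

void saveGame(std::ostream &out, const SaveGame &g)
{
    out.write("SAVE", 4); // magic number identifying the format
    writeU32(out, 1);     // format version
    writeU32(out, g.level);
    writeU32(out, static_cast<std::uint32_t>(g.playerName.size()));
    out.write(g.playerName.data(), static_cast<std::streamsize>(g.playerName.size()));
}

bool loadGame(std::istream &in, SaveGame &g)
{
    char magic[4];
    std::uint32_t version = 0, nameLength = 0;
    if (!in.read(magic, 4) || std::string(magic, 4) != "SAVE")
        return false; // not our format
    if (!readU32(in, version) || version != 1)
        return false; // unknown version
    if (!readU32(in, g.level) || !readU32(in, nameLength))
        return false; // truncated file
    if (nameLength > 1024)
        return false; // sanity limit against corrupt lengths
    g.playerName.resize(nameLength);
    if (nameLength == 0)
        return true;
    return static_cast<bool>(in.read(&g.playerName[0], nameLength));
}
```

The functions take generic streams, so the same code works with a std::stringstream for testing or with a std::ofstream / std::ifstream opened with std::ios::binary for a real file. The reader is simply the writer in reverse: since the application defines the order and size of every field, it knows exactly how to interpret the bytes when it loads them again.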