Using reinterpret_cast to save a struct or class to a file
This is something the professor showed us in his scripts. I have not used this method in any code I have written.
Basically, we take a class or struct, reinterpret_cast a pointer to it as char*, and write out the entire object like so:
#include <fstream>
#include <string>

struct Account
{
    Account()
    { }
    Account(std::string one, std::string two)
        : login_(one), pass_(two)
    { }
private:
    std::string login_;
    std::string pass_;
};

int main()
{
    Account *acc = new Account("Christian", "abc123");
    std::ofstream out("File.txt", std::ios::binary);
    // Dumps the raw bytes of the object, not the logical string contents.
    out.write(reinterpret_cast<char*>(acc), sizeof(Account));
    out.close();
    delete acc;
}
This produces the following output (in the file):
ÍÍÍÍChristian ÍÍÍÍÍÍ ÍÍÍÍabc123 ÍÍÍÍÍÍÍÍÍ
I'm confused. Does this method actually work, or does it cause UB because magical things happen within classes and structs that are at the whims of individual compilers?
It doesn't actually work, but it also does not cause undefined behavior.
In C++ it is legal to reinterpret any object as an array of char, so there is no undefined behavior here.
The results, however, are usually only usable if the class is POD (effectively, if the class is a simple C-style struct) and self-contained (that is, the struct doesn't have pointer data members).
Here, Account is not POD because it has std::string members. The internals of std::string are implementation-defined, but it is not POD and it usually has pointers that refer to some heap-allocated block where the actual string is stored (in your specific example, the implementation is using a small-string optimization where the value of the string is stored in the std::string object itself).
There are a few issues:
You aren't always going to get the results you expect. If you had a longer string, the std::string would use a buffer allocated on the heap to store the string, so you would end up serializing just the pointer, not the pointed-to string.
You can't actually use the data you've serialized here. You can't just reinterpret the data as an Account and expect it to work, because the std::string constructors would not get called.
In short, you cannot use this approach for serializing complex data structures.
It's not undefined. Rather, it's platform-dependent or implementation-defined behavior. This is, in general, bad code, because different versions of the same compiler, or even different switches on the same compiler, can break your save file format.
This could work depending on the contents of the struct, and the platform on which the data is read back. This is a risky, non-portable hack which your teacher should not be propagating.
Do you have pointers or ints in the struct? Pointers will be invalid in the new process when read back, and int format is not the same on all machines (to name but two show-stopping problems with this approach). Anything that's pointed to as part of an object graph will not be handled. Structure packing could be different on the target machine (32-bit vs 64-bit) or even due to compiler options changing on the same hardware, making sizeof(Account) unreliable as a read back data size.
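To make the packing and int-format points concrete, here is a small sketch that encodes a struct field by field instead of dumping it. Record, encode and decode are hypothetical names chosen for this example; the byte order (little-endian) is simply a choice the format makes and documents.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical example type. sizeof(Record) includes whatever padding the
// compiler inserts after 'kind' (commonly 8 bytes in total, not 5).
struct Record
{
    std::uint8_t  kind;
    std::uint32_t balance;
};

// Encodes each field explicitly: 1 byte for 'kind', then 'balance' as
// 4 bytes, least significant byte first. No padding reaches the output.
std::vector<unsigned char> encode(const Record& r)
{
    std::vector<unsigned char> bytes;
    bytes.push_back(r.kind);
    for (int shift = 0; shift < 32; shift += 8)
        bytes.push_back(static_cast<unsigned char>((r.balance >> shift) & 0xFF));
    return bytes;
}

// Rebuilds a Record from the 5-byte encoding. vector::at throws
// std::out_of_range if the input is too short.
Record decode(const std::vector<unsigned char>& bytes)
{
    Record r{};
    r.kind = bytes.at(0);
    for (int i = 0; i < 4; ++i)
        r.balance |= static_cast<std::uint32_t>(bytes.at(1 + i)) << (8 * i);
    return r;
}
```

The encoded size is always 5 bytes here, whatever sizeof(Record) happens to be on a given compiler, and the shifts produce the same byte sequence on big-endian and little-endian machines.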
For a better solution, look at a serialization library which handles those issues for you. Boost.Serialization is a good example.
Here, we use the term "serialization" to mean the reversible deconstruction of an arbitrary set of C++ data structures to a sequence of bytes. Such a system can be used to reconstitute an equivalent structure in another program context. Depending on the context, this might be used to implement object persistence, remote parameter passing or other facility.
Google Protocol Buffers also works well for simple object hierarchies.
It's no substitute for proper serialization. Consider the case of any complex type that contains pointers - if you save the pointers to a file, when you load them up later, they're not going to point to anything meaningful.
Additionally, it's likely to break if the code changes, or even if it's recompiled with different compiler options.
So it's really only useful for short-term storage of simple types - and in doing so, it takes up way more space than necessary for that task.
This method, if it works at all, is far from robust. It is much better to decide on some "serialized" form, whether it is binary, text, XML, etc., and write that out.
The key here: You need a function/code to reliably convert your class or struct to/from a series of bytes. reinterpret_cast does not do this, as the exact bytes in memory used to represent the class or struct can change for things like padding, order of members, etc.
No.
In order for it to work, the structure must be a POD (plain old data: only simple data members and POD data members, no virtual functions... probably some other restrictions which I can't remember).
So if you wanted to do that, you'd need a struct like this:
struct Account {
    char login[20];
    char password[20];
};
Note that std::string is not a POD, so you'd need plain arrays.
Still, not a good approach for you. Keyword: "serialization" :).
Some versions of std::string don't actually use dynamic memory when the string is small. Instead, they store the characters inside the string object itself.
Think of this:
struct SimpleString
{
    char* begin;      // beginning of string
    char* end;        // end of string
    char* allocEnd;   // end of allocated buffer (end <= allocEnd)
    int*  shareCount; // Strings may be copy-on-write (pre-C++11 implementations);
                      // as a result you need to track the number of users
                      // of this buffer
};
Now, on a 64-bit system each pointer is 8 bytes, so this structure takes 32 bytes. Thus a string of less than 32 bytes could fit into the same structure without allocating a buffer.
struct CompressedString
{
    char buffer[sizeof(SimpleString)];
};
struct OptString
{
    int type; // Normal / Compressed
    union
    {
        SimpleString     simple;
        CompressedString compressed;
    };
};
So this is what I believe is happening above.
A very efficient string implementation is being used, which lets you dump the object to a file without worrying about pointers (as the std::string objects are not using pointers for these short strings).
Obviously this is not portable as it depends on the implementation details of std::string.
So interesting trick, but not portable (and liable to break easily without some compile time checks).
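If you want to see which case you are in, here is a rough sketch that checks whether a string's characters live inside the std::string object or somewhere else. stored_inline is a made-up helper; it only inspects addresses, and what it reports for short strings depends entirely on your standard library.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: returns true if the character buffer of 's' lies
// within the bytes of the std::string object itself (small-string
// optimization), false if it lives elsewhere (typically the heap).
bool stored_inline(const std::string& s)
{
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end   = obj_begin + sizeof(std::string);
    // std::less gives a well-defined ordering even for unrelated pointers.
    std::less<const char*> before;
    return !before(s.data(), obj_begin) && before(s.data(), obj_end);
}
```

A string much longer than sizeof(std::string) can never be stored inline, so for it the raw dump would contain a pointer rather than the text. That is exactly the case where the trick from the question falls apart.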