What's the fastest way to deserialize a tree in C++
I'm working with a not so small tree structure (it's a Burkhard-Keller-Tree, > 100 MB in memory) implemented in C++. The pointers to the children of each node are stored in a QHash.
Each node x has n children y[1] ... y[n], the edges to the children are labeled with the edit distance d(x, y[i]), so using a hash to store the nodes is an obvious solution.
class Node {
int value;
QHash<int, Node*> children;
/* ... */
};
I also want to serialize and deserialize it into a file (I currently use a QDataStream). The tree is just built once and doesn't change then.
Building the tree and deserializing it is rather slow. I'm loading the tree in the obvious way: Recursively building each node. I think this is suboptimal due to the many nodes that are created seperately with the new
operator. I read somewhere that new
is pretty slow. The initial build is not a big problem because the tree's rather stable and doesn't have to be rebuilt very often. But loading the tree from a file should be as fast as possible.
What's the best way to accomplish this?
It must be much better to save the whole tree in a single memory block with adjacent nodes. Serializing and deserializing would then be reduced to save and load the whole block, which I have to allocate just once.
But to implement this I would have to re-implement the QHash, AFAIK.
What would you do to speed up the deserialization?
Addendum
Thank you for your suggestion to do some profi开发者_JAVA技巧ling. Here are the results:
While rebuilding the tree from a file
1 % of the time is consumed by my own new calls
65 % is consumed by loading the QHash objects (this is implemented by the
Qt Library) of each node
12 % is consumed by inserting the nodes into the existing tree
20 % is everything else
So it's definitly not my new calls which cause the delay but the rebuild of the QHash objects at every node. This is basically done with:
QDataStream in(&infile);
in >> node.hash;
Do I have to dig into QHash and look what's going on under the hood there? I think the best solution would be a hash object that can be serialized with a single read and write operation without the need to rebuild the internal data structure.
First of all - profile your application so that you know what takes time - basing the suspicion on new because you've read somewhere it can be slow or on the iteration through the tree is not enough.
It's possible it's the IO operations - maybe your file format is not correct/inefficient.
Maybe you just have a defect somewhere?
Or maybe there's a quadratic loop somewhere that you don't remember about causing the problems? :)
Measure what really takes time in your case and then approach the issue - it'll save you a lot of time and you'll avoid breaking your design/code to fix performance issues that don't exist before finding the real cause.
Another approach would be to serialize your pointers and restore them when loading. I mean:
Serializing:
nodeList = collectAllNodes();
for n in nodelist:
write ( &n )
writeNode( n ) //with pointers as-they-are.
Deserializing:
//read all nodes into a list.
while ( ! eof(f))
read( prevNodeAddress)
readNode( node )
fixMap[prevNodeAddress] = &node;
nodeList.append(node);
//fix pointers to new values.
for n in nodeList:
for child in n.children:
child->node = fixMap[child->node]
This way if you don't insert-remove new nodes you can allocate a vector once and use that memory, reducing your allocation to the maps ( as rpg said, it might be faster with lists or even vectors).
I highly recommend the boost serialization library. It should work with the solutions you're using.
The absolute fastest way of serialising/deserialising is writing a block of contiguous memory to disk as you say. If you changed your tree structure to create this (probably using a custom allocation routine) this would be very easy.
Unfortunately I'm not that familiar with QHash, but from looking at it it looks like a Hashtable rather than a tree. Have I misunderstood you? Are you using this to map duplicate nodes?
I'd use a profiler (I used to use Quantify, now called Rational PurifyPlus, but there are a lot listed here) to find where you are using time, but I'd guess it is either multiple memory allocations rather than a single allocation, or multiple reads rather than a single read. To solve both these problems you know in advance (because you store it) how many nodes you need, then write/read an array of nodes of the correct length, where each pointer is an index into the array, rather than a pointer in memory.
Another solution would be to use your own memory allocator, which will use a continuous memory space. Then you'll be able to dump the memory as is and load it back. It's platform (i.e. big endian/little endian, 32bit/64bit) sensitive.
As you said, allocating objects with new might be slow. That can be improved allocating an object pool and then using pre-allocated objects until the pool is exhausted. You might even be able to implement this to work in background by overloading the new/delete operators of the class in question.
Your own memory allocation with an overloaded operator new() and delete() is a low-cost option (development time). However, this only affects the memory allocation time, and not the Ctor times. Your mileage may vary, but may worth a try.
I'll expand my comment a bit:
Since your profiling suggests that the QHash serialization takes the most time, I believe that replacing QHash with a QList would yield a significant improvement when it comes to deserialization speed.
The QHash serialization just outputs the key/value pairs, but the deserialization constructs a hash data structure!
Even if you said that you need the fast child lookup, I would recommend that you try replacing QHash with a QList > as a test. If there aren't many children for each node (say, less than 30), the lookup should still be fast enough even with a QList. If you find that QList is not fast enough, you could still use it just for (de)serializaton and later convert to a hash once the tree has been loaded.
精彩评论