开发者

How to rebuild a BST from a file

My C++ program creates an unbalanced BST from user input, and saves it do disk. It does this by first indexing each node by doing a pre-order traversal and assigning each node a unique number. Next, it outputs the BST to a file. It does this by doing a pre-order traversal and then for each node printing to file the data value, the index number of the left child, and the index number of the right child.

So after the BST is saved to disk, the BST in memory is destroyed. I want to read in the file and recreate the BST, so开发者_如何转开发 that it will be exactly as it was before.


Read all nodes into memory along with left and right child index (say hashtable). Then start from root (which is the first node in the file) and find the left and right child by index from hashtable. Repeat the same process for children nodes in DFS or BFS manner. Either should work.

You can optimize this to get rid of loading entire data into memory before constructing the tree. You can read the nodes and construct the tree in DFS manner. So adding left child is straight forward. While adding right child you have to check the index number. If the dont match, try to add that child in node higher than current node in DFS ordering.


Postorder would be simpler. You can just push restored nodes onto a stack and the children to be linked to the parent would always be the topmost entries of the stack.

You would only need one pass as well, as long as you recorded with each node which children were saved so you knew what to pop off the stack and which children to assign.


If you store a binary search tree as a pre-order traversal, then the tree that you'd get simply by inserting the elements one at a time as you read them out would be the same tree as you started with. That would of course require O(n log n) time. If you're willing to store the external nodes (Nulls) Then you could do something like the following:

ReadBSTPreOrder(node ** target) {

    node * n = readNode();
    *target = n;

    if (n == NULL) return;
    ReadBSTPreOrder(&node->left);
    ReadBSTPreOrder(&node->right);
}

This has the added advantage of working for binary trees that aren't BST's as well. Storing the Nulls can be a single bit if you're willing to monkey around with the representation, but a single byte tag for a null and a different tag for a record should do. That'll save you from writing out the indices as well.


Let's assume you know the size of the tree (N) up-front. Maybe it's the first line in the file. If not, it's easy to adapt this to dynamically re-size the index vector. Note that this is semi-pseudocode:

// Only needed while parsing the file
std::vector<Node*> index(N, NULL);

// We can always create the root node.
// This simplifies the while loop below.
index[0] = createNode(0);

while (!in.eof()) {
    int nodeID = -1, leftID = -1, rightID = -1;
    parseNode(in, &nodeID, &leftID, &rightID);

    // Guaranteed to be non-NULL
    Node* node = index[nodeID];

    // if leftID or rightID is -1, createNode()
    // will simply return NULL.
    index[leftID] = createNode(leftID);
    index[rightID] = createNode(rightID);

    node->setLeftChild(index[leftID]);
    node->setRightChild(index[rightID]);
}

The reason node = index[nodeID] is guaranteed to be non-NULL is because of the pre-order indexing/write/read. We pre-created the root node at the start, and all other nodes are created by the parent as a left or right child.

EDIT: I just realized we don't need a full master index. We only need to store potential right-child nodes to expand in a stack. Here's that version of the algorithm:

// Candidate right-child nodes to expand
std::stack<Node*> rightNodes;

// Pre-create the root node as "left child"
Node* left = createNode(0);

while (!in.eof()) {
    // We already know the next node. It is the previous
    // node's left child (or root), or the nearest
    // parent's right child.
    Node* node;

    if (left != NULL) {
        node = left;
    }
    else {
        node = rightNodes.top();
        rightNodes.pop();
    }

    parseLine(in, &nodeID, &leftID, &rightID);
    assert(node->ID() == nodeID);

    // if leftID or rightID is -1, createNode()
    // will simply return NULL.
    left = createNode(leftID);
    Node* right = createNode(rightID);

    node->setLeftChild(left);
    node->setRigthChild(right);

    if (right != NULL) {
        rightNodes.push(right);
    }
}


Assuming you're not going for the best bound you could use this simple algorithm.

nodes_list = []
For each line i:
    Search for node i in nodes_list
    if(found):
        node_i = found_node
        node_i.set_value(line_value)
    else:
        node_i = new node(i)
        node_i.set_value(line_value)
        nodes_list.add(node_i)
    Search for line_left_node in nodes_list
    if(found):
        node_i.set_left_node(found_node)
    else:
        left_node = new node(line_left_node)
        node_i.set_left_node(left_node)
        nodes_list.add(left_node)
    Search for line_right_node in nodes_list
    if(found):
        node_i.set_right_node(found_node)
    else:
        right_node = new node(line_right_node)
        node_i.set_right_node(right_node)
        nodes_list.add(right_node)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜