Finding the largest subtree in a BST
Given a binary tree, I want to find out the largest subtree which is a BST in it.
Naive approach:
I have a naive approach in mind where I visit every node of the tree and pass this node to a isBST function. I will also keep track of the number of nodes in a sub-tree if it is 开发者_如何学编程a BST.
Is there a better approach than this ?
I have posted the full solution and explanation in my blog:
http://www.leetcode.com/2010/11/largest-binary-search-tree-bst-in.html
The idea is to do a depth-first traversal and pass min and max values bottom-up instead of top-down. In other words, we verify the deeper nodes before we verify if the above nodes satisfy the BST requirements.
The main reason of doing this is when one of the nodes does not satisfy the BST properties, all subtrees above (which includes this node as well) must also not satisfy the BST requirements.
As compared to the top-down approach, the bottom-up approach is such an awesome choice because the results for total number of nodes could be passed up the tree. This saves us from recalculating over and over again. The total number of nodes for a subtree is simply the total number of nodes of its left and right subtrees plus one.
This algorithm runs in O(N) time complexity and O(1) space, which is as efficient as it can be. (where N is the total number of nodes in the binary tree).
Below is the C++ code that works. The tricky part of the implementation is taking care of how min and max values are passed bottom-up. Please note that I did not initialize min and max values as they are initialized in the bottom of the tree.
// Find the largest BST subtree in a binary tree.
// If the subtree is a BST, return total number of nodes.
// If the subtree is not a BST, -1 is returned.
int findLargestBSTSubtree(BinaryTree *p, int &min, int &max,
int &maxNodes, BinaryTree *& largestBST) {
if (!p) return 0;
bool isBST = true;
int leftNodes = findLargestBSTSubtree(p->left, min, max, maxNodes, largestBST);
int currMin = (leftNodes == 0) ? p->data : min;
if (leftNodes == -1 ||
(leftNodes != 0 && p->data <= max))
isBST = false;
int rightNodes = findLargestBSTSubtree(p->right, min, max, maxNodes, largestBST);
int currMax = (rightNodes == 0) ? p->data : max;
if (rightNodes == -1 ||
(rightNodes != 0 && p->data >= min))
isBST = false;
if (isBST) {
min = currMin;
max = currMax;
int totalNodes = leftNodes + rightNodes + 1;
if (totalNodes > maxNodes) {
maxNodes = totalNodes;
largestBST = p;
}
return totalNodes;
} else {
return -1; // This subtree is not a BST
}
}
BinaryTree* findLargestBSTSubtree(BinaryTree *root) {
BinaryTree *largestBST = NULL;
int min, max;
int maxNodes = INT_MIN; // INT_MIN is defined in <climits>
findLargestBSTSubtree(root, min, max, maxNodes, largestBST);
return largestBST;
}
I guess the problem you're trying to solve is finding the largest (with more nodes) BST in BT. In that case you'll need to traverse all the tree nodes checking if it is a BST, and once you find one you'll have to check if it has more nodes than the largest one found till the moment.
class TreeNode
{
public int value;
public TreeNode left;
public TreeNode right;
}
void LargestBST(TreeNode bt, IDictionary<TreeNode, bool> isBST, IDictionary<TreeNode, int> nodeCount, ref TreeNode largestBST)
{
if (bt == null)
return;
if (IsBST(bt, isBST) && (largestBST == null || NodeCount(bt, nodeCount) > NodeCount(largestBST, nodeCount))
largestBST = bt;
else
{
LargestBST(bt.left, isBST, nodeCount, ref largestBST);
LargestBST(bt.Right, isBST, nodeCount, ref largestBST);
}
}
bool IsBST(TreeNode node, IDictionary<TreeNode, bool> isBST)
{
if (node == null)
return true;
bool result;
if (!isBST.TryGetValue(node, out result))
{
TreeNode maxLeft = Max(node.Left);
TreeNode minRight = Min(node.Right);
result = (maxLeft == null || maxLeft.value <= node.value) &&
(minRight == null || minRight.value >= node.value) &&
IsBST(node.left, isBST) && IsBST(node.Right, isBST);
isBST.Add(node, result);
}
return result;
}
TreeNode Max(TreeNode node)
{
if (node == null)
return null;
while (node.right != null)
node = node.right;
return node;
}
TreeNode Min(TreeNode node)
{
if (node == null)
return null;
while (node.left != null)
node = node.left;
return node;
}
int NodeCount(TreeNode node, IDictionary<TreeNode, int> nodeCount)
{
if (node == null)
return 0;
int result;
if (!nodeCount.TryGetValue(node, out result))
{
result = 1 + NodeCount(node.left, nodeCount) + NodeCount(node.right, nodeCount);
nodeCount.Add(node, result);
}
return result;
}
The tree is a BST if its in-order traversal gives you its elements in sorted order. You can use this code here if you want an example implementation: http://placementsindia.blogspot.com/2007/12/c-program-to-check-whether-binary-tree.html
The running time is O(N) where N = number of nodes.
Considering the tree a BST if the root's two subtrees are both BST is wrong (and to the person who deleted his answer that proposed this solution: you shouldn't have deleted your answer, personally I wasn't going to downvote you and there is as much to learn from a bad-but-seemingly-good solution as there is from a good one). Counterexample:
3
/ \
2 4
/ \
1 5
Now, to get the largest subtree that is a BST, consider this tree:
3
/ \
2 4
/ \
1 5
The inorder-traversal is 1 2 5 3 4. I think you can solve your original problem by finding the maximum-length sorted contiguous subsequence in the inorder-traversal. You just have to be careful not to select sequences that don't describe a BST. For example, for:
10
/ \
2 14
/ \ |
1 5 20
The inorder-traversal is 1 2 5 10 20 14. Don't select the 20. I think this can be accomplished by making sure you dismiss elements as long as their selection stops making sense. For example, when you reach 14, dismiss the 20. I'm not sure if this can be efficiently done however. I'll edit my post if I find an exact way.
I would think you could avoid checking if every node is the root of a BST by working top down instead of bottom up. If a subtree is a BST it's going to be larger than any subtree in itself so you don't need to check inside a subtree if it has passed the isBST test. Then you just have isBST return the size of a valid tree and store that along with a pointer to the root of the subtree if you need to be able to find it again instead of just knowing how large the largest one is.
UPDATE:
Some of the code posted here to check if something is a BST are going to fail some cases since they're only checking the parent of a node.
Take for example:
10 / \ 4 99 / 2
This is not a valid BST, (the 2 is out of position with regards to the 10) but if you don't send a min and max value down through the tree you will incorrectly verify it as valid. This pseudocode takes that into account.
main
{
Verify(root, MIN_VALUE, MAX_VALUE)
}
boolean Verify(node , min, max)
{
if(node == null)
return true;
if(node.value > min &&
node.value < max &&
Verify(node.leftchild, min, node.value) &&
Verify(node.rightchild,node.value,max)
{
return true;
}
else
{
return false;
}
}
int getMinMaxValue(Node* root, bool isMin)
{
if (!root)
{
// Not real limits...
return (isMin ? INT_MAX : INT_MIN);
}
int leftVal = getMinMaxValue(root->left, isMin);
int rightVal = getMinMaxValue(root->right, isMin);
if (isMin)
{
return min(root->value, min(leftVal, rightVal));
}
else
{
return max(root->value, max(leftVal, rightVal));
}
}
bool isBST(Node* root)
{
if (!root)
{
return true;
}
Node* left = root->left;
Node* right = root->right;
if (left)
{
if (getMinMaxValue(left, false) > root->value)
{
return false;
}
}
if (right)
{
if (getMinMaxValue(right, true) < root->value)
{
return false;
}
}
return isBST(left) && isBST(right);
}
Then just descend from the root node checking if the subtree is BST, and take the largest one.
To verify if a node is root of a BST , we have to recursively check each left and right children . If you start from root , you will have to recur all children before you can decide root of the binary tree is a BST or not . So it do not make any sense to call "isBST" for each node . Here's my approach
- Start from root
- Find max from left and right
- If not max on left and right return "NOT BST"
- if BST on left , check if it's bigger than the max so far . If yes store it and return "NOT BST"
- if BST on right, check if it's bigger than the max so far . If yes store it and return "NOT BST"
- If left and right are BST , there's a new BST with current root as ROOT and with number of nodes left + right + 1
Couple of challenges in making this work is storing max so far for which i used a ref variable MaxNumNodes. maxbst withh have the root of the biggest BST found when the function return .
public int MaxBST(Node root, int min, int max, ref Node maxbst,
ref int MaxNumNodes)
{
if (root == null) return 0;
//Not a BST
if (root.data < min || root.data > max) return -1;
//Find Max BST on left
int left = MaxBST(root.left, min, root.data, ref maxbst,
ref MaxNumNodes);
//Find Max BST on right
int right = MaxBST(root.right, root.data + 1, max, ref maxbst,
ref MaxNumNodes);
//Case1: -1 from both branches . No BST in both branches
if (left == -1 && right == -1) return -1;
//Case2:No BST in left branch , so choose right
//See if the BST on right is bigger than seen so far
if (left == -1)
{
if (right> MaxNumNodes)
{
MaxNumNodes = right;
maxbst = root.right;
}
return -1;
}
//Case3:No BST in right branch , so choose left
//See if the BST on left is bigger than seen so far
if (right == -1)
{
if (left > MaxNumNodes)
{
MaxNumNodes = left;
maxbst = root.left;
}
return -1;
}
//Case4:Both are BST , new max is left BST + right BST
maxbst = root;
return left + right + 1;
}
This isn't the most optimal approach but you could do an inorder traversal of the Binary tree and store it in an array and then find the longest contiguous increasing sequence, that will give you the BST with the maximum number of nodes. The complexity would be O(n)
for the traversal and O(n)
for the search so it's still going to be O(n)
.
One possible solution would be as following -
int maxNodes = INT.MIN;
Node* lb = NULL;
int largesBST(Node* root) {
largesBST(root, INT.MIN, INT.MAX);
}
int largesBST(Node* p, int MIN, int MAX) {
if(!p) { return 0; }
if(p->data > MIN || p->data < MAX) { return -1; }
int lc = largestBST(p->left, p->data, MAX);
if(lc == -1) { return -1; }
int rc = largestBST(p->right, MIN, p->data);
if(rc == -1) { return -1; }
// At this point, cur node is BST
int curNodes = lc + rc + 1;
if(curNodes > maxNodes) {
maxNodes = curNodes;
lb = p;
}
}
To solve the above problem:
- one can do an IN-order traversal of the tree
- Store it in a array and find the 'largest sorted subset'.
精彩评论