Tree structure as string - how to match nested braces?
I have a tree structure set up and wish to save it to / read it from a string, with a minimal amount of text (so XML serialization is out). I set up a simple (or so I thought) structure for this, but can't figure out how to read it in, so my structure will most likely have to change. Let me demonstrate with an example.
My tree is made up of X,Y coordinates, like in the following example:
[a,b]
|-----|
[c,d] [e,f]
|-----|-----|
[g,h] [i,j] [k,l]
When I run my algorithm to turn this tree into a string, I get the following output:
a,b(c,d()e,f(g,h()i,j()k,l()))
And here it the code I'm using:
public string SerializeMe()
{
StringBuilder ret = new StringBuilder(this.Value.ToString())
ret.Append("(");
foreach (SimpleTreeNode<T> child in _Children)
{
ret.Append(child.SerializeMe());
}
ret.Append(")");
return ret.ToString();
}
That works great, but now I can't parse the string back into my tree structure. I can get the substring up to the first open brace and convert that to the node's value, but I'm unsure how to split the rest of the string into children. Is there any way I can easily find an opening brace, and t开发者_如何学编程hen find it's closing brace? I've looked into some complicated regex stuff that I couldn't get working properly and quickly became completely lost.
Does anyone have any ideas?
EDIT:
Here's the code I have so far:public static SimpleTreeNode<SPoint> ParsePointTree(string input)
{
//if the input string is empty, there is no node here. Return null.
if (string.IsNullOrEmpty(input)) return null;
else
{
//get the value from the first part of the string
string valString = input.Substring(0, input.IndexOf('('));
SPoint value = (SPoint)valString;
SimpleTreeNode<SPoint> node = new SimpleTreeNode<SPoint>(value);
//now we have the child nodes enclosed in brackets
string innerstring = input.Substring(input.IndexOf('('));
List<string> children = new List<string>();
// how do we split innerstring into siblings?? //
foreach (string child in children)
{
node.Children.Add(SimpleTreeNode<SPoint>.ParsePointTree(child));
}
return node;
}
}
The problem I'm having, is that I will get a string that has to be split up into siblings. In the example above, c,d
and e,f
are siblings, represented in the form (c,d()e,f(g,h()i,j()k,l()))
. I need to split this string into c,d()
and e,f(g,h()i,j()k,l())
, which is where I'm stuck.
You can parse a string like that using a stack and 2 local variables. The stack would not be necessary if you serialized the tree using a breadth first traversal rather than depth first (btw it does not have to be recursive in either case).
The recursive solution just utilizes the call stack and could result in a stack overflow - see here for a better explanation of why that's not the best way.
foreach (char c in "a(c()e(g()i()k()))")
{
if (c == '(')
{
Stack.Push(Parent);
Parent = Child;
}
else if (c == ')')
{
Child = Parent;
Parent = Stack.Pop();
}
else
{
Child = new SimpleTreeNode() { Value = c };
Parent.Children.Add(Child);
}
}
Something like this (pseudo code):
function parse() =
label = read_until('(',')');
read_char('(')
children = []
while not peek_char(')') do
child = parse()
children.add(child)
read_char(')')
return new Node(label,children)
read_until(...)
reads until, but not including, the given characters.read_char(c)
reads a character, and if it doesn't match raises an error.peek_char(c)
looks at the next character, and returns a truth-value indicating whether it matched.
精彩评论