C#: Avoid infinite recursion when traversing object graph
I have an object graph wherei开发者_开发问答n each child object contains a property that refers back to its parent. Are there any good strategies for ignoring the parent references in order to avoid infinite recursion? I have considered adding a special [Parent] attribute to these properties or using a special naming convention, but perhaps there is a better way.
If the loops can be generalised (you can have any number of elements making up the loop), you can keep track of objects you've seen already in a HashSet
and stop if the object is already in the set when you visit it. Or add a flag to the objects which you set when you visit it (but you then have to go back & unset all the flags when you're done, and the graph can only be traversed by a single thread at a time).
Alternatively, if the loops will only be back to the parent, you can keep a reference to the parent and not loop on properties that refer back to it.
For simplicity, if you know the parent reference will have a certain name, you could just not loop on that property :)
What a coincidence; this is the topic of my blog this coming Monday. See it for more details. Until then, here's some code to give you an idea of how to do this:
static IEnumerable<T> Traversal<T>(
T item,
Func<T, IEnumerable<T>> children)
{
var seen = new HashSet<T>();
var stack = new Stack<T>();
seen.Add(item);
stack.Push(item);
yield return item;
while(stack.Count > 0)
{
T current = stack.Pop();
foreach(T newItem in children(current))
{
if (!seen.Contains(newItem))
{
seen.Add(newItem);
stack.Push(newItem);
yield return newItem;
}
}
}
}
The method takes two things: an item, and a relation that produces the set of everything that is adjacent to the item. It produces a depth-first traversal of the transitive and reflexive closure of the adjacency relation on the item. Let the number of items in the graph be n, and the maximum depth be 1 <= d <= n, assuming the branching factor is not bounded. This algorithm uses an explicit stack rather than recursion because (1) recursion in this case turns what should be an O(n) algorithm into O(nd), which is then something between O(n) and O(n^2), and (2) excessive recursion can blow the stack if the d is more than a few hundred nodes.
Note that the peak memory usage of this algorithm is of course O(n + d) = O(n).
So, for example:
foreach(Node node in Traversal(myGraph.Root, n => n.Children))
Console.WriteLine(node.Name);
Make sense?
If you're doing a graph traversal, you can have a "visited" flag on each node. This ensures that you don't revisit a node and possibly get stuck in an infinite loop. I believe this is the standard way of performing a graph traversal.
This is a common problem, but the best approach depends on the scenario. An additional problem is that in many cases it isn't a problem visiting the same object twice - that doesn't imply recursion - for example, consider the tree:
A
=> B
=> C
=> D
=> C
This may be valid (think XmlSerializer
, which would simply write the C
instance out twice), so it is often necessary to push/pop objects on a stack to check for true recursion. The last time I implemented a "visitor", I kept a "depth" counter, and only enabled the stack checking beyond a certain threshold - that means that most trees simply end up doing some ++
/--
, but nothing more expensive. You can see the approach I took here.
I'm not exactly sure what you are trying to do here but you could just maintain a hashtable with all previously visited nodes when you are doing your breadth first search of depth first search.
I published a post explaining in detail with code examples how to do object traversal by recursive reflection and also detect and avoid recursive references to prevent a stack over flow exception: https://doguarslan.wordpress.com/2016/10/03/object-graph-traversal-by-recursive-reflection/
In that example I did a depth first traversal using recursive reflection and I maintained a HashSet of visited nodes for reference types. One thing to be careful is to initialize your HashSet with your custom equality comparer which uses the object reference for hash calculation, basically the GetHashCode() method implemented by the base object class itself and not any overloaded versions of GetHashCode() because if the types of properties you traverse overload GetHashCode method, you may detect false hash collisions and think that you detected a recursive reference which in reality could be that the overloaded version of GetHashCode producing the same hash value via some heuristics and confusing the HashSet, all you need to detect is to check if there is any parent child in anywhere in the object tree pointing to the same location in memory.
精彩评论