Algorithm/Data structure to rank elements in a tree
Here's what I have: a tree with an arbitrary number of levels. I need a way to rank all of the nodes at each level, for each ancestor level. If that's not clear, let's say my first level is the World. My second level is continents. My third level is countries. My fourth level is cities. Each country has a list of its cities, ranked in order of population. Each continent has a list of countries ranked by population. Each continent also has a list of cities ranked by population. And so on.
The algorithm I imagine is pretty simple recursion, but I'm not sure what the best data structure would be to keep track of these lists. Each level doesn't know how many sub levels it has, so I can't declare an arbitrary number of lists.
Any thoughts?
Here's some sample code:
public void calcStats()
{
    initWorldRanks(); // clears ranks for the world
    for (Entity continent : theWorld.getChildren())
    {
        initContinentRanks(); // clears ranks for the continent
        for (Entity country : continent.getChildren())
        {
            initCountryRanks(); // clears ranks for the country
            for (Entity city : country.getChildren())
            {
                // Assume that add preserves sorted order. Sorting is easy.
                // The tricky part is that each entity needs to be added to
                // its ancestors. I don't want to have fixed data structures.
                worldCityRanks.add(city);
                continentCityRanks.add(city);
                countryCityRanks.add(city);
            }
            worldCountryRanks.add(country);
            continentCountryRanks.add(country);
        }
        worldContinentRanks.add(continent);
    }
}
Everything is correctly ranked, but this restricts me to a fixed four-level structure.
The key thing is that you don't want to have to recompute the count for each node by traversing its entire subtree. Cache the total count in each node. Each node then only needs to collect the values from its children to compute its own total (which it should also cache).
You don't say whether these nodes are mutable or not. If they're immutable, then it's easy: you compute a node's total when all of its children are added at construction time.
If they're mutable, you can have each node tell its parent when its count changes. The parent then updates its own count and tells its parent, and so on up the tree. This makes updating a count O(depth of tree), or roughly O(log n) if the tree is well balanced.
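As a rough illustration of the mutable case, here is a minimal sketch in which each node caches its total and pushes any change up the parent chain. The class name CountedNode and its methods are made up for this example, and it assumes a child is attached to at most one parent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type; none of these names come from the question.
class CountedNode {
    private final String name;
    private CountedNode parent;
    private final List<CountedNode> children = new ArrayList<>();
    private long ownCount;    // count held directly by this node (e.g. a city's population)
    private long cachedTotal; // ownCount plus the cached totals of all children

    CountedNode(String name, long ownCount) {
        this.name = name;
        this.ownCount = ownCount;
        this.cachedTotal = ownCount;
    }

    // Assumption: the child is not already attached anywhere else.
    void addChild(CountedNode child) {
        child.parent = this;
        children.add(child);
        propagate(child.cachedTotal);
    }

    void setOwnCount(long newCount) {
        long delta = newCount - ownCount;
        ownCount = newCount;
        propagate(delta);
    }

    // Walks from this node up to the root: O(depth of tree), no subtree traversal.
    private void propagate(long delta) {
        for (CountedNode n = this; n != null; n = n.parent) {
            n.cachedTotal += delta;
        }
    }

    long total() { return cachedTotal; }

    String name() { return name; }
}
```

Reading total() is then a constant-time field access, and only the ancestors of a changed node are touched on update.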
For actually sorting each node's children, do whatever you'd normally do: use an ArrayList and sort it, or use some kind of sorted collection that maintains sort order (e.g. TreeSet, though make sure you distinguish between elements that have the same population). The important thing is that you only look at your immediate children's values (i.e. the cached sums) when comparing, never at your indirect descendants.
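The TreeSet caveat matters because a TreeSet treats two elements as duplicates whenever its comparator returns 0, so two places with equal population would silently collapse into one. A minimal sketch of a comparator with a tie-breaker follows; Place and Rankings are hypothetical names, and it assumes names are unique and that population does not change while an element is in the set.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical immutable value type for illustration.
class Place {
    final String name;
    final long population;

    Place(String name, long population) {
        this.name = name;
        this.population = population;
    }
}

class Rankings {
    // Largest population first; ties are broken by name so that two places
    // with the same population are not treated as the same element.
    static final Comparator<Place> BY_POPULATION_DESC =
            Comparator.comparingLong((Place p) -> p.population)
                    .reversed()
                    .thenComparing((Place p) -> p.name);

    static TreeSet<Place> newRanking() {
        return new TreeSet<>(BY_POPULATION_DESC);
    }
}
```

If the population of an element can change, it would need to be removed from the set before the change and re-added afterwards, since a TreeSet does not re-sort elements that are mutated in place.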
Update
Based on your update to the question, one of your problems is that you've got separate methods for adding things at different levels, i.e. worldCityRanks.add, continentCityRanks.add, countryCityRanks.add, etc. You should replace all of these with a single method that takes the depth as a parameter, e.g.:
// Probably in your Entity class
public void addDescendant(int distance, Entity descendant) {
    // this replaces worldCityRanks.add, continentCityRanks.add,
    // countryCityRanks.add, etc.
}
Then, instead of having 4 fields for your descendant collections, you'd have a single collection (probably an ArrayList) to hold them, indexed by distance. You'd expand this as necessary.
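One possible way to fill in addDescendant is sketched below: an ArrayList of sorted sets, where index 0 holds direct children, index 1 grandchildren, and so on, growing on demand. The class name RankedEntity, the ranking method, and the name/population fields are assumptions made for the example rather than anything from the question's Entity class.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for the question's Entity class.
class RankedEntity {
    // Largest population first; name breaks ties (assumes unique names).
    static final Comparator<RankedEntity> BY_POPULATION_DESC =
            Comparator.comparingLong((RankedEntity e) -> e.population)
                    .reversed()
                    .thenComparing((RankedEntity e) -> e.name);

    final String name;
    final long population;

    // Index 0: direct children, index 1: grandchildren, and so on.
    private final List<TreeSet<RankedEntity>> ranksByDistance = new ArrayList<>();

    RankedEntity(String name, long population) {
        this.name = name;
        this.population = population;
    }

    void addDescendant(int distance, RankedEntity descendant) {
        // Grow the list until there is a slot for this distance.
        while (ranksByDistance.size() <= distance) {
            ranksByDistance.add(new TreeSet<>(BY_POPULATION_DESC));
        }
        ranksByDistance.get(distance).add(descendant);
    }

    // Returns a ranked copy; empty if nothing is stored at that distance.
    List<RankedEntity> ranking(int distance) {
        if (distance < 0 || distance >= ranksByDistance.size()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(ranksByDistance.get(distance));
    }
}
```

With this shape, no node needs to know in advance how many levels sit below it; the list simply grows the first time a deeper descendant is added.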
Another problem is that you have these hard-coded nested for loops. To handle arbitrary (within reason) depth the easiest approach is to use recursion. eg:
public void calcStats() {
    theWorld.initAllRanks();
    List<Entity> ancestors = new ArrayList<Entity>();
    theWorld.accumulateAllRanks(ancestors);
}

class Entity ... {
    ...
    // Clears the rankings of this entity and of every entity below it.
    void initAllRanks() {
        initRanks();
        for (Entity child : getChildren()) {
            child.initAllRanks();
        }
    }

    // Adds this entity to the rankings of each of its ancestors, then recurses.
    // ancestors is ordered from the root down to this entity's parent.
    void accumulateAllRanks(List<Entity> ancestors) {
        int distance = ancestors.size();
        for (Entity ancestor : ancestors) {
            distance--; // the parent ends up with distance 0
            ancestor.addDescendant(distance, this);
        }
        ancestors.add(this); // push this
        for (Entity child : getChildren()) {
            child.accumulateAllRanks(ancestors);
        }
        ancestors.remove(ancestors.size() - 1); // pop this
    }
}
This assumes you really want to store rankings for each level (which is what your code sample implies). That approach makes lookups fast, but it can make updates slow and it consumes more memory than some alternatives. In particular, you could just maintain lists of global rankings and filter them at query time. That makes updates faster and uses less memory, but makes queries slower than the approach you appear to be using currently.
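To make the global-rankings alternative concrete, here is a minimal sketch that keeps one sorted set per depth for the whole tree and filters by ancestor when asked. GeoNode, GlobalRankings, and rankWithin are hypothetical names, and the sketch assumes each node keeps a reference to its parent and that names are unique.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical node type with a parent pointer and a precomputed depth.
class GeoNode {
    final String name;
    final long population;
    final GeoNode parent;
    final int depth; // 0 for the root

    GeoNode(String name, long population, GeoNode parent) {
        this.name = name;
        this.population = population;
        this.parent = parent;
        this.depth = (parent == null) ? 0 : parent.depth + 1;
    }

    boolean isDescendantOf(GeoNode ancestor) {
        for (GeoNode n = parent; n != null; n = n.parent) {
            if (n == ancestor) return true;
        }
        return false;
    }
}

class GlobalRankings {
    // Largest population first; name breaks ties (assumes unique names).
    private static final Comparator<GeoNode> BY_POPULATION_DESC =
            Comparator.comparingLong((GeoNode n) -> n.population)
                    .reversed()
                    .thenComparing((GeoNode n) -> n.name);

    // One globally sorted set per depth, grown on demand.
    private final List<TreeSet<GeoNode>> byDepth = new ArrayList<>();

    void add(GeoNode node) {
        while (byDepth.size() <= node.depth) {
            byDepth.add(new TreeSet<>(BY_POPULATION_DESC));
        }
        byDepth.get(node.depth).add(node);
    }

    // Filters the global ranking at the given depth down to one subtree.
    // The result keeps the global order, so it is already ranked.
    List<GeoNode> rankWithin(GeoNode ancestor, int depth) {
        List<GeoNode> result = new ArrayList<>();
        if (depth < 0 || depth >= byDepth.size()) return result;
        for (GeoNode n : byDepth.get(depth)) {
            if (n.isDescendantOf(ancestor)) result.add(n);
        }
        return result;
    }
}
```

Each node is stored exactly once, which is where the memory saving comes from; the cost is that every query scans a whole level and walks parent chains.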