Coalescing a tree of named profile samples into a top-down summation
I have an array of arrays of names that could be represented in Ruby like this:
samples = [
%w[a],
%w[a c],
%w[a],
%w[a],
%w[b],
%w[b],
%w[a],
%w[a e],
%w[a e],
%w[a c d],
%w[a c d],
%w[b],
%w[b c e],
%w[b c e],
%w[a c],
%w[a e],
%w[a e]
]
These are the output of a sampling profiler, where each list of names represents the call stack for a particular sample. I want to display these as a top-down tree of named-values where the value at each node is the sum of hits to that particular call path.
For the above sample input, the output tree should be:
root:0
a:4
e:4
c:2
d:2
b:3
c:0
e:2
(I don't want an ASCII output as shown above, but rather a tree开发者_运维知识库 structure that represents this.)
What is simple, efficient code that produces this output?
I have my own solution which I will post as an answer, but which seems to me less than ideal.Edit: I forgot to include the fact that the tree should be sorted in descending value at each level. I've added sample nodes and changed the output to reflect this.
[edit] A recursive Tree class with functional programming (I use ostruct for the sake of simplicity):
require 'ostruct'
class Tree < OpenStruct
def self.new_from_array(plain)
Tree.new(:node => "root", :count => 0, :children => children_from_array(plain))
end
def self.children_from_array(plain)
plain.group_by(&:first).map do |node, group|
terminal, leaves = group.map { |xs| xs.drop(1) }.partition(&:empty?)
Tree.new(:node => node, :count => terminal.size, :children => children_from_array(leaves))
end.sort_by(&:count).reverse
end
def inspect(indent=0)
node_info = " "*indent + "#{self.node}: #{self.count}"
([node_info] + self.children.map { |tree| tree.inspect(indent+2) }).join("\n")
end
end
Example:
>> Tree.new_from_array(samples)
=>
root: 0
a: 4
e: 4
c: 2
d: 2
b: 3
c: 0
e: 2
You can customize inspect
to fit your visualization needs.
Does this get you started?
>> samples.each_with_object(Hash.new(0)) { |arr, h| h[arr] += 1 }
#=> {["a"]=>4, ["a", "c"]=>2, ["b"]=>3, ["a", "c", "d"]=>2, ["b", "c", "e"]=>2}
Once you have the counts for each array, you can start grouping them using group_by
or something similar.
Note that if you are on 1.8 you'll have to use inject
instead of each_with_object
:
>> samples.inject(Hash.new(0)) { |h, arr| h[arr] += 1 ; h}
#=> {["a"]=>4, ["a", "c"]=>2, ["b"]=>3, ["a", "c", "d"]=>2, ["b", "c", "e"]=>2}
Sorry, I'm leaving now so don't have much more time... If this is not enough post a comment and I'll try to answer later.
Here's a far better answer than my first, inspired by @MichaelKohl:
require 'pnode' # see below
root = PNode.new("root",0)
samples.group_by{ |o| o }.each do |callstack,instances|
n = root; last_index = callstack.length-1
callstack.each_with_index do |name,i|
n = n[name]
n.time += instances.length if i==last_index
end
end
root.sort!
puts root
# pnode.rb
class PNode
attr_accessor :name, :time, :parent, :kids
def initialize(name,time=0,parent=nil)
@name, @time, @parent = name, time, parent
@kids = []; @by_name = {}
end
def []( name )
kids << (@by_name[name] = self.class.new(name,0,self)) unless @by_name[name]
@by_name[name]
end
def sort!
@kids.each(&:sort!).sort_by!{ |n| [-n.time,n.name] }
self
end
def to_s(lv=0)
[ "#{'..'*lv}#{name}:#{time}", *kids.map{|k| k.to_s(lv+1) } ].join("\n")
end
end
Here is a simple recursive method that will create the tree. This can be easily adapted based in the methods that you will need.
def to_tree(data)
tree = data.inject([0,{}]) do |cont, element|
if element.empty?
cont[0] += 1
else
node = element.shift
cont[1][node] ||= []
cont[1][node] << element
end
cont
end
tree[1].map do |k, v|
tree[1][k] = to_tree(v)
end
[tree[0],tree[1].sort{|a,b| b[1][0] <=> a[1][0]}]
end
def p_tree(node_tree, node_name='root', node_level=0)
p "#{' '*node_level}#{node_name}:#{node_tree[0]}"
node_tree[1].each do |node|
p_tree(node[1], node[0], node_level + 1)
end
end
samples = [
%w[a],
%w[a c],
%w[a],
%w[a],
%w[b],
%w[b],
%w[a],
%w[a e],
%w[a e],
%w[a c d],
%w[a c d],
%w[b],
%w[b c e],
%w[b c e],
%w[a c],
%w[a e],
%w[a e]
]
p_tree(to_tree(samples))
The method that is currently printing the tree can be easily modified to create the tree object instead.
To try to answer the question, can't you just build a tree by inserting each sample one at a time? and sort them as you go or at the end?
If you don't mind some additional suggestions, illustrated by this example :
If you're worried about efficiency because you've got jillions of stack samples, if you just randomly use about 100 of them, you'll get plenty enough information for finding bottlenecks. Here's why.
The problem with building a tree whose root is the main program is that there may not actually be a "hot path", even if there's a bona fide hot bottleneck, as that example shows. The solution is to allow the user to pick any routine as the "root", such as
c
in your example, which is active 6/17 of the time.When you get each stack sample, you not only get a routine name, but a line number. If the user is interested in a "hot" routine (that appears on many samples), they're going to want to know where inside the routine the time was spent. That information is in those line numbers. So I suggest lines, not routines be your basic unit of tree-building.
Suppose there is recursion, so the same routine/line can appear more than once in a sample. What do you do? Well, regardless of how many times a routine/line appears in a sample, it is still only one sample. The cost of a line is just the fraction of samples that contain it, with or without recursion, because suppose you are taking samples every 10ms. If that routine/line could be made to take no time (say, by removing or avoiding it), all the samples containing it would disappear from the total, recursion or no.
So if I'm building a butterfly view focussed on a single line, I only need to show one layer of calls below it, and one above. So I build a local down-tree showing just the lines that appeared below the focus (in the samples), and likewise for the up-tree, regardless of recursion.In my experience I do a lot of this by hand, and automation would help, but once I want to investigate a particular hot line, I want to be able to see individual stack samples going through that line. The reason is I don't care about the measurement. I know it's hot. I care about why it was doing what it was doing, so I know if I can do without it. That's what the stack samples tell me. Just looking at a hot line isn't enough information.
I know that's more than you asked for, but the simple fact that you are taking stack samples and looking for a way to summarize them means you're really on the right track, IMO. Good luck.
P.S. Don't you want to store inclusive counts on each node, as in:
root:17
a:12
e:4
c:4
d:2
b:5
c:2
e:2
because that will tell you the cost (in the sense of potential savings) of each node. If you're concerned about "exclusive time", note point 8 of this post.
Here's my less-than-ideal solution. In my defense, it was crafted with changing input, so it first transforms to a data structure that was originally fed in:
PNode = Struct.new(:name,:time,:parent) do
attr_writer :kids
def kids; @kids||=[]; end
def add( name, time=0 )
self.class.new( name, time, self ).tap{ |n| kids << n }
end
def to_s(lv=0)
[ "#{'..'*lv}#{name}:#{time}", *kids.map{|k| k.to_s(lv+1) } ].join("\n")
end
end
module Enumerable
def sum(init=0,method=:to_i)
inject(init){ |sum,o| sum+o.send(method) }
end
end
# Build a tree of calls
root = PNode.new('root',0)
samples.each do |callstack|
n = root; last_index = callstack.length-1
callstack.each_with_index do |name,i|
n = n.add( name, i==last_index ? 1 : 0 )
end
end
# Sum the tree values at each level
def top_down(nodes,lv=0,parent=nil)
nodes.group_by(&:name).sort_by{ |name,ns|
[ -ns.map(&:time).sum, name ]
}.map{ |name,same_name_calls|
self_time = same_name_calls.map(&:time).sum
PNode.new(name,self_time,parent).tap do |x|
x.kids = top_down( same_name_calls.map(&:kids).flatten, lv+1, x )
end
}
end
puts top_down(root.kids).map(&:to_s)
精彩评论