Dependencies Tree Implementation
For those who used apt-get, you know that everytime you install / uninstall something, you get the notificatons saying you need / no longer need certain dependencies.
I'm trying to understand the theory behinds this a开发者_C百科nd potentially implement my own version of this. I've done some googling, came up with mostly coupling stuff. From what I understand, coupling is 2 classes/modules that depends on each other. This is not exactly what I'm looking for. What I'm looking for is more like a dependencies tree generation, where I can find the least dependent module (I've already made a recursive way of doing this), and (this being the part i haven't done) finding what's no longer needed after removing a node.
Also, would learning about graph theory help? And is there any tutorials on that preferably using Python as a language?
Graph theory is the way to go.
A graph is just a bunch of places (vertices) with roads (edges) between them, and in particular we're talking about a directed graph, which means one-way roads. Figuring out dependencies means, basically, finding out all the places that can reach a particular town along the one-way roads.
So now, you've got you bunch of modules, which become your vertices. Let's say we have A and B, and we know B depends on A, so there's a directed edge -- a "one way road" -- from A to B.
If C depends on B, then you have A→B→C.
Formally, a graph is just a collection of vertices and (ordered) pairs of vertices, called the edges. You want an graph algorithm called "topological sort", and now you've got some stuff to read.
This might be of some interest:
def dep(arg):
'''
Dependency resolver
"arg" is a dependency dictionary in which
the values are the dependencies of their respective keys.
'''
d=dict((k, set(arg[k])) for k in arg)
r=[]
while d:
# values not in keys (items without dep)
t=set(i for v in d.values() for i in v)-set(d.keys())
# and keys without value (items without dep)
t.update(k for k, v in d.items() if not v)
# can be done right away
r.append(t)
# and cleaned up
d=dict(((k, v-t) for k, v in d.items() if v))
return r
if __name__=='__main__':
d=dict(
a=('b','c'),
b=('c','d'),
e=(),
f=('c','e'),
g=('h','f'),
i=('f',)
)
print dep(d)
I did wrote a tool for finding and drawing the dependencies between Python packages on PyPi. It's gluttony
And I did use to analysis the dependencies of some library I'm using. Here are some of diagrams:
I'm not sure is this what you want. If it is, you can read the source code here, it is an open source project. For more dependencies diagrams, you can view the gallery
Talking about how I implement it, for finding package dependencies, I use pip as backend. For drawing diagrams, I use Networkx.
- Walk the old dependency tree. Build a
set
of all of the elements in it. - Walk the new dependency tree. Do the same as before.
- Subtract the latter
set
from the former. The result is your answer.
Alternatively:
- Walk the old dependency tree. At each node, store the number of things that depend on (point to) that node.
- Remove the item you're wanting to remove. Follow all of its dependencies and decrement their usage count by 1. If decrementing this way lowers the count for a node to 0, it is no longer necessary.
The former is simpler to implement, but less efficient. The latter is efficient, but harder to implement.
I think the step you're looking for is to differentiate between packages you has explicitly installed, and those that are dependencies. Once you have done this you can build dependency trees of all requested packages, and compare the set of these packages with the set of installed packages. XORing these, assuming all requested packages are installed, should give you the set of all packages which are no longer depended on.
精彩评论