optimize this ruby code, switch arrays to sets/hash?
I need to optimize this code. Any suggestions to make it go faster, please tell me. I don't have a specific amount that I want it to go faster, any suggestion would be helpful. 开发者_如何学编程In terms of complexity I want to keep it below O(n^2)
I'm wondering if trying to convert the array that I'm using into like a set or hash because that is quicker right? How much faster in terms of complexity might this allow me to run?
The main problem I think might be my use of the ruby combination function which runs pretty slow, does anyone know exactly the complexity for this ruby function? is there a faster alternative to this?
the point of this code is basically to find the single point that is the shortest combined distance from all the other points ie (the friends house that is most convenient for everyone to go to). there is a little extra code here which has some debugging/printing functions.
class Point
attr_accessor :x, :y, :distance, :done, :count
def initialize(x,y)
@x = x
@y = y
@distance = 0
@closestPoint = []
@done = false
@count = 0
end
end
class Edge
attr_accessor :edge1, :edge2, :weight
def initialize(edge1,edge2,weight)
@edge1 = edge1
@edge2 = edge2
@weight = weight
end
end
class AdjacencyList
attr_accessor :name, :minSumList, :current
def initialize(name)
@name = name
@minSumList = []
@current = nil
@vList = []
@edgeList = []
end
def addVertex(vertex)
@vList.push(vertex)
end
def generateEdges2
minSumNode = nil
current = nil
last = nil
@vList.combination(2) { |vertex1, vertex2|
distance = distance2points(vertex1,vertex2)
edge = Edge.new(vertex1,vertex2,distance)
if (current == nil)
current = vertex1
minSumNode = vertex1
end
vertex1.distance += distance
vertex2.distance += distance
vertex1.count += 1
vertex2.count += 1
if (vertex1.count == @vList.length-1)
vertex1.done = true
elsif (vertex2.count == @vList.length-1)
vertex2.done = true
end
if ((vertex1.distance < minSumNode.distance) && (vertex1.done == true))
minSumNode = vertex1
end
#@edgeList.push(edge)
}
return minSumNode.distance
end
def generateEdges
@vList.combination(2) { |vertex1, vertex2|
distance = distance2points(vertex1,vertex2)
@edgeList.push(Edge.new(vertex1,vertex2,distance))
}
end
def printEdges
@edgeList.each {|edge| puts "(#{edge.edge1.x},#{edge.edge1.y}) <=> (#{edge.edge2.x},#{edge.edge2.y}) weight: #{edge.weight}"}
end
def printDistances
@vList.each {|v| puts "(#{v.x},#{v.y} distance = #{v.distance})"}
end
end
def distance2points(point1,point2)
xdistance = (point1.x - point2.x).abs
ydistance = (point1.y - point2.y).abs
total_raw = xdistance + ydistance
return totaldistance = total_raw - [xdistance,ydistance].min
end
#pointtest1 = Point.new(0,1)
#pointtest2 = Point.new(2,5)
#pointtest3 = Point.new(3,1)
#pointtest4 = Point.new(4,0)
graph = AdjacencyList.new("graph1")
gets
while (line = gets)
graph.addVertex(Point.new(line.split[0].to_i,line.split[1].to_i))
end
#graph.addVertex(pointtest1)
#graph.addVertex(pointtest2)
#graph.addVertex(pointtest3)
#graph.addVertex(pointtest4)
puts graph.generateEdges2
#graph.printEdges
#graph.printDistances
Try to do this, and then post some more code:
ruby -rprofile your_script your_args
This will run the script under the profiler, and generate a nice table with results. If you post that here, it's more likely to get better help. Plus, you will have a more exact idea of what's consuming your CPU cycles.
Sets are basically hashes, and the advantage of hashes over arrays is O(1) find operations. Since you are simply iterating over the entire array, hashes will not offer any speed improvements if you simply replace the arrays with hashes.
Your real problem is that the running time of your algorithm is O(n^2), as in given a set of n points it will have to perform n^2 operations since you're matching every point with every other possible point.
This can be somewhat improved using hashes to cache values. For example, lets say you want the distance between point "a" and point "b". You could have a hash @distances
which stores @distances["a,b"] = 52
(of course you'll have to be smart about what to use as the key). Basically just try to remove redundant operations wherever you can.
That said, the largest speed boost would be from a smarter algorithm, but I can't think of something applicable off the top of my head right now.
There's something many people know, and it won't cost you anything.
While you're trying to guess how to make the code faster, or scouring the internet for some kind of profiler, just run the program under the debugger and interrupt it while it's being slow.
Do it several times, and each time take careful note of what it's doing and why.
Here's an example in python.
The slower it is, the more obvious the problem will be.
精彩评论