Merge N sorted arrays in ruby lazily
How does one merge N sorted arrays (or other l开发者_如何转开发ist-like data structures) lazily in Ruby? For example, in Python you would use heapq.merge. There must be something like this built into Ruby, right?
Here's a (slightly golfed) solution that should work on arrays of any 'list-like' collections that support #first
, #shift
, and #empty?
. Note that it is destructive - each call to lazymerge
removes one item from one collection.
def minheap a,i
r=(l=2*(m=i)+1)+1 #get l,r index
m = l if l< a.size and a[l].first < a[m].first
m = r if r< a.size and a[r].first < a[m].first
(a[i],a[m]=a[m],a[i];minheap(a,m)) if (m!=i)
end
def lazymerge a
(a.size/2).downto(1){|i|minheap(a,i)}
r = a[0].shift
a[0]=a.pop if a[0].empty?
return r
end
p arrs = [ [1,2,3], [2,4,5], [4,5,6],[3,4,5]]
v=true
puts "Extracted #{v=lazymerge (arrs)}. Arr= #{arrs.inspect}" while v
Output:
[[1, 2, 3], [2, 4, 5], [4, 5, 6], [3, 4, 5]]
Extracted 1. Arr= [[2, 3], [2, 4, 5], [4, 5, 6], [3, 4, 5]]
Extracted 2. Arr= [[3], [2, 4, 5], [4, 5, 6], [3, 4, 5]]
Extracted 2. Arr= [[4, 5], [3], [4, 5, 6], [3, 4, 5]]
Extracted 3. Arr= [[4, 5], [3, 4, 5], [4, 5, 6]]
Extracted 3. Arr= [[4, 5], [4, 5], [4, 5, 6]]
Extracted 4. Arr= [[5], [4, 5], [4, 5, 6]]
Extracted 4. Arr= [[5], [5], [4, 5, 6]]
Extracted 4. Arr= [[5, 6], [5], [5]]
Extracted 5. Arr= [[6], [5], [5]]
Extracted 5. Arr= [[5], [6]]
Extracted 5. Arr= [[6]]
Extracted 6. Arr= [[]]
Extracted . Arr= [[]]
Note also that this algorithm is also lazy about maintaining the heap property - it is not maintained between calls. This probably causes it to do more work than needed, since it does a complete heapify on each subsequent call. This could be fixed by doing a complete heapify once up front, then calling minheap(a,0)
before the return r
line.
I ended up writing it myself using the data structures from the 'algorithm' gem. It wasn't as bad as I expected.
require 'algorithms'
class LazyHeapMerger
def initialize(sorted_arrays)
@heap = Containers::Heap.new { |x, y| (x.first <=> y.first) == -1 }
sorted_arrays.each do |a|
q = Containers::Queue.new(a)
@heap.push([q.pop, q])
end
end
def each
while @heap.length > 0
value, q = @heap.pop
@heap.push([q.pop, q]) if q.size > 0
yield value
end
end
end
m = LazyHeapMerger.new([[1, 2], [3, 5], [4]])
m.each do |o|
puts o
end
Here's an implementation which should work on any Enumerable, even infinite ones. It returns Enumerator.
def lazy_merge *list
list.map!(&:enum_for) # get an enumerator for each collection
Enumerator.new do |yielder|
hash = list.each_with_object({}){ |enum, hash|
begin
hash[enum] = enum.next
rescue StopIteration
# skip empty enumerators
end
}
loop do
raise StopIteration if hash.empty?
enum, value = hash.min_by{|k,v| v}
yielder.yield value
begin
hash[enum] = enum.next
rescue StopIteration
hash.delete(enum) # remove enumerator that we already processed
end
end
end
end
Infinity = 1.0/0 # easy way to get infinite range
p lazy_merge([1, 3, 5, 8], (2..4), (6..Infinity), []).take(12)
#=> [1, 2, 3, 3, 4, 5, 6, 7, 8, 8, 9, 10]
No, there's nothing built in to do that. At least, nothing that springs instantly to mind. However, there was a GSoC project to implement the relevant data types a couple of years ago, which you could use.
精彩评论