How to combine overlapping time ranges (time ranges union)
I have an array with several time ranges inside:
[Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,
Tue, 24 May 2011 16:30:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00,
Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 09:00:00 CEST +02:00,
Tue, 24 May 2011 15:30:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00]
I want to get the same array with the overlapping time ranges combined, so the output for this case will be:
[Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,
Tue, 24 May 2011 15:30:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00]
So it creates a new time range when to time ranges overlap, and so on. If they don´t overlap the will be keep separated. Another example:
开发者_C百科Input:
[Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,
Tue, 24 May 2011 16:00:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00]
Output (will be the same because they don´t overlap):
[Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,
Tue, 24 May 2011 16:00:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00]
I was thinking in some recursive approach, but I need some guidance here...
Given a function that returns truthy if two ranges overlap:
def ranges_overlap?(a, b)
a.include?(b.begin) || b.include?(a.begin)
end
(this function courtesy of sepp2k and steenslag)
and a function that merges two overlapping ranges:
def merge_ranges(a, b)
[a.begin, b.begin].min..[a.end, b.end].max
end
then this function, given an array of ranges, returns a new array with any overlapping ranges merged:
def merge_overlapping_ranges(overlapping_ranges)
overlapping_ranges.sort_by(&:begin).inject([]) do |ranges, range|
if !ranges.empty? && ranges_overlap?(ranges.last, range)
ranges[0...-1] + [merge_ranges(ranges.last, range)]
else
ranges + [range]
end
end
end
Searching a little bit I have found a code that does the trick:
def self.merge_ranges(ranges)
ranges = ranges.sort_by {|r| r.first }
*outages = ranges.shift
ranges.each do |r|
lastr = outages[-1]
if lastr.last >= r.first - 1
outages[-1] = lastr.first..[r.last, lastr.last].max
else
outages.push(r)
end
end
outages
end
A sample (working with time ranges too!):
ranges = [1..5, 20..20, 4..11, 40..45, 39..50]
merge_ranges(ranges)
=> [1..11, 20..20, 39..50]
Found here: http://www.ruby-forum.com/topic/162010
You can do it by using multi_range gem.
Example 1:
ranges = [
Time.parse('Tue, 24 May 2011 08:00:00 CEST +02:00..Tue')..Time.parse('24 May 2011 13:00:00 CEST +02:00'),
Time.parse('Tue, 24 May 2011 16:30:00 CEST +02:00..Tue')..Time.parse('24 May 2011 18:00:00 CEST +02:00'),
Time.parse('Tue, 24 May 2011 08:00:00 CEST +02:00..Tue')..Time.parse('24 May 2011 09:00:00 CEST +02:00'),
Time.parse('Tue, 24 May 2011 15:30:00 CEST +02:00..Tue')..Time.parse('24 May 2011 18:00:00 CEST +02:00'),
]
MultiRange.new(ranges).merge_overlaps.ranges
# => [2011-05-24 08:00:00 +0800..2011-05-24 13:00:00 +0800, 2011-05-24 15:30:00 +0800..2011-05-24 18:00:00 +0800]
Example 2:
ranges = [
Time.parse('Tue, 24 May 2011 08:00:00 CEST +02:00')..Time.parse('Tue, 24 May 2011 13:00:00 CEST +02:00'),
Time.parse('Tue, 24 May 2011 16:00:00 CEST +02:00')..Time.parse('Tue, 24 May 2011 18:00:00 CEST +02:00'),
]
MultiRange.new(ranges).merge_overlaps.ranges
# => [2011-05-24 08:00:00 +0800..2011-05-24 13:00:00 +0800, 2011-05-24 16:00:00 +0800..2011-05-24 18:00:00 +0800]
The facets gem has Range.combine
method that may be of use: http://rdoc.info/github/rubyworks/facets/master/Range#combine-instance_method
Some kind of algorithm that might help:
Sort range array by start time (r1, r2, r3, r4, .. rn)
for each range pair [r1, r2], [r2, r3] .. [rn-1, rn]:
if r1_end > r2_start: # they overlap
add [r1_start, r2_end] to new range array
else: # they do not overlap
add [r1] and [r2] to new range array (no changes)
startover with the new range array until no more changes
The solution, offered by @wayne-conrad is a very good one. I implemented it for a problem, I stumbled upon. Then I implemented an iterative version and benchmarked the two. It appears, the iterative version is quicker. Note: I use ActiveSupport
for Range#overlaps?
and the time helpers, but it is trivial to implement a pure-Ruby version.
require 'active_support/all'
module RangesUnifier
extend self
# ranges is an array of ranges, e.g. [1..5, 2..6]
def iterative_call(ranges)
ranges.sort_by(&:begin).reduce([ranges.first]) do |merged_ranges, range|
if merged_ranges.last.overlaps?(range)
merged_ranges[0...-1] << merge_ranges(merged_ranges.last, range)
else
merged_ranges << range
end
end
end
def recursive_call(ranges)
return ranges if ranges.size == 1
if ranges[0].overlaps?(ranges[1])
recursive_call [merge_ranges(ranges[0], ranges[1]), *ranges[2..-1]]
else
[ranges[0], *recursive_call(ranges[1..-1])]
end
end
def merge_ranges(a, b)
[a.begin, b.begin].min..[a.end, b.end].max
end
end
five_hours_ago = 5.hours.ago
four_hours_ago = 4.hours.ago
three_hours_ago = 3.hours.ago
two_hours_ago = 2.hours.ago
one_hour_ago = 1.hour.ago
one_hour_from_now = 1.hour.from_now
two_hours_from_now = 2.hours.from_now
three_hours_from_now = 3.hours.from_now
four_hours_from_now = 4.hours.from_now
five_hours_from_now = 5.hours.from_now
input = [
five_hours_ago..four_hours_ago,
three_hours_ago..two_hours_from_now,
one_hour_ago..one_hour_from_now,
one_hour_from_now..three_hours_from_now,
four_hours_from_now..five_hours_from_now
]
RangesUnifier.iterative_call(input)
#=> [
# 2017-08-21 12:50:50 +0300..2017-08-21 13:50:50 +0300,
# 2017-08-21 14:50:50 +0300..2017-08-21 20:50:50 +0300,
# 2017-08-21 21:50:50 +0300..2017-08-21 22:50:50 +0300
# ]
RangesUnifier.recursive_call(input)
#=> [
# 2017-08-21 12:50:50 +0300..2017-08-21 13:50:50 +0300,
# 2017-08-21 14:50:50 +0300..2017-08-21 20:50:50 +0300,
# 2017-08-21 21:50:50 +0300..2017-08-21 22:50:50 +0300
# ]
n = 100_000
Benchmark.bm do |x|
x.report('iterative') { n.times { RangesUnifier.iterative_call(input) } }
x.report('recursive') { n.times { RangesUnifier.recursive_call(input) } }
end
# =>
# user system total real
# iterative 0.970000 0.000000 0.970000 ( 0.979549)
# recursive 0.540000 0.010000 0.550000 ( 0.546755)
The gem range_operators does a wonderful job by adding missing features to the Ruby Range
class. It is way smaller than adding the whole facets gem.
I your case the solution would be the rangify
method, which is added to the Array
class and would do exactly what you are looking for.
I made a minor update to the answer from Wayne Conrad to handle edge cases involved with open-ended arrays (created with ... operator instead of .. operator).
I changed the name to merge_continuous_ranges
since while ranges like 0...1
and 1..2
do not overlap, their combined ranges are continuous, so it makes sense to combine them:
def merge_continuous_ranges(ranges)
ranges.sort_by(&:begin).inject([]) do |result, range|
if !result.empty? && ranges_continuous?(result.last, range)
result[0...-1] + [merge_ranges(result.last, range)]
else
result + [range]
end
end
end
def ranges_continuous?(a, b)
a.include?(b.begin) || b.include?(a.begin) || a.end == b.begin || b.end == a.begin
end
def merge_ranges(a, b)
range_begin = [a.begin, b.begin].min
range_end = [a.end, b.end].max
exclude_end = case a.end <=> b.end
when -1
b.exclude_end?
when 0
a.exclude_end? && b.exclude_end?
when 1
a.exclude_end?
end
exclude_end ? range_begin...range_end : range_begin..range_end
end
A solution in one method and bug-free for what I can tell:
def merge_ranges(ranges)
ranges = ranges.sort_by(&:first)
merged = [ranges[0]]
ranges.each do |current|
previous = merged[-1]
if current.first <= previous.last
merged[-1] = previous.first..[previous.last, current.last].max
else
merged.push(current)
end
end
merged
end
Usage:
ranges = [
Time.parse('Tue, 24 May 2011 08:00:00 CEST +02:00..Tue')..Time.parse('24 May 2011 13:00:00 CEST +02:00'),
Time.parse('Tue, 24 May 2011 16:30:00 CEST +02:00..Tue')..Time.parse('24 May 2011 18:00:00 CEST +02:00'),
Time.parse('Tue, 24 May 2011 08:00:00 CEST +02:00..Tue')..Time.parse('24 May 2011 09:00:00 CEST +02:00'),
Time.parse('Tue, 24 May 2011 15:30:00 CEST +02:00..Tue')..Time.parse('24 May 2011 18:00:00 CEST +02:00'),
]
merge_ranges(ranges)
#=> [2011-05-24 08:00:00 +0200..2011-05-24 13:00:00 +0200, 2011-05-24 15:30:00 +0200..2011-05-24 18:00:00 +0200]
Disclaimer: it's a port of https://stackoverflow.com/a/43600953/807442
Dont you just want to find the smallest first value and the largest last value from the set of arrays?
ranges = [Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 13:00:00 CEST +02:00,
Tue, 24 May 2011 16:30:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00,
Tue, 24 May 2011 08:00:00 CEST +02:00..Tue, 24 May 2011 09:00:00 CEST +02:00,
Tue, 24 May 2011 15:30:00 CEST +02:00..Tue, 24 May 2011 18:00:00 CEST +02:00]
union = [ranges.collect(&:first).sort.first, ranges.collect(&:last).sort.last]
The Marked answer works well except for few use cases. One of such use case is
[Tue, 21 June 13:30:00 GMT +0:00..Tue, 21 June 15:30:00 GMT +00:00,
Tue, 21 June 14:30:00 GMT +0:00..Tue, 21 June 15:30:00 GMT +00:00]
The condition in ranges_overlap
does not handle this use case. So I wrote this
def ranges_overlap?(a, b)
a.include?(b.begin) || b.include?(a.begin) || a.include?(b.end) || b.include?(a.end)|| (a.begin < b.begin && a.end >= b.end) || (a.begin >= b.begin && a.end < b.end)
end
This is handling all the edge cases for me so far.
精彩评论