Combining consecutive dates in IList<DateTime> into ranges
- I have a series of objects with from and to dates.
Using something like:
IList<DateTime> dates = this.DateRanges .SelectMany(r => new [] { r.From, r.To }) .Distinct() .OrderBy(d => d) .ToList();
I can get all dates without any of them being duplicated. Ranges may fully overlap, partially overlap (upper or lower overlapping), touch or they may not overlap at all.
Now I need to convert this list to a different one so that each consecutive date pair forms a new generated
DateTime
instance right in the middle of pairD1 D2 D3 D4 D5 G1 G2 G3 G4
Where Dn are my distinct dates from the list and Gm dates are ones I'd like to generate in the middle of them.
Question
How do I convert an ordered list of individual dates to pairs so that I get pairs as shown in the following example? I would like to form these using LINQ instead of for
loop which can accomplish the same thing. Using LINQ may result in and more efficient code due to delayed expression tree execution.
Additional explanation using a real-world example
Suppose t开发者_开发知识库his is my example of such ranges:
D1 D2 D3 D4 D5 D6 D11 D12
|--------------| |------| |------| |------|
D7 D8
|--------------------------|
D9 D10
|-----------------------------------------------|
First step of getting distinct dates would result in these dates:
D1 D7 D2 D3 D4 D5 D6 D10 D11 D12
D9 and D8 would fall off because they're duplicates.
Next step is to form pairs (I don't know how to do this using LINQ):
D1-D7, D7-D2, D2-D3, D3-D4, D4-D5, D5-D6, D6-D10, (D10-D11), D11-D12
Last step has to calculate a date for each pair using:
Dnew = Dfrom + (Dto - Dfrom)/2
Empty ranges issue
Range D10-D11 should preferably be omitted. But if omitting it results in over-complicates code it can be kept and excluded with a separate check afterwards. But if it can be excluded initially then that's what should be done. So if you also provide information of how to form pairs that exclude empty ranges, you're welcome to add that info as well.
You can use Zip()
:
var middleDates = dates.Zip(dates.Skip(1),
(a, b) => (a.AddTicks((b - a).Ticks / 2)))
.ToList();
Final solution
Based on the idea of @DavidB and interesting idea by @AakashM's original answer I've come up with my own solution that extracts ranges from a set of dates (while also omitting empty ranges) and calculating range middle dates.
If you have any improvement suggestions or comments on this solution you're warmly welcome to comment on it. Anyway this is the final code I'm using now (inline comments explain its functionality):
// counts range overlaps
int counter = 0;
// saves previous date to calculate midrange date
DateTime left = DateTime.Now;
// get mid range dates
IList<DateTime> dates = this.DateRanges
// select range starts and ends
.SelectMany(r => new[] {
new {
Date = r.From,
Counter = 1
},
new {
Date = r.To,
Counter = -1
}
})
// order dates because they come out mixed
.OrderBy(o => o.Date)
// convert dates to ranges; when non-empty & non-zero wide get mid date
.Select(o => {
// calculate middle date if range isn't empty and not zero wide
DateTime? result = null;
if ((counter != 0) && (left != o.Date))
{
result = o.Date.AddTicks(new DateTime((o.Date.Ticks - left.Ticks) / 2).Ticks);
}
// prepare for next date range
left = o.Date;
counter += o.Counter;
// return middle date when applicable otherwise null
return result;
})
// exclude empty and zero width ranges
.Where(d => d.HasValue)
// collect non nullable dates
.Select(d => d.Value)
.ToList();
Next step is to form pairs (I don't know how to do this using LINQ):
List<DateTime> edges = bucketOfDates
.Distinct()
.OrderBy(date => date)
.ToList();
DateTime rangeStart = edges.First(); //ps - don't forget to handle empty
List<DateRange> ranges = edges
.Skip(1)
.Select(rangeEnd =>
{
DateRange dr = new DateRange(rangeStart, rangeEnd);
rangeStart = rangeEnd;
return dr;
})
.ToList();
OK my previous idea wouldn't work. But this one will. And it's O(n)
on the number of inputs.
To solve the D10-D11 problem, we need the process to be aware of how many of the original intervals are 'in effect' at any given date. We can then iterate throw the transition points in order, and emit midpoints whenever we are between two transitions and the current state is ON. Here is complete code.
Data classes:
// The input type
class DateRange
{
public DateTime From { get; set; }
public DateTime To { get; set; }
}
// Captures details of a transition point
// along with how many ranges start and end at this point
class TransitionWithCounts
{
public DateTime DateTime { get; set; }
public int Starts { get; set; }
public int Finishes { get; set; }
}
Processing code:
class Program
{
static void Main(string[] args)
{
// Inputs as per question
var d1 = new DateTime(2011, 1, 1);
var d2 = new DateTime(2011, 3, 1);
var d3 = new DateTime(2011, 4, 1);
var d4 = new DateTime(2011, 5, 1);
var d5 = new DateTime(2011, 6, 1);
var d6 = new DateTime(2011, 7, 1);
var d11 = new DateTime(2011, 9, 1);
var d12 = new DateTime(2011, 10, 1);
var d7 = new DateTime(2011, 2, 1);
var d8 = d5;
var d9 = d1;
var d10 = new DateTime(2011, 8, 1);
var input = new[]
{
new DateRange { From = d1, To = d2 },
new DateRange { From = d3, To = d4 },
new DateRange { From = d5, To = d6 },
new DateRange { From = d11, To = d12 },
new DateRange { From = d7, To = d8 },
new DateRange { From = d9, To = d10 },
};
The first step is to capture the starts and finishes of the inputs as transition points. Each original range becomes two transition points, each with a count of 1.
// Transform into transition points
var inputWithBeforeAfter = input.SelectMany(
dateRange => new[]
{
new TransitionWithCounts { DateTime = dateRange.From, Starts = 1 },
new TransitionWithCounts { DateTime = dateRange.To, Finishes = 1 }
});
Now we group these by date, summing up how many of the original ranges started and finished at this date
// De-dupe by date, counting up how many starts and ends happen at each date
var deduped = (from bdta in inputWithBeforeAfter
group bdta by bdta.DateTime
into g
orderby g.Key
select new TransitionWithCounts
{
DateTime = g.Key,
Starts = g.Sum(bdta => bdta.Starts),
Finishes = g.Sum(bdta => bdta.Finishes)
}
);
To process this we could use Aggregate
(probably), but it's (for me) far quicker to read and write a manual iteration:
// Iterate manually since we want to keep a current count
// and emit stuff
var output = new List<DateTime>();
var state = 0;
TransitionWithCounts prev = null;
foreach (var current in deduped)
{
// Coming to a new transition point
// If we are ON, we need to emit a new midpoint
if (state > 0)
{
// Emit new midpoint between prev and current
output.Add(prev.DateTime.AddTicks((current.DateTime - prev.DateTime).Ticks / 2));
}
// Update state
state -= current.Finishes;
state += current.Starts;
prev = current;
}
We could assert that state == 0
at the end, if we felt like it.
// And we're done
foreach (var dateTime in output)
{
Console.WriteLine(dateTime);
}
// 16/01/2011 12:00:00
// 15/02/2011 00:00:00
// 16/03/2011 12:00:00
// 16/04/2011 00:00:00
// 16/05/2011 12:00:00
// 16/06/2011 00:00:00
// 16/07/2011 12:00:00
// 16/09/2011 00:00:00
// Note: nothing around 15/08 as that is between D10 and D11,
// the only midpoint where we are OFF
Console.ReadKey();
精彩评论