开发者

Combining consecutive dates in IList<DateTime> into ranges

  1. I have a series of objects with from and to dates.
  2. Using something like:

    IList<DateTime> dates =
        this.DateRanges
            .SelectMany(r => new [] { r.From, r.To })
            .Distinct()
            .OrderBy(d => d)
            .ToList();
    

    I can get all dates without any of them being duplicated. Ranges may fully overlap, partially overlap (upper or lower overlapping), touch or they may not overlap at all.

  3. Now I need to convert this list to a different one so that each consecutive date pair forms a new generated DateTime instance right in the middle of pair

    D1      D2      D3              D4  D5
        G1      G2          G3        G4
    

    Where Dn are my distinct dates from the list and Gm dates are ones I'd like to generate in the middle of them.

Question

How do I convert an ordered list of individual dates to pairs so that I get pairs as shown in the following example? I would like to form these using LINQ instead of for loop which can accomplish the same thing. Using LINQ may result in and more efficient code due to delayed expression tree execution.

Additional explanation using a real-world example

Suppose t开发者_开发知识库his is my example of such ranges:

D1             D2     D3     D4   D5     D6          D11    D12
|--------------|      |------|    |------|           |------|
       D7                         D8
       |--------------------------|
D9                                              D10
|-----------------------------------------------|

First step of getting distinct dates would result in these dates:

D1     D7      D2     D3     D4   D5     D6     D10  D11    D12

D9 and D8 would fall off because they're duplicates.

Next step is to form pairs (I don't know how to do this using LINQ):

D1-D7, D7-D2, D2-D3, D3-D4, D4-D5, D5-D6, D6-D10, (D10-D11), D11-D12

Last step has to calculate a date for each pair using:

Dnew = Dfrom + (Dto - Dfrom)/2

Empty ranges issue

Range D10-D11 should preferably be omitted. But if omitting it results in over-complicates code it can be kept and excluded with a separate check afterwards. But if it can be excluded initially then that's what should be done. So if you also provide information of how to form pairs that exclude empty ranges, you're welcome to add that info as well.


You can use Zip():

var middleDates = dates.Zip(dates.Skip(1), 
                            (a, b) => (a.AddTicks((b - a).Ticks / 2)))
                       .ToList();


Final solution

Based on the idea of @DavidB and interesting idea by @AakashM's original answer I've come up with my own solution that extracts ranges from a set of dates (while also omitting empty ranges) and calculating range middle dates.

If you have any improvement suggestions or comments on this solution you're warmly welcome to comment on it. Anyway this is the final code I'm using now (inline comments explain its functionality):

// counts range overlaps
int counter = 0;

// saves previous date to calculate midrange date
DateTime left = DateTime.Now;

// get mid range dates
IList<DateTime> dates = this.DateRanges

    // select range starts and ends
    .SelectMany(r => new[] {
        new {
            Date = r.From,
            Counter = 1
        },
        new {
            Date = r.To,
            Counter = -1
        }
    })

    // order dates because they come out mixed
    .OrderBy(o => o.Date)

    // convert dates to ranges; when non-empty & non-zero wide get mid date
    .Select(o => {

        // calculate middle date if range isn't empty and not zero wide
        DateTime? result = null;
        if ((counter != 0) && (left != o.Date))
        {
            result = o.Date.AddTicks(new DateTime((o.Date.Ticks - left.Ticks) / 2).Ticks);
        }

        // prepare for next date range
        left = o.Date;
        counter += o.Counter;

        // return middle date when applicable otherwise null
        return result;
    })

    // exclude empty and zero width ranges
    .Where(d => d.HasValue)

    // collect non nullable dates
    .Select(d => d.Value)
    .ToList();


Next step is to form pairs (I don't know how to do this using LINQ):

        List<DateTime> edges = bucketOfDates
            .Distinct()
            .OrderBy(date => date)
            .ToList();

        DateTime rangeStart = edges.First(); //ps - don't forget to handle empty
        List<DateRange> ranges = edges
            .Skip(1)
            .Select(rangeEnd =>
            {
              DateRange dr = new DateRange(rangeStart, rangeEnd);
              rangeStart = rangeEnd;
              return dr;
            })
            .ToList();


OK my previous idea wouldn't work. But this one will. And it's O(n) on the number of inputs.

To solve the D10-D11 problem, we need the process to be aware of how many of the original intervals are 'in effect' at any given date. We can then iterate throw the transition points in order, and emit midpoints whenever we are between two transitions and the current state is ON. Here is complete code.

Data classes:

// The input type
class DateRange
{
    public DateTime From { get; set; }
    public DateTime To { get; set; }
}

// Captures details of a transition point
// along with how many ranges start and end at this point
class TransitionWithCounts
{
    public DateTime DateTime { get; set; }
    public int Starts { get; set; }
    public int Finishes { get; set; }
}

Processing code:

class Program
{
    static void Main(string[] args)
    {
        // Inputs as per question
        var d1 = new DateTime(2011, 1, 1);
        var d2 = new DateTime(2011, 3, 1);
        var d3 = new DateTime(2011, 4, 1);
        var d4 = new DateTime(2011, 5, 1);
        var d5 = new DateTime(2011, 6, 1);
        var d6 = new DateTime(2011, 7, 1);
        var d11 = new DateTime(2011, 9, 1);
        var d12 = new DateTime(2011, 10, 1);
        var d7 = new DateTime(2011, 2, 1);
        var d8 = d5;
        var d9 = d1;
        var d10 = new DateTime(2011, 8, 1);

        var input = new[]
        {
            new DateRange { From = d1, To = d2 },
            new DateRange { From = d3, To = d4 },
            new DateRange { From = d5, To = d6 },
            new DateRange { From = d11, To = d12 },
            new DateRange { From = d7, To = d8 },
            new DateRange { From = d9, To = d10 },
        };

The first step is to capture the starts and finishes of the inputs as transition points. Each original range becomes two transition points, each with a count of 1.

        // Transform into transition points
        var inputWithBeforeAfter = input.SelectMany(
            dateRange => new[]
                {
                    new TransitionWithCounts { DateTime = dateRange.From, Starts = 1 },
                    new TransitionWithCounts { DateTime = dateRange.To, Finishes = 1 }
                });

Now we group these by date, summing up how many of the original ranges started and finished at this date

        // De-dupe by date, counting up how many starts and ends happen at each date
        var deduped = (from bdta in inputWithBeforeAfter
                      group bdta by bdta.DateTime
                      into g
                      orderby g.Key
                      select new TransitionWithCounts
                                 {
                                     DateTime = g.Key,
                                     Starts = g.Sum(bdta => bdta.Starts),
                                     Finishes = g.Sum(bdta => bdta.Finishes)
                                 }
                      );

To process this we could use Aggregate (probably), but it's (for me) far quicker to read and write a manual iteration:

        // Iterate manually since we want to keep a current count
        // and emit stuff
        var output = new List<DateTime>();
        var state = 0;
        TransitionWithCounts prev = null;

        foreach (var current in deduped)
        {
            // Coming to a new transition point
            // If we are ON, we need to emit a new midpoint
            if (state > 0)
            {
                // Emit new midpoint between prev and current
                output.Add(prev.DateTime.AddTicks((current.DateTime - prev.DateTime).Ticks / 2));
            }

            // Update state
            state -= current.Finishes;
            state += current.Starts;

            prev = current;
        }

We could assert that state == 0 at the end, if we felt like it.

        // And we're done
        foreach (var dateTime in output)
        {
            Console.WriteLine(dateTime);
        }

        // 16/01/2011 12:00:00
        // 15/02/2011 00:00:00
        // 16/03/2011 12:00:00
        // 16/04/2011 00:00:00
        // 16/05/2011 12:00:00
        // 16/06/2011 00:00:00
        // 16/07/2011 12:00:00
        // 16/09/2011 00:00:00

        // Note: nothing around 15/08 as that is between D10 and D11,
        // the only midpoint where we are OFF

        Console.ReadKey();
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜