Merging time series in Mathematica efficiently
What I am trying to accomplish seems ordinary enough that there should be an efficient solution.
I am using Mathematica and I have a number of different time series of the form {{date1, value1}, {date2, value2}, ...}, the sort you could pass to DateListPlot.
However, the problem is that these datasets only partially overlap (some may have data from 1995 to 2004, some from 1999 to 2011, and so on).
Now what I would love to be able to do is to merge these into one big list with a common timeline that is the Union[] of all the dates available. Then there would be arrays for the values, but with zeros where there is no data.
Is there an efficient way to accomplish this? I have hundreds of these time series, and writing something that loops over the whole thing is probably not very efficient (and quite tedious to write).
Any help is greatly appreciated!
For instance,
ClearAll[l1, l2];
l1 = {{date1, value1}, {date1, value2}, {date2, value3}, {date4, value4}}
l2 = {{date3, value5}, {date4, value5}, {date1, value6}}
then
DeleteDuplicates[Union[l1, l2], #1[[1]] == #2[[1]] &]
yields {{date1, value1}, {date2, value3}, {date3, value5}, {date4, value4}}.
This means that if you have two data points for the same date and they are different, only one of them is kept (the first in the sorted order that Union produces) and the others are lost. It's not clear (to me) whether this is what you need, so perhaps you could add more detail.
On the other hand, this
Transpose[{DeleteDuplicates[Last@Last@Reap@Scan[Sow[#[[1]]] &, Union[l1, l2]]],
Last@Reap[Scan[Sow[#[[2]], #[[1]]] &, Union[l1, l2]]]}]
eliminates duplicate headers and collects the values under each header thus:
{{date1, {value1, value2, value6}},
{date2, {value3}},
{date3, {value5}},
{date4, {value4, value5}}}
(i.e., it collects all values for each date).
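The same grouping can probably be written more compactly with GatherBy. This is only a sketch of an alternative, not something tested against real data; it assumes the same symbolic l1 and l2 as above and a Mathematica version that has GatherBy (7 or later).

```mathematica
(* same symbolic example lists as above *)
ClearAll[l1, l2];
l1 = {{date1, value1}, {date1, value2}, {date2, value3}, {date4, value4}};
l2 = {{date3, value5}, {date4, value5}, {date1, value6}};

(* group the {date, value} pairs by their first element (the date),
   then turn each group into {date, {all values for that date}} *)
{#[[1, 1]], #[[All, 2]]} & /@ GatherBy[Union[l1, l2], First]
```

GatherBy keeps the groups in order of first appearance, so with the sorted output of Union the dates should come out in sorted order as well.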
Some examples of what you want would be nice.
If I understand your question correctly, you want
l1 = {{date1, value1}, {date1, value2}, {date2, value3}, {date4, value4}}
l2 = {{date3, value5}, {date4, value5}, {date5, value6}}
To become
l1 = {{date1, value1}, {date1, value2},
{date2, value3}, {date3, 0}, {date4, value4}, {date5,0}}
l2 = {{date1, 0}, {date2, 0}, {date3, value5}, {date4, value5}, {date5, value6}}
If so, something like this might work:
If[MemberQ[l1[[All,1]],#],Cases[l1,{#,_}],{#,0}]& /@ Union[l1[[All,1]],l2[[All,1]] ]
Depending on how you want multiple data points on the same date in a given series to be treated, you might need to precede the Cases[] function with Sequence @@ or First@, e.g.
If[MemberQ[l1[[All,1]],#],Sequence @@ Cases[l1,{#,_}],{#,0}]& /@
Union[l1[[All,1]],l2[[All,1]] ]
I'm home now, so this one has been checked for syntax errors :-)
Thanks guys. In the end I went with my own solution: I took the union of all the timelines, saved it in, let's say, daterange, and then used MapThread in the following way:
daterange= Union[DatesOfFirstTimeseries,DatesOfSecondTimeseries];
NewVersionOfFirstTimeSeries = (daterange /.
MapThread[Rule, {DatesOfFirstTimeseries, ValuesOfFirstTimeseries}] /.
MapThread[
Rule, {daterange, Table[Indeterminate, {Length[daterange]}]}]);
NewVersionOfSecondTimeSeries = (daterange /.
MapThread[Rule, {DatesOfSecondTimeseries, ValuesOfSecondTimeseries}] /.
MapThread[
Rule, {daterange, Table[Indeterminate, {Length[daterange]}]}]);
This did what I needed, but it really does hurt my aesthetic view of things.
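For hundreds of series, the same idea (replace every date on the common timeline by its value, or by a filler where there is none) could be wrapped in one small function and mapped over all of them. The following is a minimal sketch under some assumptions: the names series and align are made up for illustration, the sample data is invented, and each date is assumed to occur at most once per series (if it occurs more than once, only the first value is used).

```mathematica
(* hypothetical sample data: a list of series, each a list of {date, value} pairs *)
series = {
   {{{2001, 1, 1}, 1.}, {{2002, 1, 1}, 2.}},
   {{{2002, 1, 1}, 5.}, {{2003, 1, 1}, 6.}}
   };

(* common timeline: sorted union of the dates of all series *)
daterange = Union @@ (#[[All, 1]] & /@ series);

(* align one series to the timeline: each date is replaced by its value,
   and the catch-all rule _ -> missing fills the dates that have no data.
   Dispatch is used so that lookups stay fast for long rule lists. *)
align[ts_, dates_, missing_: 0] :=
  Replace[dates, Dispatch[Append[Rule @@@ ts, _ -> missing]], {1}];

(* one row of values per series, all on the same timeline *)
aligned = align[#, daterange] & /@ series
(* {{1., 2., 0}, {0, 5., 6.}} *)
```

Passing Indeterminate as the third argument, e.g. align[ts, daterange, Indeterminate], should give the same filler as in the MapThread version above. The level specification {1} keeps Replace from looking inside the individual date lists.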