开发者

Identify important minima and maxima in time-series w/ Mathematica

I need a way to identify local minima and maxima in time series data with Mathematica. This seems like it should be an easy thing to do, but it gets tricky. I posted this on the MathForum, but thought I might get some additional eyes on it here.

You can find a paper that discusses the problem at: http://www.cs.cmu.edu/~eugene/research/full/compress-series.pdf

I've tried this so far…

Get and format some data:

data = FinancialData["SPY", {"May 1, 2006", "Jan. 21, 2011"}][[All, 2]];
data = data/First@data;
data = Transpose[{Range[Length@data], data}];

Define 2 functions:

First method:

findMinimaMaxima[data_, window_] := With[{k = window},
  data[[k + Flatten@Position[Partition[data[[All, 2]], 2 k + 1, 1],  x_List /;  x[[k + 1]] < Min[Delete[x, k + 1]] || x[[k + 1]] > Max[Delete[x, k + 1]]]]]]

Now another approach, although not as flexible:

findMinimaMaxima2[data_] := data[[Accumulate@(Length[#] & /@ Split[Prepend[Sign[Rest@data[[All, 2]] - Most@data[[All, 2]]], 0]])]]

Look at what each the functions does. First findMinimaMaxima2[]:

minmax = findMinimaMaxima2[data];
{Length@data, Length@minmax}
ListLinePlot@minmax

This selects all minima and maxima and results (in this instance) in about a 49% data compression, but it doesn't have the flexibility of expanding the window. This other method does. A window of 2, yields fewer and arguably more important extrema:

minmax2 = findMinimaMaxima[data, 2];
{Length@data, Length@minmax2}
ListLinePlot@minmax2

But look at what happens when we expand the window to 60:

minmax2 = findMinimaMaxima[data, 60];
ListLinePlot[{data, minmax2}]

Some of the minima and maxima no longer alternate. Applying findMinimaMaxima2[] to the output of findMinimaMaxima[] gives a workaround...

minmax3 = findMinimaMaxima2[minmax2];
ListLinePlot[{data, minmax2, minmax3}]

, but this seems like a clumsy way to address the problem.

So, the idea of using a fixed window to look left and right doesn't quite do everything one would like. I began thinking about an alternative that could use a range value R (e.g. a percent move up or down) that the function would need to meet or exceed to set the next minima or maxima. Here's my first try:

findMinimaMaxima3[data_, R_] := Module[{d, n, positions},
  d = data[[All, 2]];
  n = Transpose[{data[[All, 1]], Rest@FoldList[If[(#2 <= #1 + #1*R && #2 >= #1) || (#2 >= #1 - #1* R && #2 <= #1), #1, #2] &, d[[1]], d]}];
  n = Sign[Rest@n[[All, 2]] - Most@n[[All, 2]]];
  positions = Flatten@Rest[Most[Position[n, Except[0]]]];
  data[[positions]]
  ]

minmax4 = findMinimaMaxima3[data, 0.1];
ListLinePlot[{data, minmax4}]

This too benefits from post processing with findMinimaMaxima2[]

ListLinePlot[{data, findMinimaMaxima2[minmax4]}]

But if you look closely, you see that it misses the extremes if they go beyond the R value in several positions - including the chart's absolute minimum and maximum as well as along the big moves up and down. Changing the R value shows how it misses the top and bottoms even more:

minmax4 = findMinimaMaxima3[data, 0.15];
ListLinePlot[{data, minmax4}]

So, I need to reconsider. Anyone can look at a plot of the data and easily identify the important minima and maxima. It seems harder to get an algorithm to do it. A window and/or an R value seem important to the solution, but neither on their own seems enough (at least not in the approaches above)开发者_JAVA技巧.

Can anyone extend any of the approaches shown or suggest an alternative to identifying the important minima and maxima?

Happy to forward a notebook with all of this code and discussion in it. Let me know if anyone needs it.

Thank you, Jagra


I suggest to use an iterative approach. The following functions are taken from this post, and while they can be written more concisely without Compile, they'll do the job:

localMinPositionsC = 
 Compile[{{pts, _Real, 1}}, 
   Module[{result = Table[0, {Length[pts]}], i = 1, ctr = 0}, 
    For[i = 2, i < Length[pts], i++, 
     If[pts[[i - 1]] > pts[[i]] && pts[[i + 1]] > pts[[i]], 
      result[[++ctr]] = i]];
    Take[result, ctr]]];

localMaxPositionsC = 
  Compile[{{pts, _Real, 1}}, 
    Module[{result = Table[0, {Length[pts]}], i = 1, ctr = 0}, 
      For[i = 2, i < Length[pts], i++, 
        If[pts[[i - 1]] < pts[[i]] && pts[[i + 1]] < pts[[i]], 
          result[[++ctr]] = i]];
       Take[result, ctr]]];

Here is your data plot:

dplot = ListLinePlot[data]

Here we plot the mins, which are obtained after 3 iterations:

mins = ListPlot[Nest[#[[localMinPositionsC[#[[All, 2]]]]] &, data, 3],
   PlotStyle -> Directive[PointSize[0.015], Red]]

The same for maxima:

maxs = ListPlot[Nest[#[[localMaxPositionsC[#[[All, 2]]]]] &, data, 3],
   PlotStyle -> Directive[PointSize[0.015], Green]]

And the resulting plot:

Show[{dplot, mins, maxs}]

Identify important minima and maxima in time-series w/ Mathematica

You may vary the number of iterations, to get more coarse-grained or finer minima/maxima.

Edit:

actually, I just noticed that a couple of points were still missed by this method, both for the minima and maxima. So, I suggest it as a starting point, not as a complete solution. Perhaps, you could analyze minima/maxima, coming from different iterations, and sometimes include those from a "previous", more fine-grained one. Also, the only "physical reason" that this kind of works, is that the nature of the financial data appears to be fractal-like, with several distinctly different scales. Each iteration in the above Nest-s targets a particular scale. This would not work so well for an arbitrary signal.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜