Detecting change in streaming data

I have a streaming set of values that I would like to analyze for abrupt changes, while possibly ignoring spikes/noise in the data. I've looked at moving averages, winsorised means and several other possible solutions, including PID controllers in control systems, the Colt library and NumPy, for clues as to how to solve this.

A sample dataset is below.

22.0, 22.0, 22.0, 22.0, 20.8806130178211, 20.8806130178211, 20.8806130178211, 20.8806130178211, 20.8806130178211, 20.8806130178211, 21.840329667841555, 21.840329667841555, 20.8806130178211, 20.8806130178211, 20.8806130178211, 20.8806130178211, 20.8806130178211, 20.8806130178211, 21.840329667841555, 21.840329667841555, 21.840329667841555, 21.840329667841555, 22.80350850198276

Ideally I would like to detect that the values change in the 1st, 3rd and 4th sections in bold. The second section can be treated like a spike.

I am looking for an elegant mathematical/algorithmic solution that works like a moving average: if the data does not change for a long time (over a window that is dynamic), it ignores old data. In the case of the above data, the initial values of 22 are ignored when considering the next window of data, which is 20.8806130178211.

The solution (program/class) should be able to accept a new data input value (e.g. 22.0232) and return true if it computes that the value is within the acceptable range, i.e. it hasn't changed considerably, and false otherwise.

Thanks

sfk


Perhaps a better approach than looking at the moving average of your data is to look at the moving average of the change in your data. You could take the first difference of your dataset and flag the differences whose magnitude is greater than some threshold.
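As a rough illustration of the first-difference idea, here is a minimal Python sketch. The names `first_differences`, `change_indices` and `StreamingChangeDetector`, and the threshold of 0.5, are my own assumptions for the example, not something from the question; it uses the plain first difference rather than a moving average of it, and you would need to tune the threshold for your data.

```python
def first_differences(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


def change_indices(values, threshold):
    """Return each index i where |values[i] - values[i - 1]| > threshold."""
    return [
        i + 1
        for i, diff in enumerate(first_differences(values))
        if abs(diff) > threshold
    ]


class StreamingChangeDetector:
    """Hypothetical streaming variant: compares each new value to the last one."""

    def __init__(self, threshold):
        self.threshold = threshold  # assumed tolerance, tune per dataset
        self.previous = None        # last value seen, None before any input

    def is_within_range(self, value):
        """Return True if value has not moved more than threshold
        from the previous value (the first value is always accepted)."""
        previous, self.previous = self.previous, value
        return previous is None or abs(value - previous) <= self.threshold


if __name__ == "__main__":
    sample = [22.0, 22.0, 20.8806130178211, 20.8806130178211, 21.840329667841555]
    print(change_indices(sample, threshold=0.5))

    detector = StreamingChangeDetector(threshold=0.5)
    for v in sample:
        print(v, detector.is_within_range(v))
```

Note that this on its own does not tell a spike from a lasting change: a spike shows up as two jumps of opposite sign close together, so you would still need some extra rule (for example, waiting a few samples before accepting a new level) to ignore it.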
