Timeseries transformations in Ruby, Yahoo! Pipes-style

I'm trying to build a system for programmatically filtering timeseries data and wonder if this problem has been solved, or at least hacked at, before. It seems that it's a perfect opportunity to do some Ruby block magic, given the scope and passing abilities; however, I'm still a bit short of fully grokking how to take advantage of blocks.

To wit:

Pulling data from my database, I can create either a hash or an array, let's use array:

data = [[timestamp0, value0],[timestamp1,value1], … [timestampN, valueN]]

Then I can add a method to array, maybe something like:

class Array
  def filter &block
    …
    self.each_with_index do |v, i|
      …
      # Always call with timestamp, value, index
      block.call(v[0], v[1], i)
      …
    end
  end
end

I understand that one of the powers of Ruby blocks is that the passed block of code runs within the scope of its closure. So calling data.filter should somehow allow me to work with that scope, but I can only figure out how to do it without taking advantage of the scope. To wit:

# average if we have a single null value, assumes data is correctly ordered
data.filter do |t, v, i|
  # Of course, we do some error checking…
  # data[i] is a [timestamp, value] pair, so average the values at index 1
  (data[i-1][1] + data[i+1][1]) / 2.0 if v.nil?
end
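
For illustration, here is a minimal runnable sketch of the same idea. The helper name `filter_series` is hypothetical; a standalone method is used instead of reopening Array partly because Ruby 2.6 and later already define `Array#filter` as an alias for `select`, so redefining it would shadow a built-in. The block reads `data` from the enclosing scope, which is the closure behaviour described above.

```ruby
# Hypothetical helper: yields timestamp, value and index for each point and
# collects the block's return values into a new array of [timestamp, value].
def filter_series(series)
  series.each_with_index.map do |(t, v), i|
    [t, yield(t, v, i)]
  end
end

data = [[0, 1.0], [1, nil], [2, 3.0]]

# The block is a closure, so it can look at neighbouring points in `data`.
filled = filter_series(data) do |_t, v, i|
  left  = i > 0 ? data[i - 1][1] : nil
  right = i < data.length - 1 ? data[i + 1][1] : nil
  if v.nil? && left && right
    (left + right) / 2.0 # average over a single missing value
  else
    v                    # leave everything else untouched
  end
end

p filled # => [[0, 1.0], [1, 2.0], [2, 3.0]]
```

Because the helper builds its result with `map`, the original `data` array is left unchanged.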

What I actually want to do is (allow the user to) build up mathematical filters programmatically. Taking it one step at a time, we'll first build some functions:

def average_single_values(args)
  #average over single null values
  #return filterable array
end

def filter_by_std(args)
  #limit results to those within N standard deviations
  #return filterable array
end

def pull_bad_values(args)
  #delete or replace values seen as "bad"
  #return filterable array
end

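# method(:name) wraps each method in a callable object instead of invoking it
my_filters = [method(:average_single_values), method(:filter_by_std), method(:pull_bad_values)]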
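
As one possible shape for such a filter, the standard-deviation step could be sketched as follows. The name `std_filter`, its single argument `n`, and the choice of the population standard deviation are all assumptions made for illustration; the stubs above do not specify them.

```ruby
# Hypothetical builder: returns a lambda that keeps only the points whose
# value lies within n population standard deviations of the mean.
# Points with a nil value are dropped.
def std_filter(n)
  lambda do |series|
    values = series.map { |_t, v| v }.compact
    next [] if values.empty?

    mean     = values.sum / values.length.to_f
    variance = values.sum { |v| (v - mean)**2 } / values.length
    std      = Math.sqrt(variance)

    series.select { |_t, v| !v.nil? && (v - mean).abs <= n * std }
  end
end

data = [[0, 1.0], [1, 1.0], [2, 1.0], [3, 1.0], [4, 11.0]]
p std_filter(1).call(data) # => [[0, 1.0], [1, 1.0], [2, 1.0], [3, 1.0]]
```

Returning a lambda means the threshold is fixed when the filter is built, so the resulting object only needs the data when it is called.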

Then, having a list of filters, I figure (somehow) I should be able to do:

data.filter do |t, v, i|
  my_filters.each do |f|
    f.call t, v, i
  end
end

or, assuming a different filter implementation:

filtered_data = data.filter my_filters

This would probably be the better design, since it returns a new array and is non-destructive.

The result would be an array that has been run through all of the filters. The eventual goal is to have static data arrays that can be run through arbitrary filters, and filters that can be passed around (and shared) as objects, the way Yahoo! Pipes does with feeds. I'm not looking for too general a solution right now; I can make the format of the arrays and return values strict.

Has anyone seen something similar in Ruby? Or have some basic pointers?


The first half of your question about working in the scope of the array seems unnecessary and irrelevant to your problem. As for creating operations to manipulate data with blocks, you can use Proc instances ("procs"), which essentially are blocks stored in an object. For example, if you want to store them with names, you can create a hash of filters:

my_filters = {}

my_filters[:filter_name] = lambda do |*args|
  # filter body here...
end

You do not need to name them, of course, and can use arrays. Then, to run some data through an ordered series of filters, use the helpful Enumerable#inject method:

my_filters.inject(data) do |result, filter|
  filter.call result
end
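
To make that concrete, here is a small self-contained sketch. The two filters and their constant names are hypothetical stand-ins loosely modelled on the stubs in the question; each is assumed to take an array of [timestamp, value] pairs and return a new array.

```ruby
# Hypothetical filter: replace an isolated nil with the mean of its neighbours.
AVERAGE_SINGLE_VALUES = lambda do |series|
  series.each_with_index.map do |(t, v), i|
    left  = i > 0 ? series[i - 1][1] : nil
    right = i < series.length - 1 ? series[i + 1][1] : nil
    if v.nil? && left && right
      [t, (left + right) / 2.0]
    else
      [t, v]
    end
  end
end

# Hypothetical filter: drop any point whose value is still nil.
PULL_BAD_VALUES = lambda do |series|
  series.reject { |_t, v| v.nil? }
end

my_filters = [AVERAGE_SINGLE_VALUES, PULL_BAD_VALUES]
data = [[0, 1.0], [1, nil], [2, 3.0], [3, nil], [4, nil], [5, 6.0]]

# Each filter receives the previous filter's output; data itself is untouched.
filtered_data = my_filters.inject(data) do |result, filter|
  filter.call(result)
end

p filtered_data # => [[0, 1.0], [1, 2.0], [2, 3.0], [5, 6.0]]
```

Since every filter has the same array-in, array-out shape, the order of the list is the order of the pipeline, and filters can be added, removed or shared without changing the code that runs them.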

It requires no monkeypatching, either!
