Data manipulation in R in LINQ style

Is there an R package that supports call-chain-style data manipulation, as in C#/LINQ or F#? I would like to be able to write code like this:

var list = new[] {1,5,10,12,1};
var newList = list
  .Where(x => x > 5)
  .GroupBy(x => x%2)
  .OrderBy(x => x.Key.ToString())
  .Select(x => "Group: " + x.Key)
  .ToArray();


I don't know of one, but here's the start of what it could look like:

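`%then%` = function(x, body) {
    # Capture both operands unevaluated.
    x = substitute(x)
    fl = as.list(substitute(body))
    # car: the function to call; cdr: any arguments already written after it.
    car = fl[[1L]]
    cdr = {
        if (length(fl) == 1)
            list()
        else
            fl[-1L]
    }
    # Rebuild the call with x spliced in as the first argument.
    combined = as.call(
        c(list(car, x), cdr)
    )
    eval(combined, parent.frame())
}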

df = data.frame(x = 1:7)
df %then% subset(x > 2) %then% print

This prints

  x
3 3
4 4
5 5
6 6
7 7

If you keep using hacks like that it should be pretty simple to get the kind of syntax you find pleasing ;-)
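To make the rewriting step visible, here is a minimal sketch that builds the same call as `%then%` but returns it unevaluated instead of running it. The name `then_call` is hypothetical and used only for illustration; it is not part of any package.

```r
# Hypothetical helper: performs the same call rewriting as %then%,
# but returns the constructed call instead of evaluating it.
then_call <- function(x, body) {
    x  <- substitute(x)              # left-hand side, unevaluated
    fl <- as.list(substitute(body))  # function name plus any written arguments
    # Splice x in as the first argument; fl[-1L] is empty for a bare name.
    as.call(c(list(fl[[1L]], x), fl[-1L]))
}

# Nothing is evaluated here, so df does not need to exist.
print(then_call(df, subset(x > 2)))
print(then_call(df, print))
```

The first call shows `subset(x > 2)` being turned into `subset(df, x > 2)`, which is why the right-hand side can be written without its leftmost argument.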

edit: combined with plyr, this becomes not bad at all:

library(plyr)  # provides ddply(), .() and summarize()

(data.frame(
    x = c(1, 1, 1, 2, 2, 2),
    y = runif(6)
)
    %then% subset(y > 0.2)
    %then% ddply(.(x), summarize,
            ysum   = sum(y),
            ycount = length(y)
        )
    %then% print
)


dplyr chaining syntax resembles LINQ (stock example):

library(dplyr)
library(nycflights13)  # provides the flights data set

flights %>%
  group_by(year, month, day) %>%
  select(arr_delay, dep_delay) %>%
  summarise(
    arr = mean(arr_delay, na.rm = TRUE),
    dep = mean(dep_delay, na.rm = TRUE)
  ) %>%
  filter(arr > 30 | dep > 30)

Source: the "Chaining" section of the "Introduction to dplyr" vignette.


(Not an answer; more an extended comment on Owen's answer.) Owen's answer helped me understand what you were after, and I thoroughly enjoyed reading it. This "outside to inside" style reminded me of an example on the help(Reduce) page, where the Funcall function is defined and then applied successively:

## Iterative function application:
Funcall <- function(f, ...) f(...)
## Compute log(exp(acos(cos(0))))
Reduce(Funcall, list(log, exp, acos, cos), 0, right = TRUE)

What I find especially intriguing about Owen's macro is that it essentially redefines the argument processing of existing functions. I tried to think of how I might supply arguments to the "interior" functions in the Funcall approach, and then realized that his %then% function had already solved that problem: each function is written without its leftmost argument but with all of its remaining arguments. Brilliant!
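For comparison, one way to give the interior functions extra arguments in the Funcall approach is to wrap each step in an anonymous function. The sketch below is only a rough base-R analogue of the LINQ query in the question (filter, key by parity, distinct keys, order, label); the step list is an assumption made for illustration, not something taken from help(Reduce).

```r
Funcall <- function(f, ...) f(...)

# With right = TRUE the LAST element of the list is applied first,
# so the steps are listed from outermost to innermost.
steps <- list(
    function(v) paste("Group:", v),  # Select: label each key
    sort,                            # OrderBy: order the keys
    unique,                          # GroupBy: keep the distinct keys
    function(v) v %% 2,              # key function: parity
    function(v) v[v > 5]             # Where: keep values greater than 5
)

result <- Reduce(Funcall, steps, c(1, 5, 10, 12, 1), right = TRUE)
print(result)
```

Compared with `%then%`, every extra argument has to be hidden inside a closure, and the steps read in reverse order of execution.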


https://github.com/slycoder/Rpipe

c(1,1,1,6,4,3) %|% sort() %|% unique()
# result => c(1,3,4,6)

Admittedly, it would be nice to have a where function here, or alternatively to be able to pass in anonymous functions, but hey, the source code is there: fork it and add it if you want.
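As a starting point, a `where` function could be a thin wrapper around base R's `Filter`. This is a sketch only: `where` is a hypothetical name, it is not part of Rpipe, and whether `%|%` would accept it (or an anonymous function) as written has not been checked, so the example calls it directly.

```r
# Hypothetical where(): keep the elements of x for which pred is TRUE.
# Argument order puts the data first so it could sit in a pipeline.
where <- function(x, pred) Filter(pred, x)

result <- unique(sort(where(c(1, 1, 1, 6, 4, 3), function(v) v > 1)))
print(result)
```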
