Data manipulation in R in LINQ style
Is there a package in R that supports call-chain style data manipulation, as in C#/LINQ or F#? I would like to be able to write code in a style like this:
var list = new[] {1,5,10,12,1};
var newList = list
    .Where(x => x > 5)
    .GroupBy(x => x%2)
    .OrderBy(x => x.Key.ToString())
    .Select(x => "Group: " + x.Key)
    .ToArray();
I don't know of one, but here's the start of what it could look like:
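`%then%` = function(x, body) {
  # Capture both operands unevaluated
  x = substitute(x)
  fl = as.list(substitute(body))
  # car: the function being called; cdr: its remaining arguments
  car = fl[[1L]]
  cdr = {
    if (length(fl) == 1)
      list()
    else
      fl[-1L]
  }
  # Rebuild the call with x spliced in as the first argument
  combined = as.call(
    c(list(car, x), cdr)
  )
  eval(combined, parent.frame())
}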
df = data.frame(x = 1:7)
df %then% subset(x > 2) %then% print
This prints
x
3 3
4 4
5 5
6 6
7 7
If you keep using hacks like that it should be pretty simple to get the kind of syntax you find pleasing ;-)
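To see what the operator builds before it evaluates anything, here is a minimal sketch of a variant that returns the rewritten call instead of running it. The name `%then_call%` is hypothetical and only meant for inspection; it is not part of the answer above.

```r
# Hypothetical inspection variant of %then%: same rewriting, no eval()
`%then_call%` = function(x, body) {
  x = substitute(x)                  # left operand, unevaluated
  fl = as.list(substitute(body))     # right operand split into function + args
  # Splice x in as the first argument; fl[-1L] is an empty list for a bare name
  as.call(c(list(fl[[1L]], x), fl[-1L]))
}

df = data.frame(x = 1:7)
built = df %then_call% subset(x > 2)
print(built)            # the call object: subset(df, x > 2)
result = eval(built)    # evaluating it gives the rows with x > 2
```

So the left-hand side is simply inserted as the first argument of whatever call appears on the right, which is why functions that take the data first (subset, ddply, print) chain without any changes.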
Edit: combined with plyr, this becomes not bad at all:
library(plyr)  # provides ddply, .() and summarize

(data.frame(
    x = c(1, 1, 1, 2, 2, 2),
    y = runif(6)
  )
  %then% subset(y > 0.2)
  %then% ddply(.(x), summarize,
    ysum = sum(y),
    ycount = length(y)
  )
  %then% print
)
dplyr chaining syntax resembles LINQ (stock example):
library(dplyr)
library(nycflights13)  # provides the flights data set

flights %>%
  group_by(year, month, day) %>%
  select(arr_delay, dep_delay) %>%
  summarise(
    arr = mean(arr_delay, na.rm = TRUE),
    dep = mean(dep_delay, na.rm = TRUE)
  ) %>%
  filter(arr > 30 | dep > 30)
Source: Introduction to dplyr, "Chaining" section.
(Not an answer; more of an extended comment on Owen's answer.) Owen's answer helped me understand what you were after, and I thoroughly enjoyed reading it. This "outside to inside" style reminded me of an example on the help(Reduce) page, where the Funcall function is defined and then successively applied:
## Iterative function application:
Funcall <- function(f, ...) f(...)
## Compute log(exp(acos(cos(0))))
Reduce(Funcall, list(log, exp, acos, cos), 0, right = TRUE)
What I find especially intriguing about Owen's macro is that it essentially redefines the argument processing of existing functions. I tried to think of how I might provide arguments to the "interior" functions with the Funcall approach, and then realized that his %then% function had already sorted that task out: each function is written without its leftmost argument but with all of its other, right-hand arguments. Brilliant!
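For comparison, one way to get extra arguments into a Reduce-based chain is to wrap each step in a closure that takes only the data. This is just a sketch in base R; the names `steps` and `pipeline` are made up for the illustration.

```r
# Each step is a one-argument function; the other arguments are fixed inside
steps = list(
  function(d) subset(d, x > 2),
  function(d) transform(d, y = x * 2),
  function(d) d[order(-d$x), ]
)

# Left-to-right application: the accumulator is the data, each element a function
pipeline = function(data, steps) {
  Reduce(function(acc, f) f(acc), steps, data)
}

out = pipeline(data.frame(x = 1:7), steps)
print(out)
```

This works, but every step needs an explicit `function(d) ...` wrapper, which is exactly the noise that %then% removes by rewriting the call.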
https://github.com/slycoder/Rpipe
c(1,1,1,6,4,3) %|% sort() %|% unique()
# result => c(1,3,4,6)
Admittedly, it would be nice to have a where function here, or alternatively to allow anonymous functions to be passed in, but hey, the source code is there: fork it and add it if you want.
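As a rough idea of what that could look like, here is a sketch in plain R. Both `%pipe%` and `where` are hypothetical names invented for this example; they are not part of Rpipe, and its `%|%` operator may work differently.

```r
# Hypothetical operator: apply the function on the right to the value on the left
`%pipe%` = function(lhs, f) f(lhs)

# Hypothetical helper: build a filtering step from a predicate
where = function(pred) function(v) Filter(pred, v)

res = c(1, 1, 1, 6, 4, 3) %pipe%
  sort %pipe%
  unique %pipe%
  where(function(v) v > 2)
print(res)
```

Because the right-hand side is an ordinary function value, anonymous functions can be passed in directly, at the cost of writing `sort` rather than `sort()`.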