WHERE equivalent in R: multiplication conditional on another column of the same data frame

I am trying to multiply a data.frame column by a scalar A or a scalar B, depending on the value of a third column (id) of the same data.frame. Somehow I have a problem (ordering or sorting?); so far the result is definitely wrong. Here is one of my tries:

mydf$result = subset(mydf,myid==123,multiplyme)*0.6 +
subset(mydf,myid==124,,multiplyme)*0.4

I also tried the %in% syntax, but without success. I know I could use MySQL, for example, and connect it to R, but in this case I just want to use (basic) R, or at most plyr. For those of you who prefer code over my rambling, here's how I'd do it in SQL:

SELECT
MIN(CASE WHEN myid=123 THEN multiplyme*0.6 END),
MIN(CASE WHEN myid=124 THEN multiplyme*0.4 END)
FROM mytable
GROUP BY result;

Thanks in advance for any help or R code suggestions! Please note that I have more than 2 ids!


Assuming you only have 123 or 124 in myid:

mydf$result <- mydf$multiplyme * ifelse(mydf$myid==123,0.6,0.4)

If you have other values in myid, add an extra ifelse and a default case.

EDIT:

Since you have extra values in myid, here is the expanded version.

mydf$result <- mydf$multiplyme * ifelse(mydf$myid==123,0.6,ifelse(mydf$myid==124,0.4,0))

You can change the 0 at the end to a 1 if, in the default case, you want to keep the value of multiplyme. This can be extended into a chain of ifelse calls if you want a different multiplier for many values.
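As a small worked example of the two default choices, here is a sketch on made-up data (the five rows and the extra id 125 are assumptions for illustration, not from the question):

```r
# Hypothetical sample data; id 125 has no multiplier of its own
mydf <- data.frame(multiplyme = c(1, 2, 3, 4, 5),
                   myid = c(123, 124, 124, 123, 125))

# Default 0: rows with an unlisted id are zeroed out
mydf$result0 <- mydf$multiplyme *
  ifelse(mydf$myid == 123, 0.6, ifelse(mydf$myid == 124, 0.4, 0))

# Default 1: rows with an unlisted id keep multiplyme unchanged
mydf$result1 <- mydf$multiplyme *
  ifelse(mydf$myid == 123, 0.6, ifelse(mydf$myid == 124, 0.4, 1))

print(mydf)
```

The row with id 125 ends up as 0 in result0 and as 5 in result1; all other rows are identical in both columns.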

However, as mbq comments below, you can use a switch statement if it begins to get unwieldy:

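mydf$result <- mydf$multiplyme * sapply(mydf$myid,function(x) switch(as.character(x),"123"=0.6,"124"=0.4,0))

The trailing unnamed 0 is the default for any id that is not listed; without a default, switch returns NULL for such ids and the multiplication fails. This would probably be slower, though, because sapply loops over the rows one at a time, whereas ifelse is vectorised.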
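If the chain of ids grows long, one vectorised alternative (not from the answers here, just a sketch) is to keep the multipliers in a named vector and index it by id. The sample data, the name mult, and the default of 0 are assumptions for illustration:

```r
# Hypothetical sample data; id 125 is deliberately missing from the lookup
mydf <- data.frame(multiplyme = c(1, 2, 3, 4, 5),
                   myid = c(123, 124, 124, 123, 125))

# Lookup table as a named numeric vector: names are ids, values are multipliers
mult <- c("123" = 0.6, "124" = 0.4)

# Index by name for every row at once; ids missing from the table give NA
m <- unname(mult[as.character(mydf$myid)])

# Replace NA with a default multiplier of 0 (use 1 to keep multiplyme as-is)
m[is.na(m)] <- 0

mydf$result <- mydf$multiplyme * m
```

Adding another id then only means adding one entry to mult, and there is no per-row function call.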


The command should be:

subset(mydf,myid==123,multiplyme)

or

mydf$multiplyme[mydf$myid==123]

The R equivalent of your SQL command is:

min(mydf$multiplyme[mydf$myid==123]*0.6)+min(mydf$multiplyme[mydf$myid==124]*0.4)
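To see what these expressions return, here is a sketch on made-up data (the four rows are an assumption for illustration). It also hints at why the attempt in the question goes wrong: each subset holds only the matching rows, so the pieces no longer line up row by row with mydf.

```r
# Hypothetical sample data
mydf <- data.frame(multiplyme = c(1, 2, 3, 4),
                   myid = c(123, 124, 124, 123))

# subset() with a select argument returns a data frame of the matching rows only
a <- subset(mydf, myid == 123, multiplyme)
nrow(a)  # 2 rows, not the 4 rows of mydf

# Logical indexing returns a plain numeric vector of the matching values
b <- mydf$multiplyme[mydf$myid == 123]

# One number per id, summed: the smallest scaled value of each group
total <- min(mydf$multiplyme[mydf$myid == 123] * 0.6) +
  min(mydf$multiplyme[mydf$myid == 124] * 0.4)
```

Note that this yields a single number, not a new column; for a per-row result, a vectorised multiplier as in the other answers is the better fit.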


If you really have only two values of myid, then ifelse is a simple solution:

> mydf<-data.frame(multiplyme=c(1,2,3,4),myid=c(123,124,124,123))
> with(mydf,multiplyme*ifelse(myid==123,0.6,0.4))
[1] 0.6 0.8 1.2 2.4

For a small number of possible values of myid you can use nested calls to ifelse. But merge provides a cleaner option if myid can take many possible values:

> multdf<-data.frame(myid=c(123,124),m=c(0.6,0.4))
> mydf<-merge(mydf,multdf)
> mydf
  myid multiplyme   m
1  123          1 0.6
2  123          4 0.6
3  124          2 0.4
4  124          3 0.4
> with(mydf,multiplyme*m)
[1] 0.6 2.4 0.8 1.2

Note that merge rearranges the rows (by default it sorts the result on the merge key), so you may want a variable or row names to identify observations.
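Two ways to keep the original row order are sketched below, using the same sample data as above. The column name rowid is a hypothetical helper, and the match() variant is an alternative to merge rather than part of it:

```r
mydf <- data.frame(multiplyme = c(1, 2, 3, 4),
                   myid = c(123, 124, 124, 123))
multdf <- data.frame(myid = c(123, 124), m = c(0.6, 0.4))

# Option 1: tag the rows before merging, then restore the original order
mydf$rowid <- seq_len(nrow(mydf))
merged <- merge(mydf, multdf)
merged <- merged[order(merged$rowid), ]
merged$result <- merged$multiplyme * merged$m

# Option 2: look each multiplier up with match(), which never reorders mydf;
# ids that are absent from multdf would give NA here
mydf$result <- mydf$multiplyme * multdf$m[match(mydf$myid, multdf$myid)]
```

Both variants give the results in the order of the original rows. The match() version avoids the intermediate data frame, while the merge() version is easier to extend when the lookup table carries several columns.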
