Matching Columns, Creating Loop in R

I have the following question:

I have a data frame that looks like this, with prices for 3 X's and 2 R's.

Date    Name  Price  Interest
01.02.10 X  120     0.2
01.02.10 R  120     0.3
01.02.10 X  130     0.8
01.02.10 X  140     0.4
01.02.10 R  130     0.2
etc.

I would like to tell R to look for pairs of X's and R's with the same price and delete the rest. This should result in 2 X's and 2 R's (in this case):

Date    Name  Price  Interest
01.02.10 X  120     0.2
01.02.10 R  120     0.3
01.02.10 X  130     0.8
01.02.10 R  130     0.2
etc.

To make it clearer (hopefully): I have a lot of different prices for each date. Each row has either an X or an R in it. There are a lot of pairs on each date, for example X, Price = 120 and R, Price = 120 on Date 1. But there are also Prices which match only one Name, for example Price = 140 exists only for Name = X. So what I would like R to do is: check for matching Names for one Price (i.e. the same Price exists for one X and one R) and delete the rest. The result should contain the same number of X's and R's, because I'm looking for pairs.

I'm sorry not to be able to post something I tried. I just couldn't think of anything.

Now, to the next problem: If the pairs are there, I would like to tell R to check each line. If the Name is X, I want it to calculate a new price, if not just print the existing price. I tried

xx <- if(Name == "X"){Price + 100*interest} else print{Price}

but it didn't work.

Thanks for help

Cheers Dani


Edit: @DWin's comment to the Q was a bit cryptic, and seeing as my first attempt at part 1 of the Q was not correct due to the unclear Q, I'll try to redeem myself by expanding on DWin's comment:

[Assuming dat contains the data you quote in the Q.] First, merge dat with itself:

> foo <- merge(dat[, -4], dat, by.x = "Date", by.y = "Date")
> head(foo)
      Date Name.x Price.x Name.y Price.y Interest
1 01.02.10      X     120      X     120      0.2
2 01.02.10      X     120      R     120      0.2
3 01.02.10      X     120      X     130      0.2
4 01.02.10      X     120      X     140      0.2
5 01.02.10      X     120      R     130      0.2
6 01.02.10      R     120      X     120      0.2

Next, extract the rows where Price.x == Price.y and Name.x != Name.y:

> (foo <- foo[with(foo, which(Price.x == Price.y & Name.x != Name.y)),])
       Date Name.x Price.x Name.y Price.y Interest
2  01.02.10      X     120      R     120      0.2
6  01.02.10      R     120      X     120      0.2
15 01.02.10      X     130      R     130      0.2
23 01.02.10      R     130      X     130      0.2

Then, get rid of the superfluous columns:

> (foo <- foo[, -(4:5)])
       Date Name.x Price.x Interest
2  01.02.10      X     120      0.2
6  01.02.10      R     120      0.2
15 01.02.10      X     130      0.2
23 01.02.10      R     130      0.2

And finally, fix-up the column names:

> names(foo) <- names(dat)
> foo
       Date Name Price Interest
2  01.02.10    X   120      0.2
6  01.02.10    R   120      0.2
15 01.02.10    X   130      0.2
23 01.02.10    R   130      0.2
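
One thing to watch in the merge above: the Interest column comes from the second copy of dat, so after Name.y and Price.y are dropped, each remaining row carries its partner's Interest rather than its own. The printed output does not show this because Interest is 0.2 in every row there, unlike the values quoted in the Q. As an alternative, here is a minimal sketch that filters dat in place, so every row keeps its own Interest. It assumes dat holds the five rows quoted in the Q and that each Name occurs at most once per Date/Price combination; the names has_x, has_r and paired are made up for illustration.

```r
# Rebuild the example data from the question (values copied from the post)
dat <- data.frame(
  Date     = rep("01.02.10", 5),
  Name     = c("X", "R", "X", "X", "R"),
  Price    = c(120, 120, 130, 140, 130),
  Interest = c(0.2, 0.3, 0.8, 0.4, 0.2),
  stringsAsFactors = FALSE
)

# Within each Date/Price group: is there at least one X? At least one R?
has_x <- ave(dat$Name == "X", dat$Date, dat$Price, FUN = any)
has_r <- ave(dat$Name == "R", dat$Date, dat$Price, FUN = any)

# Keep only the rows whose Date/Price group contains both names
paired <- dat[has_x & has_r, ]
paired
```

With the data from the Q this keeps rows 1, 2, 3 and 5, i.e. it drops the X with Price = 140. If a Name can occur more than once at the same Date and Price, all of those rows are kept, so the counts of X's and R's may then differ.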

The second part can be done using ifelse():

with(dat, ifelse(Name == "X", Price + 100*Interest, Price))

Which gives something like this:

> with(dat, ifelse(Name == "X", Price + 100*Interest, Price))
[1] 140 120 150 160 130

The reason that the if() doesn't work is that if() only takes a scalar logical (a single TRUE or FALSE), yet Name == "X" returns a logical vector:

> with(dat, Name == "X")
[1]  TRUE FALSE  TRUE  TRUE FALSE

In these cases, ifelse() is your friend.
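
For completeness, a small sketch of the same ifelse() call applied to the Interest values actually quoted in the Q, storing the result as a new column. The column name NewPrice is just an illustrative choice, not something from the Q.

```r
# Example data as quoted in the question
dat <- data.frame(
  Date     = rep("01.02.10", 5),
  Name     = c("X", "R", "X", "X", "R"),
  Price    = c(120, 120, 130, 140, 130),
  Interest = c(0.2, 0.3, 0.8, 0.4, 0.2),
  stringsAsFactors = FALSE
)

# ifelse() tests every row: X rows get Price + 100 * Interest,
# all other rows keep their existing Price
dat$NewPrice <- with(dat, ifelse(Name == "X", Price + 100 * Interest, Price))
dat$NewPrice
# [1] 140 120 210 180 130
```

Because ifelse() returns a vector as long as its test, the result can be assigned straight back into the data frame, which is not possible with a single if().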
