Matching Columns, Creating Loop in R
I have the following question:
I have a data frame which looks like this. It contains prices, 3 X's and 2 R's.
Date Name Price Interest
01.02.10 X 120 0.2
01.02.10 R 120 0.3
01.02.10 X 130 0.8
01.02.10 X 140 0.4
01.02.10 R 130 0.2
etc.
I would like to tell R to look for pairs of X's and R's with the same price, and delete the rest. This should result in 2 X's and 2 R's (in this case).
Date Name Price Interest
01.02.10 X 120 0.2
01.02.10 R 120 0.3
01.02.10 X 130 0.8
01.02.10 R 130 0.2
etc.
To make it clearer (hopefully): I have a lot of different prices for each date. Each row has either an X or an R in it. There are a lot of pairs on each date, for example X, Price = 120 and R, Price = 120 on Date 1. But there are also prices which only match one Name, for example there is a Price = 140 only for Name = X. So what I would like R to do is: check for matching Names for one Price (i.e. the same Price exists for one X and one R) and delete the rest. The result should contain the same number of X's and R's, because I'm looking for pairs.
I'm sorry not to be able to post something I tried. I just couldn't think of anything.
Now, to the next problem: If the pairs are there, I would like to tell R to check each line. If the Name is X, I want it to calculate a new price, if not just print the existing price. I tried
xx <- if(Name == "X"){Price + 100*interest} else print{Price}
but it didn't work.
Thanks for help
Cheers Dani
Edit: @DWin's comment to the Q was a bit cryptic, and seeing as my first attempt at part 1 of the Q was not correct due to the unclear Q, I'll try to redeem myself with a go at expanding on DWin's comment:
[Assuming dat contains the data you quote in the Q.] First, merge dat with itself:
> foo <- merge(dat[, -4], dat, by.x = "Date", by.y = "Date")
> head(foo)
Date Name.x Price.x Name.y Price.y Interest
1 01.02.10 X 120 X 120 0.2
2 01.02.10 X 120 R 120 0.2
3 01.02.10 X 120 X 130 0.2
4 01.02.10 X 120 X 140 0.2
5 01.02.10 X 120 R 130 0.2
6 01.02.10 R 120 X 120 0.2
Next, get out the rows where Price.x == Price.y and where Name.x != Name.y:
> (foo <- foo[with(foo, which(Price.x == Price.y & Name.x != Name.y)),])
Date Name.x Price.x Name.y Price.y Interest
2 01.02.10 X 120 R 120 0.2
6 01.02.10 R 120 X 120 0.2
15 01.02.10 X 130 R 130 0.2
23 01.02.10 R 130 X 130 0.2
Then, get rid of the superfluous columns:
> (foo <- foo[, -(4:5)])
Date Name.x Price.x Interest
2 01.02.10 X 120 0.2
6 01.02.10 R 120 0.2
15 01.02.10 X 130 0.2
23 01.02.10 R 130 0.2
And finally, fix-up the column names:
> names(foo) <- names(dat)
> foo
Date Name Price Interest
2 01.02.10 X 120 0.2
6 01.02.10 R 120 0.2
15 01.02.10 X 130 0.2
23 01.02.10 R 130 0.2
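One thing to watch, if I read the merge above correctly: Interest is taken from the second copy of dat, so after filtering on Name.x != Name.y each row ends up with its partner's Interest rather than its own. As an alternative, here is a minimal sketch that filters dat directly, so every kept row retains its own values. It assumes dat holds exactly the five rows quoted in the Q; the names key, paired and pairs are made up for illustration.

```r
# Sample data as quoted in the question (assumed to be all of dat)
dat <- data.frame(
  Date     = rep("01.02.10", 5),
  Name     = c("X", "R", "X", "X", "R"),
  Price    = c(120, 120, 130, 140, 130),
  Interest = c(0.2, 0.3, 0.8, 0.4, 0.2),
  stringsAsFactors = FALSE
)

# One key per row, combining Date and Price
key <- paste(dat$Date, dat$Price)

# Keys that occur for at least one X row and at least one R row
paired <- intersect(key[dat$Name == "X"], key[dat$Name == "R"])

# Keep only the rows whose Date/Price combination has both names
pairs <- dat[key %in% paired, ]
pairs
```

On the sample data this keeps rows 1, 2, 3 and 5 and drops the X at Price = 140. Note that it keeps every row of a matched Date/Price combination, so if the real data can contain, say, two X's and one R at the same price, you would need an extra step to get strictly one-to-one pairs.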
The second thing can be done using ifelse():
with(dat, ifelse(Name == "X", Price + 100*Interest, Price))
which gives something like this:
> with(dat, ifelse(Name == "X", Price + 100*Interest, Price))
[1] 140 120 150 160 130
The reason that the if() doesn't work is that if() only takes a scalar logical (a single TRUE or FALSE), yet Name == "X" returns a logical vector:
> with(dat, Name == "X")
[1] TRUE FALSE TRUE TRUE FALSE
In these cases, ifelse() is your friend.
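To tie this back to the Q, here is a small sketch that stores the result as a new column instead of just printing it. It assumes dat holds the five rows quoted in the Q (so the numbers differ from the output shown above); the column name NewPrice and the helper vectors is_x and new_price2 are made up for illustration.

```r
# Sample data as quoted in the question (assumed to be all of dat)
dat <- data.frame(
  Date     = rep("01.02.10", 5),
  Name     = c("X", "R", "X", "X", "R"),
  Price    = c(120, 120, 130, 140, 130),
  Interest = c(0.2, 0.3, 0.8, 0.4, 0.2),
  stringsAsFactors = FALSE
)

# The comparison is vectorised: one TRUE/FALSE per row, not a single value,
# which is why it cannot be used as the condition of if()
is_x <- dat$Name == "X"
length(is_x)

# ifelse() picks element by element: new price for X rows, old price otherwise
dat$NewPrice <- ifelse(is_x, dat$Price + 100 * dat$Interest, dat$Price)

# Same result via logical indexing, without ifelse()
new_price2 <- dat$Price
new_price2[is_x] <- dat$Price[is_x] + 100 * dat$Interest[is_x]

dat
```

Both variants leave the R rows at their existing Price and only recompute the X rows. Also note that R is case sensitive, so the column has to be spelled Interest, not interest as in the attempt in the Q.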