How to add a column to a data frame within a function
I have a data frame, and I want to do some calculations with existing columns and create a new column in my data set that is a combination of the existing ones. I can do this easily outside a function, but if I wrap the code within a function, the changes I make inside the function are not visible outside it, i.e. the new column doesn't exist.
I would appreciate sample code to do this...
I'll assume this is about R. R does not pass arguments by reference (environments and reference classes (R5) are exceptions, but they are beyond the scope of this answer). Thus, when you write
addThree<-function(x){
x<-x+3
}
4->y
addThree(y)
y is still 4 at the end of the code, because inside the function x is a fresh copy of y's value, not y itself (again, not exactly, but those are higher-order details).
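The same thing happens with a data frame like the one in the question. Here is a minimal sketch; the data frame df, its columns a and b, and the function addSum are made-up names used only for illustration:

```r
# Hypothetical data and names, not taken from the question
df <- data.frame(a = 1:3, b = 4:6)

addSum <- function(d) {
  d$total <- d$a + d$b   # changes only the function's local copy d
}

addSum(df)
names(df)   # the column "total" was never added to df itself
```

The assignment inside addSum touches only the local d, so df keeps its original two columns.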
Thus, you must adapt to R's pass-by-copy scheme and return the altered value and assign it back to your variable (using old wording, there are no procedures in R):
addThree<-function(x){
return(x+3)
}
4->y
addThree(y)->y
#y is now 7
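Don't worry, this works smoothly even for more complex objects such as data frames: R is garbage-collected and copies objects lazily (copy-on-modify), so passing an object to a function does not duplicate it until it is actually changed.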
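For a data frame, the return-and-assign pattern could look like the sketch below. The names dat, a, b, total and addTotal are assumptions made for the example and do not come from the question:

```r
# Hypothetical data and names, used only for illustration
dat <- data.frame(a = 1:3, b = 4:6)

addTotal <- function(d) {
  d$total <- d$a + d$b   # new column computed from existing columns
  return(d)              # hand the modified copy back to the caller
}

dat <- addTotal(dat)     # assign the result back to the original name
names(dat)
```

Without the final assignment, the modified copy would be printed (or discarded) and dat would stay unchanged.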
BTW, you can omit return if you want to return the last value produced in the function, i.e. addThree's definition may look like this:
addThree<-function(x) x+3
A convenient approach is to use mutate() from the dplyr package. Example:
library(dplyr)  # provides mutate()
addcol = function(dat){
  # add x2 as twice x1 and return the new data frame
  dat1 = mutate(dat, x2=x1*2)
  return(dat1)
}
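Here dat is a data frame with a column named "x1". Calling addcol(dat) returns a new data frame with an extra column named "x2" that is twice the value of "x1", assuming x1 is numeric. The original dat is not changed, so assign the result back (dat = addcol(dat)) if you want to keep the new column.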
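A short usage sketch of addcol() follows. The example values in dat and the name dat2 are made up; the base R transform() branch is only a fallback I added so the sketch also runs where dplyr is not installed, and it is not part of the answer above:

```r
# Hypothetical example data; only the column name x1 comes from the answer
dat <- data.frame(x1 = c(1, 2, 3))

addcol <- function(dat) {
  if (requireNamespace("dplyr", quietly = TRUE)) {
    dplyr::mutate(dat, x2 = x1 * 2)
  } else {
    # Base R fallback with the same effect (an assumption of this sketch)
    transform(dat, x2 = x1 * 2)
  }
}

dat2 <- addcol(dat)   # result stored under a new name
names(dat)            # dat itself is unchanged at this point
names(dat2)           # dat2 carries the additional column x2

dat <- addcol(dat)    # assign back to update dat itself
```

Storing the result under a new name first makes it easy to see that the function does not alter its argument.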