Editing cell entries in a variable in a data frame inside a list of data frames
Define:
> dats <- list( df1 = data.frame(a=sample(1:3), b = as.factor(rep("325.049072M",3))),
+ df2 = data.frame(a=sample(1:3), b = as.factor(rep("325.049072M",3))))
> dats
$df1
a b
1 3 325.049072M
2 2 325.049072M
3 1 325.049072M
$df2
a b
1 2 325.049072M
2 1 325.049072M
3 3 325.049072M
I want to remove the M character from column b in each data frame.
For a plain character vector, this is simple:
> t<-c("325.049072M","325.049072M")
> t
[1] "325.049072M" "325.049072M"
> t <- substr(t, 1, nchar(t)-1)
> t
[1] "325.049072" "325.049072"
But how do I do the same in the nested case? Here is one failed attempt:
> dats <- list( df1 = data.frame(a=sample(1:3), b = as.factor(rep("325.049072M",3))),
+ df2 = data.frame(a=sample(1:3), b = as.factor(rep("325.049072M",3))))
> dats
$df1
a b
1 3 325.049072M
2 1 325.049072M
3 2 325.049072M
$df2
a b
1 2 325.049072M
2 3 325.049072M
3 1 325.049072M
> for(i in seq(along=dats)) {
+ dats[[i]]["b"] <-
+ substr(dats[[i]]["b"], 1, nchar(dats[[i]]["b"])-1)
+ }
> dats
$df1
a b
1 3 c(1, 1, 1
2 1 c(1, 1, 1
3 2 c(1, 1, 1
$df2
a b
1 2 c(1, 1, 1
2 3 c(1, 1, 1
3 1 c(1, 1, 1
You can do this with lapply (and some coercion):
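stripM <- function(x){
  # b is a factor, so convert it to character before using substr/nchar
  b <- as.character(x$b)
  x$b <- substr(b, 1, nchar(b) - 1)
  x
}
# lapply returns a new list; assign it back to keep the change
dats <- lapply(dats, FUN = stripM)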
If you need that variable as a factor, you can include a line in stripM that converts it back to a factor, something like x$b <- as.factor(x$b).
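If the column should stay a factor the whole time, one alternative (a different technique from the substr approach above) is to edit the factor levels directly. This is only a sketch: strip_m_levels is a made-up helper name, and it uses sub with the pattern "M$" rather than substr, so it only removes an M that is the last character of a level.

```r
# Rebuild the example data with fixed values so the result is reproducible.
dats <- list(
  df1 = data.frame(a = 1:3, b = as.factor(rep("325.049072M", 3))),
  df2 = data.frame(a = 1:3, b = as.factor(rep("325.049072M", 3)))
)

# Hypothetical helper: strip a trailing "M" from the levels of column b.
# The column is never converted to character, so it remains a factor.
strip_m_levels <- function(x) {
  levels(x$b) <- sub("M$", "", levels(x$b))
  x
}

dats <- lapply(dats, strip_m_levels)
str(dats$df1$b)
```

Because only the levels are touched, this does one replacement per distinct value rather than one per row.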
Try using gsub instead of substr, something like this:
lapply(<data.frame or list>, function(x) as.numeric(gsub("M$", "", x)))
Of course, you still need to work out how to recurse into the list elements, but I guess you get the picture.
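For the dats list from the question, the recursion can be a single lapply over the data frames. The following is a minimal sketch under that assumption; m_to_numeric and cleaned are illustrative names, not part of the original suggestion.

```r
# Example data in the same shape as the question, with fixed values.
dats <- list(
  df1 = data.frame(a = 1:3, b = as.factor(rep("325.049072M", 3))),
  df2 = data.frame(a = 1:3, b = as.factor(rep("325.049072M", 3)))
)

# Hypothetical helper: drop a trailing "M" from column b and convert
# the result to numeric. as.character() makes the factor coercion explicit.
m_to_numeric <- function(x) {
  x$b <- as.numeric(gsub("M$", "", as.character(x$b)))
  x
}

# Apply the helper to every data frame in the list.
cleaned <- lapply(dats, m_to_numeric)
str(cleaned$df1)
```

Note that this turns b into a numeric column, which may or may not be what you want if the M suffix carries meaning (for example, a unit).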
Ok, here is another possibility, not neat, but intelligible:
for(i in seq(along=dats)) {
  # named b rather than c, to avoid confusion with base::c
  b <- as.character(dats[[i]][["b"]])
  b <- substr(b, 1, nchar(b)-1)
  dats[[i]][["b"]] <- b
}
dats
I have to say that I find the whole [[ versus [ referencing very cryptic.
> str(dats[[i]][["b"]])
chr [1:3] "325.049072" "325.049072" "325.049072"
> str(dats[[i]]["b"])
'data.frame': 3 obs. of 1 variable:
$ b: chr "325.049072" "325.049072" "325.049072"
I proceed by trial and error. Any pointers to a good explanation?
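As a rough rule of thumb, [ returns a container of the same kind as the one it was applied to, while [[ extracts a single element from it. The sketch below illustrates this on a small data frame and list (df and lst are illustrative names); it also suggests why the first attempt went wrong, since substr was handed a one-column data frame rather than the column itself.

```r
df  <- data.frame(a = 1:3, b = as.factor(rep("325.049072M", 3)))
lst <- list(df1 = df)

# Single bracket keeps the container type.
one_col_df <- df["b"]      # a data frame with one column
sub_list   <- lst["df1"]   # a list with one element

# Double bracket pulls the element out of the container.
column   <- df[["b"]]      # the factor stored in column b
inner_df <- lst[["df1"]]   # the data frame stored in the list

class(one_col_df)  # "data.frame"
class(column)      # "factor"
class(sub_list)    # "list"
class(inner_df)    # "data.frame"

# $ is shorthand for [[ with a literal name.
identical(df$b, df[["b"]])  # TRUE
```

So dats[[i]] gets the i-th data frame, and dats[[i]][["b"]] (or dats[[i]]$b) gets the actual column to pass to substr.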