R problems with filling a vector using a for-loop
I'm iterating over a vector; for each element I look something up in a table by row name and copy the result into a different vector. This is the code I use:
gs1 = function(p)
{
output <- character() #empty vector to which results will be forwarded
for (i in 1:length(p)) {
test <- p[i]
index <- which(rownames(conditions) == test)
toappend <- conditions[index,3] #working
output[i] <- toappend
print(paste(p[i],index,toappend,output[i]))
}
return(output)
}
All it spits out is a vector of numbers, while all the other variables seem to contain the correct information (as checked by the print function). I have the feeling I'm doing something terribly wrong in filling the output vector. I could also use
output <- c(output,toappend)
But that gives me exactly the same, wrong and strange output.
All help is very much appreciated!
Output example
> gs1 = function(p)
+ {
+ output <- character() #empty vector to which results will be pasted
+
+ for (i in 1:length(p)) {
+ test <- p[i]
+ index <- which(rownames(conditions) == test)
+ toappend <- conditions[index,3] #working
+
+ output <- c(output,toappend)
+ output[i] <- toappend
+ print(paste(p[i],index,toappend,output[i],sep=","))
+ }
+ return(output)
+ }
> ###########################
> test <- colnames(tri.data.1)
> gs1(test)
[1] "Row.names,,,NA"
[1] "GSM235482,1,Glc A,5"
[1] "GSM235484,2,Glc A,5"
[1] "GSM235485,3,Glc A,5"
[1] "GSM235487,4,Xyl A,21"
[1] "GSM235489,5,Xyl A,21"
[1] "GSM235491,6,Xyl A,21"
[1] "GSM297399,7,pH 2.5,12"
[1] "GSM297400,8,pH 2.5,12"
[1] "GSM297401,9,pH 2.5,12"
[1] "GSM297402,10,pH 4.5,13"
[1] "GSM297403,11,pH 4.5,13"
[1] "GSM297404,12,pH 4.5,13"
[1] "GSM297563,13,pH 6.0,14"
[1] "GSM297564,14,pH 6.0,14"
[1] "GSM297565,15,pH 6.0,14"
[1] "5" "5" "5" "5" "21" "21" "21" "12" "12" "12" "13" "13" "13" "14" "14" "14"
Very likely you're using a data frame and not a table, and just as likely your third column is a factor rather than a character vector. There is also no need to write that function; you can get what you want with:
conditions[X,3]
with X being a character vector of row names. For example, with a data frame (also called X here):
X <- data.frame(
var1 = 1:10,
var2 = 10:1,
var3 = letters[1:10],
row.names=LETTERS[1:10],
stringsAsFactors = TRUE # needed in R >= 4.0 to reproduce the factor output below
)
> test <- c("F","D","A")
> X[test,3]
[1] f d a
Levels: a b c d e f g h i j
To get it in characters:
> as.character(X[test,3])
[1] "f" "d" "a"
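To make the diagnosis concrete, here is a minimal sketch of checking the column type and converting it once, before any lookup. The `conditions` data frame below is a hypothetical stand-in for the asker's table (only the sample names and condition labels are taken from the question), and `stringsAsFactors = TRUE` is set explicitly on the assumption that the original data were read with the old default.

```r
# Hypothetical stand-in for the asker's `conditions` table
conditions <- data.frame(
  var1 = 1:3,
  var2 = 3:1,
  var3 = c("Glc A", "Xyl A", "pH 2.5"),
  row.names = c("GSM235482", "GSM235487", "GSM297399"),
  stringsAsFactors = TRUE   # assumption: mimics the pre-4.0 default
)

# Check whether the third column is a factor
was_factor <- is.factor(conditions[, 3])

# Convert the column to character once, up front
conditions[, 3] <- as.character(conditions[, 3])

# Look up several row names in one step, no loop required
p <- c("GSM297399", "GSM235482")
res <- conditions[p, 3]
print(res)
```

Converting the column once keeps every later lookup free of factor surprises, at the cost of losing the level information if it is needed elsewhere.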
[Joris' comments suggest I was too cryptic, so some additional explanation]:
Effectively, if we ignore the processing in your loop, this is what you have:
> p <- 1:10
> gs1 <- function(p) {
+ output <- character()
+ for(i in seq_along(p)) {
+ output[i] <- p[i] * 10
+ print(output)
+ }
+ return(output)
+ }
> foo <- gs1(p)
[1] "10"
[1] "10" "20"
[1] "10" "20" "30"
[1] "10" "20" "30" "40"
[1] "10" "20" "30" "40" "50"
[1] "10" "20" "30" "40" "50" "60"
[1] "10" "20" "30" "40" "50" "60" "70"
[1] "10" "20" "30" "40" "50" "60" "70" "80"
[1] "10" "20" "30" "40" "50" "60" "70" "80" "90"
[1] "10" "20" "30" "40" "50" "60" "70" "80" "90" "100"
> foo
[1] "10" "20" "30" "40" "50" "60" "70" "80" "90" "100"
So gs1 is returning something, and output is being filled, as long as toappend is a character or can be coerced to character to go into output. Now, if toappend is not what you think it is, then that is where you will start to get problems.
I see two potential problems: i) toappend is actually a factor (which is something Joris mentions too) and you are getting the integer code R uses internally for that level, in which case
output[i] <- as.character(toappend)
should suffice; or ii) index has length greater than 1, so you are getting more elements in the vector than you expect and overwriting them at the next iteration.
Are you sure toappend is a character vector of length 1? How about you show us the incorrect output (edit your Question and add the output from the function) and tell us why it is wrong!
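The first of those two problems can be reproduced in a few lines. This is only a sketch with made-up values: the factor `toappend` and its levels are hypothetical, chosen so that the label and the internal code are easy to tell apart.

```r
# A one-element factor; levels given explicitly so the codes are predictable
toappend <- factor("Xyl A", levels = c("Glc A", "Xyl A"))

output <- character(1)

# Assigning the factor directly stores its integer code as a string
output[1] <- toappend
bad <- output
print(bad)    # "2", the position of "Xyl A" among the levels

# Converting first stores the label itself
output[1] <- as.character(toappend)
good <- output
print(good)   # "Xyl A"
```

If the numbers in the returned vector match the level positions of the expected labels, the factor explanation is the likely one.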
Of course, this can all be simplified to conditions[p, 3] with no need for a loop, but I assume your actual function is more complex?
Note on setting up loops
As for loops in general, you make the mistake of not preallocating storage. Notice how R has to grow output by one element at each iteration. The same would be true of your output <- c(output, toappend) idiom. This involves lots of redundant copying of the vector, which bogs loops down. Instead, allocate enough storage up front and fill output as you are doing. E.g.:
gs2 <- function(p) {
output <- character(length = length(p))
for(i in seq_along(p)) {
output[i] <- p[i] * 10
print(output)
}
return(output)
}
which produces this output:
> gs2(p)
[1] "10" "" "" "" "" "" "" "" "" ""
[1] "10" "20" "" "" "" "" "" "" "" ""
[1] "10" "20" "30" "" "" "" "" "" "" ""
[1] "10" "20" "30" "40" "" "" "" "" "" ""
[1] "10" "20" "30" "40" "50" "" "" "" "" ""
[1] "10" "20" "30" "40" "50" "60" "" "" "" ""
[1] "10" "20" "30" "40" "50" "60" "70" "" "" ""
[1] "10" "20" "30" "40" "50" "60" "70" "80" "" ""
[1] "10" "20" "30" "40" "50" "60" "70" "80" "90" ""
[1] "10" "20" "30" "40" "50" "60" "70" "80" "90" "100"
[1] "10" "20" "30" "40" "50" "60" "70" "80" "90" "100"
The duplicated last line is due to auto-printing of the object (output) returned from the function.
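Applied to the lookup from the question, a preallocated loop might look like the sketch below. The function name `gs3`, the `tbl` argument and the small `conditions` table are hypothetical; `match()` is used instead of `which()` so that a name with no matching row (such as "Row.names" in the question's output) leaves an NA rather than a zero-length result.

```r
# Hypothetical lookup table standing in for the asker's `conditions`
conditions <- data.frame(
  id    = 1:4,
  batch = c(1, 1, 2, 2),
  cond  = c("Glc A", "Glc A", "Xyl A", "pH 2.5"),
  row.names = c("GSM1", "GSM2", "GSM3", "GSM4"),
  stringsAsFactors = TRUE   # assumption: third column is a factor
)

gs3 <- function(p, tbl) {
  # Preallocate the full result; NA marks names with no matching row
  output <- rep(NA_character_, length(p))
  for (i in seq_along(p)) {
    index <- match(p[i], rownames(tbl))   # first matching row, or NA
    if (!is.na(index)) {
      # Convert the factor element to its label before storing it
      output[i] <- as.character(tbl[index, 3])
    }
  }
  output
}

res <- gs3(c("Row.names", "GSM1", "GSM3"), conditions)
print(res)
```

For a plain lookup like this the vectorised conditions[p, 3] form is still simpler; the loop only earns its place if each iteration does more work than shown here.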