'x' must be a numeric vector: Error from data.frame of numbers

I am running a cor.test on two columns within a file/table.

tmp <- read.table(files_to_test[i], header=TRUE, sep="\t")
## Obtain Columns To Compare ##
colA <-tmp[compareA]
colB <-tmp[compareB]
# sctr = 'spearman cor.test result'
sctr <- cor.test(colA, colB, alternative="two.sided", method="spearman")

But I am getting this confounding error...

Error in cor.test.default(colA, colB, alternative = "two.sided", method = "spearman") : 
'x' must be a numeric vector

The values in the columns ARE numbers, but:

is.numeric(colA) = FALSE 
class (colA) = data.frame

What have I missed?


Put a comma before your selector. When you index a data.frame with a single index and no comma, it is treated as a list of columns, so the result is a sub-list of the same type: still a data.frame, just with one column. But data.frame objects also support matrix-style indexing, and selecting a single column that way returns a plain vector. So just change

colA <-tmp[compareA]
colB <-tmp[compareB]

to

colA <-tmp[,compareA]
colB <-tmp[,compareB]

I think this is more in keeping with the spirit of the data.frame type than double-bracket ([[) selectors, which do something similar but in the spirit of the underlying list type. Double brackets also look nothing like the matrix-style selectors used for individual items and rows. So in code that does several kinds of things with the data.frame, the double-bracket selectors stand out as a bit of an odd duck.
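To make the difference between the selectors concrete, here is a minimal sketch. The data frame contents and the column names "a" and "b" are made up for illustration; they stand in for whatever read.table returns and for the asker's compareA and compareB.

```r
# Hypothetical stand-in for the table read from the file
tmp <- data.frame(a = c(1.2, 2.5, 3.1, 4.8, 5.0),
                  b = c(2.0, 1.5, 3.9, 4.1, 6.3))
compareA <- "a"
compareB <- "b"

as_df   <- tmp[compareA]     # list-style index: one-column data.frame
as_vec  <- tmp[, compareA]   # matrix-style index: plain numeric vector
as_elem <- tmp[[compareA]]   # list-element index: plain numeric vector

class(as_df)                 # "data.frame"
is.numeric(as_vec)           # TRUE
identical(as_vec, as_elem)   # TRUE

# With vectors, cor.test accepts the input
sctr <- cor.test(tmp[, compareA], tmp[, compareB],
                 alternative = "two.sided", method = "spearman")
sctr$estimate
```

Note that the matrix-style form only drops to a vector when a single column is selected and drop is left at its default; tmp[, compareA, drop = FALSE] gives back a one-column data.frame again.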


Try tmp[[compareA]] and tmp[[compareB]] instead of single brackets. You wanted to extract numeric vectors, what you did instead was to extract single-column data frames. Compare the following:

> z <- data.frame(a=1:5,b=1:5)
> str(z["a"])
'data.frame':   5 obs. of  1 variable:
 $ a: int  1 2 3 4 5
> is.numeric(z["a"])
[1] FALSE
> str(z[["a"]])
 int [1:5] 1 2 3 4 5
> is.numeric(z[["a"]])
[1] TRUE

Try these out with cor.test:

Single brackets: error as above.

> cor.test(z["a"],z["b"])
Error in cor.test.default(z["a"], z["b"]) : 'x' must be a numeric vector

Double brackets: works.

> cor.test(z[["a"]],z[["b"]])

    Pearson's product-moment correlation

data:  z[["a"]] and z[["b"]] 
[snip snip snip]

As @Aaron points out below, cor will handle single-column data frames fine, by converting them to matrices, but cor.test doesn't. (This could be brought up on r-devel@r-project.org, or submitted to the R bug tracker as a wish-list item.)
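The contrast between cor and cor.test can be checked with the same toy data frame z. This is only a sketch; the tryCatch wrapper is added here so the script keeps running after the expected error.

```r
z <- data.frame(a = 1:5, b = 1:5)

# cor() converts data frames to matrices, so this returns a 1 x 1 matrix
r <- cor(z["a"], z["b"])
r

# cor.test() insists on numeric vectors; capture the error instead of stopping
res <- tryCatch(cor.test(z["a"], z["b"]), error = function(e) e)
if (inherits(res, "error")) message("cor.test failed: ", conditionMessage(res))

# Extracting the columns as vectors first avoids the problem
ok <- cor.test(z[["a"]], z[["b"]])
class(ok)
```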

See also: Numeric Column in data.frame returning "num" with str() but not is.numeric() , What's the biggest R-gotcha you've run across? (maybe others)
