Trouble with rpy2 and rpart: passing data correctly from Python to R

2023-01-30 04:44 Q&A author:

I am trying to run rpart through rpy2 using Python 2.6.5 and R 10.0.

I create a data frame in Python and pass it along, but I get this error:

Error in function (x)  : binary operation on non-conformable arrays
Traceback (most recent call last):
  File "partitioningSANDBOX.py", line 86, in <module>
    model=r.rpart(**rpart_params)
  File "build/bdist.macosx-10.3-fat/egg/rpy2/robjects/functions.py", line 83, in __call__
  File "build/bdist.macosx-10.3-fat/egg/rpy2/robjects/functions.py", line 35, in __call__
rpy2.rinterface.RRuntimeError: Error in function (x)  : binary operation on non-conformable arrays

Can anyone help me determine what I am doing wrong to throw this error?

The relevant part of my code is this:

import numpy as np
import rpy2
import rpy2.robjects as rob
import rpy2.robjects.numpy2ri


#Fire up the interface to R
r = rob.r
r.library("rpart")

datadict = dict(zip(['responsev','predictorv'],[cLogEC,csplitData]))
Rdata = r['data.frame'](**datadict)
Rformula = r['as.formula']('responsev ~.')
#Generate an RPART model in R.
Rpcontrol = r['rpart.control'](minsplit=10, xval=10)
rpart_params = {'formula' : Rformula, \
       'data' : Rdata,
       'control' : Rpcontrol}
model=r.rpart(**rpart_params)

The two variables cLogEC and csplitData are NumPy arrays of float type.

Also, my data frame looks like this:

In [2]: print Rdata
------> print(Rdata)
   responsev predictorv
1  0.6020600        312
2  0.3010300        300
3  0.4771213        303
4  0.4771213        249
5  0.9242793        239
6  1.1986571        297
7  0.7075702        287
8  1.8115750        270
9  0.6020600        296
10 1.3856063        248
11 0.6127839        295
12 0.3010300        283
13 1.1931246        345
14 0.3010300        270
15 0.3010300        251
16 0.3010300        246
17 0.3010300        273
18 0.7075702        252
19 0.4771213        252
20 0.9294189        223
21 0.6127839        252
22 0.7075702        267
23 0.9294189        252
24 0.3010300        378
25 0.3010300        282

and the formula looks like this:

In [3]: print Rformula
------> print(Rformula)
responsev ~ .

The problem is related to idiosyncratic R code in rpart; to be precise, the following block, and in particular its last line:

m <- match.call(expand.dots = FALSE)
m$model <- m$method <- m$control <- NULL
m$x <- m$y <- m$parms <- m$... <- NULL
m$cost <- NULL
m$na.action <- na.action
m[[1L]] <- as.name("model.frame")
m <- eval(m, parent.frame())

One way to work around this is to avoid entering that block of code by passing a precomputed model frame as the model argument (see below); another might be to construct a nested evaluation from Python (so that parent.frame() behaves). This is not as simple as one would hope, but maybe I'll find time to make it easier in the future.

from rpy2.robjects import DataFrame, Formula
import rpy2.robjects.numpy2ri as npr
import numpy as np
from rpy2.robjects.packages import importr
rpart = importr('rpart')
stats = importr('stats')

cLogEC = np.random.uniform(size=10)
csplitData = np.array(range(10), 'i')

dataf = DataFrame({'responsev': cLogEC,
                   'predictorv': csplitData})
formula = Formula('responsev ~.')
rpart.rpart(formula=formula, data=dataf, 
            control=rpart.rpart_control(minsplit = 10, xval = 10),
            model = stats.model_frame(formula, data=dataf))

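If the workaround is needed in more than one place, it could be wrapped in a small function. The following is only a sketch under assumptions: the names as_float_columns and fit_rpart are hypothetical and not part of rpy2 or rpart, and the columns are converted to plain Python floats and FloatVector objects so that the example does not depend on automatic NumPy conversion being active. The rpy2 calls themselves are the same ones used above.

```python
def as_float_columns(**columns):
    """Convert each 1-D iterable to a list of Python floats and check equal lengths.

    Hypothetical helper: accepts lists or 1-D NumPy arrays.
    """
    converted = {name: [float(v) for v in values]
                 for name, values in columns.items()}
    lengths = {name: len(col) for name, col in converted.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError("columns differ in length: %r" % (lengths,))
    return converted


def fit_rpart(response, predictor, minsplit=10, xval=10):
    """Fit rpart with a precomputed model frame (needs R, rpart, and rpy2).

    Hypothetical wrapper around the workaround shown above.
    """
    # Imported here so the helper above stays usable without R installed.
    from rpy2.robjects import DataFrame, FloatVector, Formula
    from rpy2.robjects.packages import importr

    rpart = importr('rpart')
    stats = importr('stats')

    cols = as_float_columns(responsev=response, predictorv=predictor)
    dataf = DataFrame({name: FloatVector(col) for name, col in cols.items()})
    formula = Formula('responsev ~ .')

    # Passing model= is the workaround from above: rpart then uses the
    # supplied model frame instead of rebuilding it from the call.
    return rpart.rpart(formula=formula, data=dataf,
                       control=rpart.rpart_control(minsplit=minsplit, xval=xval),
                       model=stats.model_frame(formula, data=dataf))
```

With the arrays from the question, the call would be fit_rpart(cLogEC, csplitData). The length check is only there to catch mismatched inputs on the Python side before anything is handed to R.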