开发者

ANCOVA in Python with Scipy/Numpy stats

I would like to know a way of performing ANCOVA(analysis of covariance) using Python with scipy. It is basically a statistical comparison of regression lines. I know Python can do ANOVA and it can also do r开发者_运维技巧egression line fitting with Scipy.stats. I'm not sure how to put those together to get an effective ANCOVA though, if it is possible.


ANCOVA can be done with regression an using dummy variables in the design matrix for the effects that depend on the categorical variable.

A simple example is at http://groups.google.com/group/pystatsmodels/browse_thread/thread/aaa31b08f3df1a69?hl=en using the OLS class from scikits.statsmodels

Relevant part of construction of design matrix xg includes group numbers/labels, x1 is continuous explanatory variable

>>> dummy = (xg[:,None] == np.unique(xg)).astype(float)
>>> X = np.c_[x1, dummy[:,1:], np.ones(nsample)]

Estimate the model

>>> res2 = sm.OLS(y, X).fit()
>>> print res2.params
[ 1.00901524  3.08466166 -2.84716135  9.94655423]
>>> print res2.bse
[ 0.07499873  0.71217506  1.16037215  0.38826843]
>>> prstd, iv_l, iv_u = wls_prediction_std(res2)

"Test hypothesis that all groups have same intercept"

>>> R = [[0, 1, 0, 0],
...      [0, 0, 1, 0]]

>>> print res2.f_test(R)
<F test: F=array([[ 91.69986847]]), p=[[  8.90826383e-17]],
df_denom=46, df_num=2>

strongly rejected because differences in intercept are very large

Update (two and a half years later):

scikits.statsmodels has been renamed to statsmodels

and to the question:

With the latest release of statsmodels, it is more convenient to use formulas for specifying categorical effects and interaction effects. statsmodels uses patsy to handle the formulas and creates the design matrices.

More information is available at the links to the statsmodels documentation in https://stackoverflow.com/a/19495920/333700

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜