ARIMA, ARMA and AICs?
data <-c(88, 84, 85, 85, 84, 85, 83, 85, 88, 89, 91, 99, 104, 112, 126, 138, 146,151, 150, 148, 147, 149, 143, 132, 131, 139, 147, 150, 148, 145, 140, 134, 131, 131, 129, 126, 126, 132, 137, 140, 142, 150, 159, 167, 170, 171, 172, 172, 174, 175, 172, 172, 174, 174, 169, 165, 156, 142, 131, 121, 112, 104, 102, 99, 99, 95, 88, 84, 84, 87, 89, 88, 85, 86, 89, 91, 91, 94, 101, 110, 121, 135, 145, 149, 156, 165, 171, 175, 177, 182, 193, 204, 208, 210, 215, 222, 228, 226, 222, 220)
Why do the ARMA models acting on the first differences of the data differ from the corresponding ARIMA models?
for (p in 0:5)
{
for (q in 0:5)
{
#data.arma = arima(diff(data), order = c(p, 0, q));cat("p =", p, ", q =", q, "AIC =", data.arma$aic, "\n");
data.arma = arima(data, order = c(p, 1, q));cat("p =", p, ", q =", q, "AIC =", data.arma$aic, "\n");
}
}
Same with Arima(data,c(5,1,4))
and Arima(diff(data),c(5,0,4))
in the forecast package. I can get the desired consistency with
auto.arima(diff(data),max.p=5,max.q=5,d=0,approximation=FALSE, stepwise=FALSE, ic ="aic", trace=TRUE);
auto.arima(data,max.p=5,max.q=5,d=1,ap开发者_高级运维proximation=FALSE, stepwise=FALSE, ic ="aic", trace=TRUE);
but it seems the holder of the minimum AIC estimate for these data has not been considered by the algorithm behind auto.arima; hence the suboptimal choice of ARMA(3,0) instead of ARMA(5,4) acting on the first differences. A related question is how much the two AIC estimates should differ before one considers one model better than the other has little to do wuth programming - the smallest AIC holder should at least be considered/reported, even though 9 coefficients may be a bit too much for a forecast from 100 observations.
My R questions are:
1) Vectorised version of the double loop so it is faster?
2) Why does arima(5,1,4)
acting on the data differ from arma(5,4)
acting on the first differences of the data? Which one is to be reported?
3) How do I sort the AICs output so that the smaller come first?
Thanks.
There are a lot of questions and issues raised here. I'll try to respond to each of them.
Arima()
is just a wrapper for arima()
, so it will give the same model.
arima() handles a model with differencing by using a diffuse prior. That is not the same as just differencing the data before fitting the model. Consequently, you will get slightly different results from arima(x,order=c(p,1,q))
and arima(diff(x),order=c(p,0,q))
.
auto.arima()
handles differencing directly and does not use a diffuse prior when fitting. So you will get the same results from auto.arima(x,d=1,...)
and auto.arima(diff(x),d=0,...)
auto.arima()
has an argument max.order
which specifies the maximum of p+q. By default, max.order=5
, so your arima(5,1,4)
would not be considered. Increase max.order
if you want to consider such large models (although I wouldn't recommend it).
You can't vectorize a loop involving nonlinear optimization at each iteration.
If you want to sort your output, you'll need to save it to a data.frame and then sort on the relevant column. The code currently just spits out the results as it goes and nothing is saved except for the most recent model fitted.
精彩评论