Predict values from sinusoidal noise

Background

Using R to predict the next values in a series.

Problem

The following code generates and plots a model for a curve with some uniform noise:

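library( mgcv )  # assumed source of gam() and s(); the post does not name the package

# Parameters of the underlying curve and the noise scale
slope = 0.55
offset = -0.5
amplitude = 0.22
frequency = 3
noise = 0.75
x <- seq( 0, 200 )
# Noise-free curve: linear trend plus a sine wave
y <- offset + (slope * x / 100) + (amplitude * sin( frequency * x / 100 ))
# Add uniform noise drawn from [0, noise]
yn <- y + (noise * runif( length( x ) ))

# Fit a smooth of x with the intercept removed (+ 0)
gam.object <- gam( yn ~ s( x ) + 0 )
plot( gam.object, col = rgb( 1.0, 0.392, 0.0 ) )
points( x, yn, col = rgb( 0.121, 0.247, 0.506 ) )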

The model reveals the trend, as expected. The trouble is predicting subsequent values:

p <- predict( gam.object, data.frame( x=201:210 ) )

The predictions do not look correct when plotted:

df <- data.frame( fit=c( fitted( gam.object ), p ) )
plot( c( x, 201:210 ), df$fit, col="blue" )
points( x, yn, col="orange" )

The predicted values (from 201 onwards) appear to be too low.

Questions

  1. Are the predicted values, as shown, actually the most accurate predictions?
  2. If not, how can the accuracy be improved?
  3. What is a better way to concatenate the two data sets (fitted.values( gam.object ) and p)?


Answer

  1. The simulated data is odd because every error added to the "true" y is non-negative: runif draws from [0, 1], not [-1, 1], so the noise has a mean of noise / 2 = 0.375 rather than 0.
  2. The problem disappears when the model is allowed an intercept term.

For example:

gam.object2 <- gam( yn ~ s( x ))
p2 <- predict( gam.object2, data.frame( x=201:210 ))
points( 1:211, c( fitted( gam.object2 ), p2), col="green")

The systematic underestimation in the model without an intercept may arise because gam places a sum-to-zero constraint on the estimated smooth functions, so without an intercept the fit cannot take on the overall level of the data. I think point 2 answers your first and second questions.
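To see what the intercept does to the level of the fit, here is a minimal sketch that fits both models to the same simulated data and compares average levels. It assumes gam() and s() come from the mgcv package; the seed and the names fit.no.int, fit.int and level are only illustrative.

```r
library(mgcv)  # assumption: gam() and s() are taken from mgcv

set.seed(1)    # illustrative seed, only so the run is repeatable
x  <- seq(0, 200)
y  <- -0.5 + (0.55 * x / 100) + (0.22 * sin(3 * x / 100))
yn <- y + (0.75 * runif(length(x)))

fit.no.int <- gam(yn ~ s(x) + 0)  # intercept removed, as in the question
fit.int    <- gam(yn ~ s(x))      # intercept kept, as in the answer

# Average level of the data and of each set of fitted values
level <- c(data         = mean(yn),
           no.intercept = mean(fitted(fit.no.int)),
           intercept    = mean(fitted(fit.int)))
print(round(level, 3))
```

With the intercept kept, the average of the fitted values should match the average of the data. If the smooth really is centred, the no-intercept average should differ from it, which would be consistent with the predictions looking too low.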

Your third question needs clarification because a gam-object is not a data.frame. The two data types do not mix.
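If the aim is simply to hold the fitted values and the predictions together for plotting, one possible approach is to put both into a single data.frame along with their x values. This is a sketch only, again assuming mgcv; the names x.new, combined and source are made up for illustration.

```r
library(mgcv)  # assumption: gam() and s() are taken from mgcv

set.seed(1)    # illustrative seed
x  <- seq(0, 200)
yn <- -0.5 + (0.55 * x / 100) + (0.22 * sin(3 * x / 100)) +
  (0.75 * runif(length(x)))
fit <- gam(yn ~ s(x))

x.new <- 201:210
# One row per x value, with a label saying where each value came from
combined <- data.frame(
  x      = c(x, x.new),
  fit    = c(fitted(fit), predict(fit, data.frame(x = x.new))),
  source = rep(c("fitted", "predicted"),
               times = c(length(x), length(x.new)))
)

plot(combined$x, combined$fit,
     col = ifelse(combined$source == "fitted", "blue", "green"))
points(x, yn, col = "orange")
```

Keeping x in the data.frame means the plot no longer depends on the row index lining up with the x values.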

A more complete example:

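library( mgcv )  # assumed source of gam() and s()

slope = 0.55
amplitude = 0.22
frequency = 3
noise = 0.75
x <- 1:200
# Noise-free curve: linear trend plus a sine wave
y <- (slope * x / 100) + (amplitude * sin( frequency * x / 100 ))
ynoise <- y + (noise * runif( length( x ) ))

# Keep the intercept, then predict over the observed range plus 10 new points
gam.object <- gam( ynoise ~ s( x ) )
p <- predict( gam.object, data.frame( x = 1:210 ) )

# Predictions (green), noisy data (blue), fitted values (orange)
plot( p, col = rgb( 0, 0.75, 0.2 ) )
points( x, ynoise, col = rgb( 0.121, 0.247, 0.506 ) )
points( fitted( gam.object ), col = rgb( 1.0, 0.392, 0.0 ) )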
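On point 1 above: if zero-mean noise is what was intended, runif accepts min and max arguments. A small sketch comparing the two choices, where the sample size, seed and variable names are arbitrary:

```r
set.seed(1)   # illustrative seed
n     <- 10000
noise <- 0.75

one.sided <- noise * runif(n)         # drawn from [0, 0.75], as in the question
centred   <- noise * runif(n, -1, 1)  # drawn from [-0.75, 0.75]

# Sample means: the one-sided noise shifts the data upwards, the centred noise does not
print(c(one.sided = mean(one.sided), centred = mean(centred)))
```

The one-sided version shifts the whole series up by about noise / 2, which is part of why the data do not sit around the "true" y.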