Creating function arguments from a named list (with an application to stats4::mle)
I should start by saying what I'm trying to do: I want to 开发者_运维知识库use the mle function without having to re-write my log likelihood function each time I want to try a different model specification. Because mle is expecting a named list of starting values, you apparently cannot just write the log-likelihood function as taking a vector of parameters. A simple example:
Suppose I want to fit a linear regression model via maximum likelihood and at first, I'm ignoring one of my predictors:
n <- 100
df <- data.frame(x1 = runif(n), x2 = runif(n), y = runif(n))
Y <- df$y
X <- model.matrix(lm(y ~ x1, data = df))
# define log-likelihood function
ll <- function(beta0, beta1, sigma){
beta = matrix(NA, nrow=2, ncol=1)
beta[,1] = c(beta0, beta1)
-sum(log(dnorm(Y - X %*% beta, 0, sigma)))
}
library(stats4)
mle(ll, start = list(beta0=.1, beta1=.2, sigma=1)
Now, if I want to fit a different model, say:
m <- lm(y ~ x1 + x2, data = df)
I cannot re-use my log-likelihood function--I'd have to re-write it to have the beta3 parameter. What I'd like to do is something like:
ll.flex <- function(theta){
# theta is a vector that I can use directly
...
}
if I could then somehow adjust the start argument in mle to account for my now vector-input log-likelihood function, or barring that, have a function that constructs the log-likelihood function at run-time, say by constructing the named list of arguments and then using it to define the function e.g., something like this:
X <- model.matrix(lm(y ~ x1 + x2, data = df))
arguments <- rep(NA, dim(X)[2])
names(arguments) <- colnames(X)
ll.magic <- function(bring.this.to.life.as.function.arguments(arguments)){...}
Update:
I ended up writing a helper function that can add an arbitrary number of named arguments x1, x2, x3... to a passed function f.
add.arguments <- function(f,n){
# adds n arguments to a function f; returns that new function
t = paste("arg <- alist(",
paste(sapply(1:n, function(i) paste("x",i, "=",sep="")), collapse=","),
")", sep="")
formals(f) <- eval(parse(text=t))
f
}
It's ugly, but it got the job done, letting me re-factor my log-likelihood function on the fly.
You can use the mle2
function from the package bbmle
which allows you to pass vectors as parameters. Here is some sample code.
# REDEFINE LOG LIKELIHOOD
ll2 = function(params){
beta = matrix(NA, nrow = length(params) - 1, ncol = 1)
beta[,1] = params[-length(params)]
sigma = params[[length(params)]]
minusll = -sum(log(dnorm(Y - X %*% beta, 0, sigma)))
return(minusll)
}
# REGRESS Y ON X1
X <- model.matrix(lm(y ~ x1, data = df))
mle2(ll2, start = c(beta0 = 0.1, beta1 = 0.2, sigma = 1),
vecpar = TRUE, parnames = c('beta0', 'beta1', 'sigma'))
# REGRESS Y ON X1 + X2
X <- model.matrix(lm(y ~ x1 + x2, data = df))
mle2(ll2, start = c(beta0 = 0.1, beta1 = 0.2, beta2 = 0.1, sigma = 1),
vecpar = TRUE, parnames = c('beta0', 'beta1', 'beta2', 'sigma'))
This gives you
Call:
mle2(minuslogl = ll2, start = c(beta0 = 0.1, beta1 = 0.2, beta2 = 0.1,
sigma = 1), vecpar = TRUE, parnames = c("beta0", "beta1",
"beta2", "sigma"))
Coefficients:
beta0 beta1 beta2 sigma
0.5526946 -0.2374106 0.1277266 0.2861055
It might be easier to use optim
directly; that's what mle
is using anyway.
ll2 <- function(par, X, Y){
beta <- matrix(c(par[-1]), ncol=1)
-sum(log(dnorm(Y - X %*% beta, 0, par[1])))
}
getp <- function(X, sigma=1, beta=0.1) {
p <- c(sigma, rep(beta, ncol(X)))
names(p) <- c("sigma", paste("beta", 0:(ncol(X)-1), sep=""))
p
}
set.seed(5)
n <- 100
df <- data.frame(x1 = runif(n), x2 = runif(n), y = runif(n))
Y <- df$y
X1 <- model.matrix(y ~ x1, data = df)
X2 <- model.matrix(y ~ x1 + x2, data = df)
optim(getp(X1), ll2, X=X1, Y=Y)$par
optim(getp(X2), ll2, X=X2, Y=Y)$par
With the output of
> optim(getp(X1), ll2, X=X1, Y=Y)$par
sigma beta0 beta1
0.30506139 0.47607747 -0.04478441
> optim(getp(X2), ll2, X=X2, Y=Y)$par
sigma beta0 beta1 beta2
0.30114079 0.39452726 -0.06418481 0.17950760
It might not be what you're looking for, but I would do this as follows:
mle2(y ~ dnorm(mu, sigma),parameters=list(mu~x1 + x2), data = df,
start = list(mu = 1,sigma = 1))
mle2(y ~ dnorm(mu,sigma), parameters = list(mu ~ x1), data = df,
start = list(mu=1,sigma=1))
You might be able to adapt this formulation for a multinomial, although dmultinom
might not work -- you might need to write a Dmultinom()
that took a matrix of multinomial samples and returned a (log)probability.
The R code that Ramnath provided can also be applied to the optim function because it takes vectors as parameters also.
精彩评论