Longitudinal count data, following a Poisson distribution, can be analyzed with Generalized Linear Mixed Models (GLMM) or with GEE. I won't get into the computational or philosophical differences between conditional, subject-specific estimates associated with GLMM and marginal, population-level estimates obtained by GEE in this post. However, if you decide that GEE is right for you (I have a paper in preparation comparing GLMM and GEE), you may also want to compare multiple GEE models. Unlike GLMM, GEE does not use full likelihood estimates, but rather, relies on a quasi-likelihood function. Therefore, the popular AIC approach to model selection don't apply to GEE models. Luckily, Pan (2001) developed an equivalent QIC for model comparison. Like AIC, it balances the model fit with model complexity to pick the most parsimonious model.
Unfortunately, there is currently no QIC package in R for GEE models. geepack is a popular R package for GEE analysis. So, I wrote the short R script below to calculate Pan's QIC statistic from the output of a GEE model run in geepack using the geese function. It currently employs the Moore-Penrose Generalized Matrix Inverse through the MASS package. I left in my original code using the identity matrix but it is preceded by a pound sign so it doesn't run. [edition: April 10, 2012] The input for the QIC function needs to come from the geeglm function (as opposed to "geese") within geepack.
I hope you find it useful. I'm still fairly new to R and this is one of my first custom functions, so let me know if you have problems using it or if there are places it can be improved. If you decide to use this for analysis in a publication, please let me know just for my own curiosity (and ego boost!).
#####################################################################################
# QIC for GEE models
# Daniel J. Hocking
# 07 February 2012
# Refs:
# Pan (2001)
# Liang and Zeger (1986)
# Zeger and Liang (1986)
# Hardin and Hilbe (2003)
# Dornmann et al 2007
# # http://www.unc.edu/courses/2010spring/ecol/562/001/docs/lectures/lecture14.htm
#####################################################################################
# Poisson QIC for geese{geepack} output
# Ref: Pan (2001)
QIC.pois.geeglm <- function(model.R, model.indep) {
library(MASS)
# Fitted and observed values for quasi likelihood
mu.R <- model.R$fitted.values
# alt: X <- model.matrix(model.R)
# names(model.R$coefficients) <- NULL
# beta.R <- model.R$coefficients
# mu.R <- exp(X %*% beta.R)
y <- model.R$y
# Quasi Likelihood for Poisson
quasi.R <- sum((y*log(mu.R)) - mu.R) # poisson()$dev.resids - scale and weights = 1
# Trace Term (penalty for model complexity)
AIinverse <- ginv(model.Indep$vbeta.naiv) # Omega-hat(I) via Moore-Penrose
generalized inverse of a matrix in MASS package
# Alt: AIinverse <- solve(model.Indep$vbeta.naiv) # solve via identity
Vr <- model.R$vbeta
trace.R <- sum(diag(AIinverse %*% Vr))
px <- length(mu.R) # number non-redunant columns in design matrix
# QIC
QIC <- (-2)*quasi.R + 2*trace.R
QICu <- (-2)*quasi.R + 2*px # Approximation assuming model structured correctly
output <- c(QIC, QICu, quasi.R, trace.R, px)
names(output) <- c('QIC', 'QICu', 'Quasi Lik', 'Trace', 'px')
output}
This comment has been removed by the author.
ReplyDelete