Numerically Stable Approximate Bayesian Methods for Generalized Linear Mixed Models and Linear Model Selection
Access status: Open Access
Type: Thesis
Thesis type: Doctor of Philosophy
Author/s: Greenaway, Mark Jonathan
Abstract
Approximate Bayesian inference methods offer fast alternatives to Markov chain Monte Carlo methods for fitting Bayesian models, sometimes with only a slight loss of accuracy. In this thesis, we consider variable selection for linear models, and zero-inflated mixed models.

Variable selection for linear regression models is ubiquitous in applied statistics. We use the popular g-prior (Zellner, 1986) for model selection of linear models with normal priors, where g is a prior hyperparameter. We derive exact expressions for the model selection Bayes factors in terms of special functions depending on the sample size, the number of covariates, and the R-squared of the model. We show that these expressions are accurate, fast to evaluate, and numerically stable. An R package, blma, for Bayesian linear model averaging using these exact expressions has been released on GitHub. We extend the Particle EM method of Rockova (2017), using Particle Variational Approximation and the exact posterior marginal likelihood expressions, to derive a computationally efficient algorithm for model selection on data sets with many covariates. Our algorithm performs well relative to existing algorithms, completing in eight seconds on a model selection problem with a sample size of 600 and 7,200 covariates.

Zero-inflated models have many applications in areas such as manufacturing and public health, but pose numerical issues when fitted to data. We apply a variational approximation to zero-inflated Poisson mixed models with Gaussian-distributed random effects, using a combination of variational Bayes (VB) and the Gaussian Variational Approximation (GVA). We also incorporate a novel parameterisation of the covariance of the GVA using the Cholesky factor of the precision matrix, similar to that of Tan and Nott (2018), to resolve the associated numerical difficulties.
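As a minimal illustration of the kind of quantity involved (not the thesis's exact special-function expressions, which integrate g against a hyperprior), the classical fixed-g Bayes factor under Zellner's g-prior against the null model depends on the data only through the sample size n, the number of covariates p, and R-squared, and is naturally evaluated on the log scale for numerical stability:

```python
import math

def log_bf_g_prior(n, p, r2, g):
    """Log Bayes factor of a linear model with p covariates and
    coefficient of determination r2 against the null model, under
    Zellner's g-prior with a fixed hyperparameter g (see Liang et
    al., 2008).  Working on the log scale with log1p keeps the
    computation stable for large n, where the raw Bayes factor
    (1+g)^((n-1-p)/2) / (1 + g(1-r2))^((n-1)/2) would overflow."""
    return (0.5 * (n - 1 - p) * math.log1p(g)
            - 0.5 * (n - 1) * math.log1p(g * (1.0 - r2)))
```

A model whose covariates explain nothing (R-squared = 0) is penalised relative to the null, while a high R-squared model is favoured; the exact expressions derived in the thesis replace the fixed g with an integral over a prior on g, yielding special-function forms in the same three quantities.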
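A hedged sketch of the covariance parameterisation described in the abstract, assuming (as in Tan and Nott, 2018) that the GVA covariance is represented as the inverse of L L-transpose, with L the lower-triangular Cholesky factor of the precision matrix; the function names here are illustrative only:

```python
import math
import random

def gva_sample(mu, L, z=None):
    """Draw theta ~ N(mu, (L L^T)^{-1}), where L is the lower-triangular
    Cholesky factor of the precision matrix.  Uses the reparameterisation
    theta = mu + L^{-T} z with z ~ N(0, I), solved by back-substitution,
    so the covariance matrix is never formed or inverted explicitly."""
    d = len(mu)
    if z is None:
        z = [random.gauss(0.0, 1.0) for _ in range(d)]
    x = [0.0] * d
    for i in range(d - 1, -1, -1):  # solve the upper-triangular system L^T x = z
        s = z[i] - sum(L[j][i] * x[j] for j in range(i + 1, d))
        x[i] = s / L[i][i]
    return [m + xi for m, xi in zip(mu, x)]

def gva_log_det_cov(L):
    """log |Sigma| = -2 * sum_i log L_ii: the log-determinant term of the
    variational lower bound, computed stably from the diagonal of L."""
    return -2.0 * sum(math.log(L[i][i]) for i in range(len(L)))
```

Parameterising the precision rather than the covariance keeps sampling and the log-determinant cheap and numerically stable, which is the motivation the abstract cites for adopting this form.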
Date: 2019-04-02
Licence: The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School: Faculty of Science, School of Mathematics and Statistics
Awarding institution: The University of Sydney