Speeding up MCMC by Efficient Data Subsampling
Field | Value | Language |
dc.contributor.author | Quiroz, Matias | |
dc.contributor.author | Villani, Mattias | |
dc.contributor.author | Kohn, Robert | |
dc.contributor.author | Tran, Minh-Ngoc | |
dc.date.accessioned | 2017-01-19 | |
dc.date.available | 2017-01-19 | |
dc.date.issued | 2016-01-01 | |
dc.identifier.uri | http://hdl.handle.net/2123/16205 | |
dc.description.abstract | We propose Subsampling MCMC, a Markov Chain Monte Carlo (MCMC) framework where the likelihood function for n observations is estimated from a random subset of m observations. We introduce a general and highly efficient unbiased estimator of the log-likelihood based on control variates obtained from clustering the data. The cost of computing the log-likelihood estimator is much smaller than that of the full log-likelihood used by standard MCMC. The likelihood estimate is bias-corrected and used in two correlated pseudo-marginal algorithms to sample from a perturbed posterior, for which we derive the asymptotic error with respect to n and m, respectively. A practical estimator of the error is proposed and we show that the error is negligible even for a very small m in our applications. We demonstrate that Subsampling MCMC is substantially more efficient than standard MCMC in terms of sampling efficiency for a given computational budget, and that it outperforms other subsampling methods for MCMC proposed in the literature. | en_AU |
dc.relation.ispartofseries | BAWP-2016-07 | en_AU |
dc.subject | Bayesian inference | en_AU |
dc.subject | Correlated pseudo-marginal | en_AU |
dc.subject | Estimated likelihood | en_AU |
dc.subject | Block pseudo-marginal | en_AU |
dc.subject | Big Data | en_AU |
dc.subject | Survey sampling | en_AU |
dc.title | Speeding up MCMC by Efficient Data Subsampling | en_AU |
dc.type.pubtype | Pre-print | en_AU |
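The abstract's central idea is to estimate the log-likelihood of n observations from a random subset of m observations, using control variates obtained by clustering the data. The Python sketch below illustrates that idea only; it is not the authors' implementation. The Gaussian model, the equal-width binning used as a stand-in for clustering, the constant control variate evaluated at each cluster centroid, and all function names are assumptions made for illustration.

```python
import numpy as np

def loglik_terms(y, theta):
    # Per-observation log-density of N(theta, 1); an assumed toy model.
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (y - theta) ** 2

def build_clusters(y, n_bins):
    # Crude 1-D "clustering": equal-width bins (assumption, done once up front).
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    labels = np.clip(np.digitize(y, edges[1:-1]), 0, n_bins - 1)
    counts = np.bincount(labels, minlength=n_bins)
    sums = np.bincount(labels, weights=y, minlength=n_bins)
    # Centroid of each bin; empty bins get 0 and carry zero weight below.
    centroids = np.divide(sums, counts, out=np.zeros(n_bins), where=counts > 0)
    return labels, counts, centroids

def estimate_loglik(y, theta, labels, counts, centroids, m, rng):
    # Difference estimator: sum of control variates plus a scaled
    # subsample mean of the residuals (true term minus control variate).
    n = y.size
    q_centroid = loglik_terms(centroids, theta)   # one evaluation per cluster
    q_sum = np.sum(counts * q_centroid)           # exact sum of all control variates
    idx = rng.integers(0, n, size=m)              # uniform sampling with replacement
    diff = loglik_terms(y[idx], theta) - q_centroid[labels[idx]]
    return q_sum + n * diff.mean()                # unbiased for the full log-likelihood

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.normal(size=10_000)
    labels, counts, centroids = build_clusters(y, n_bins=50)
    est = estimate_loglik(y, 0.3, labels, counts, centroids, m=100, rng=rng)
    full = loglik_terms(y, 0.3).sum()
    print("estimate:", est, "full log-likelihood:", full)
```

Each call evaluates the model at the cluster centroids and at m sampled observations rather than at all n. The estimator is unbiased for the log-likelihood, not for the likelihood itself, which is why the abstract refers to a bias-corrected likelihood estimate; that correction and the pseudo-marginal sampler are not shown here.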