Variational Inference in Generalised Hyperbolic and von Mises-Fisher Distributions
Access status: USyd Access
Type: Thesis
Thesis type: Doctor of Philosophy
Author/s: Abeywardana, Sachinthaka
Abstract:
Most real-world data are skewed, are not confined to the set of real numbers, and assign higher probabilities to extreme events than a normal distribution does. In this thesis we explore two non-Gaussian distributions: the Generalised Hyperbolic Distribution (GHD) and the von Mises-Fisher (vMF) distribution. These distributions are studied in the context of 1) regression on heavy-tailed data, 2) quantifying the variance of functions with reference to finding relevant quantiles, and 3) clustering data that lie on the surface of the sphere.

Firstly, we extend Gaussian Processes (GPs) and instead use the Generalised Hyperbolic Process as a prior on functions. This prior is more flexible than a GP and is particularly able to model data with high kurtosis. The method is based on placing a Generalised Inverse Gaussian prior over the signal variance, which yields a scalar mixture of GPs. We show how to perform inference efficiently for the predictive mean and variance, and use a variational EM method for learning.

Secondly, the skewed extension of the GHD is studied with respect to quantile regression. An underlying GP prior on the quantile function makes the inference non-parametric, while the skewed GHD is used as the data likelihood. The skewed GHD has a single parameter, alpha, which specifies the required quantile. Variational methods similar to those of the first contribution are used to perform inference.

Finally, vMF distributions are introduced in order to cluster spherical data. In the two previous contributions, continuous scalar mixtures of Gaussians were used to simplify inference; for clustering, however, a discrete number of vMF distributions is typically used. We propose a Dirichlet Process (DP) to infer the number of clusters in the spherical data setting. The framework is extended to incorporate a nested and a temporal clustering architecture.

Throughout this thesis, the posterior often cannot be calculated in closed form. Variational Bayesian approximations are derived in these situations for efficient inference. In certain cases, further lower bounding of the objective function is required in order to perform Variational Bayes; these bounds are themselves novel.
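As a rough, hypothetical illustration of the mixture construction described in the first contribution (not code from the thesis), the sketch below draws one function from a GP whose signal variance is first scaled by a draw from a Generalised Inverse Gaussian prior; the squared-exponential kernel, GIG shape parameters, and input grid are illustrative assumptions. Marginalising over the GIG scale is what gives the heavier-tailed behaviour the abstract refers to.

    # Illustrative sketch only: a GP whose signal variance is scaled by a
    # Generalised Inverse Gaussian (GIG) draw, giving a mixture of GPs.
    # Kernel choice and GIG parameters are assumptions, not the thesis's values.
    import numpy as np
    from scipy.stats import geninvgauss

    def rbf_kernel(x, lengthscale=1.0, signal_var=1.0):
        """Squared-exponential covariance matrix for 1-D inputs."""
        diff = x[:, None] - x[None, :]
        return signal_var * np.exp(-0.5 * (diff / lengthscale) ** 2)

    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 5.0, 100)

    # Draw a latent scale for the signal variance from a GIG prior.
    scale = geninvgauss(p=1.0, b=2.0).rvs(random_state=rng)

    # Conditioned on the scale, the function is an ordinary GP draw; marginally
    # over the GIG scale the prior has heavier tails than a plain GP.
    K = rbf_kernel(x, signal_var=scale) + 1e-8 * np.eye(len(x))
    f = rng.multivariate_normal(np.zeros(len(x)), K)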
Date: 2015-12-17
Licence: The author retains copyright of this thesis. It may only be used for the purposes of research and study. It must not be used for any other purposes and may not be transmitted or shared with others without prior permission.
Faculty/School: Faculty of Engineering and Information Technologies, School of Information Technologies
Awarding institution: The University of Sydney