Fitting complex statistical models that consist of latent or nuisance variables, in addition to the parameters to be estimated, typically involves overcoming an intractable integral. For instance, calculating the likelihood of such models requires marginalising over the latent variables, and this may prove computationally difficult, either because of dimensionality or model design. Variational inference, also known as variational Bayes, offers an efficient alternative to Markov chain Monte Carlo methods, the Laplace approximation, and quadrature methods. Rooted in Bayesian inference and popularised in machine learning, its main idea is to overcome these difficulties by working with “easy” density functions in lieu of the true posterior distribution. The approximating density function is chosen so as to minimise the (reverse) Kullback-Leibler divergence between it and the true posterior. The topics discussed are mean-field distributions, the coordinate ascent algorithm, and approximation properties, followed by an example. The hope is that the audience will gain a basic understanding of the method, which may spur further research and applications in their respective work.
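As a rough sketch of the idea in standard textbook notation (the symbols $y$ for the observed data, $z = (z_1, \dots, z_m)$ for the latent variables, and $q$ for the approximating density are assumptions introduced here, not notation from the abstract), the log marginal likelihood can be decomposed as follows, together with the mean-field factorisation and the resulting coordinate ascent update:

```latex
\begin{align*}
\log p(y)
  &= \underbrace{\mathbb{E}_{q}\!\left[\log \frac{p(y, z)}{q(z)}\right]}_{\mathcal{L}(q)\ \text{(evidence lower bound)}}
   + \underbrace{\mathbb{E}_{q}\!\left[\log \frac{q(z)}{p(z \mid y)}\right]}_{\mathrm{KL}\left(q \,\|\, p(\cdot \mid y)\right)\ \ge\ 0}, \\[4pt]
q(z) &= \prod_{j=1}^{m} q_j(z_j)
  \qquad \text{(mean-field assumption)}, \\[4pt]
q_j^{*}(z_j) &\propto \exp\!\left( \mathbb{E}_{q_{-j}}\!\left[ \log p(y, z) \right] \right)
  \qquad \text{(coordinate ascent update for factor } j\text{)}.
\end{align*}
```

Because $\log p(y)$ does not depend on $q$, minimising the reverse Kullback-Leibler divergence is equivalent to maximising $\mathcal{L}(q)$, and cycling through the factor updates never decreases $\mathcal{L}(q)$.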

In a regression setting, we define an I-prior as a Gaussian process prior on the regression function with covariance kernel equal to its Fisher information. We present some methodology and computational work on estimating regression functions by working in the appropriate reproducing kernel Hilbert space of functions and assuming an I-prior on the function of interest. In a regression model with normally distributed errors, estimation is simple: maximum likelihood and the EM algorithm are employed. In classification models (categorical response models), estimation is performed using variational inference. I-prior models perform comparably to, and often better than, similar leading state-of-the-art models for prediction and inference. Applications are plentiful, including smoothing models, modelling multilevel data, longitudinal data, functional covariates, multi-class classification, and even spatiotemporal modelling.

Estimation of complex models that consist of latent variables and various parameters, in addition to the observed data, might involve overcoming an intractable integral. For instance, calculating the likelihood of such models requires marginalising over the latent variables, and this may prove computationally difficult, either because of model design or dimensionality. Variational inference, also known as variational Bayes, offers an efficient alternative to Markov chain Monte Carlo methods, the Laplace approximation, and quadrature methods. Rooted in Bayesian inference and popularised in machine learning, its main idea is to overcome these difficulties by working with “easy” density functions in lieu of the true posterior distribution. The approximating density function is chosen so as to minimise the (reverse) Kullback-Leibler divergence between it and the true posterior. The topics discussed are mean-field distributions, the coordinate ascent algorithm and its properties, followed by examples. The hope is that the audience will gain a basic understanding of the method, which may spur further research and applications in their respective work.

This is an overview of a unified methodology for fitting parametric and nonparametric regression models, including additive models, multilevel models, and models with one or more functional covariates. We also discuss an associated R package called iprior. An I-prior is an objective prior for the regression function, and is based on its Fisher information. The regression function is estimated by its posterior mean under the I-prior, and scale parameters are estimated via maximum marginal likelihood using an Expectation-Maximization (EM) algorithm. Regression modelling using I-priors has several attractive features: it requires no assumptions other than those pertaining to the model of interest; estimation and inference are relatively straightforward; and small and large sample performance can be better than Tikhonov regularization. We illustrate the use of the iprior package by analysing three well-known data sets: a multilevel data set, a longitudinal data set, and a data set involving a functional covariate.
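To make the posterior-mean estimate concrete, here is a hedged sketch of the normal I-prior regression model in finite-dimensional form. The notation is an assumption introduced for illustration and does not come from the abstract: $H_\lambda$ is the $n \times n$ kernel matrix with scale parameters $\lambda$, $\Psi$ is the error precision matrix, $w$ is a vector of Gaussian weights, and the prior mean of $f$ is taken to be zero. The posterior then follows from standard Gaussian conjugacy:

```latex
\begin{align*}
y_i &= \alpha + f(x_i) + \varepsilon_i,
  & (\varepsilon_1, \dots, \varepsilon_n)^\top &\sim \mathrm{N}_n(0, \Psi^{-1}), \\
f(x_i) &= \sum_{k=1}^{n} h_\lambda(x_i, x_k)\, w_k,
  & (w_1, \dots, w_n)^\top &\sim \mathrm{N}_n(0, \Psi), \\[4pt]
y &\sim \mathrm{N}_n\!\left(\alpha \mathbf{1}_n,\; V_y\right),
  & V_y &= H_\lambda \Psi H_\lambda + \Psi^{-1}, \\[4pt]
w \mid y &\sim \mathrm{N}_n\!\left(\tilde{w},\; V_y^{-1}\right),
  & \tilde{w} &= \Psi H_\lambda V_y^{-1} (y - \alpha \mathbf{1}_n).
\end{align*}
```

Under these assumptions the fitted regression function at the observed points is $H_\lambda \tilde{w}$, and the marginal distribution of $y$ in the third line is the likelihood that would be maximised over $\lambda$ and $\Psi$.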

An extension of the I-prior methodology to binary response data is explored. Starting from a latent variable approach, it is assumed that there exist continuous, auxiliary random variables which decide the outcome of the binary responses. Fitting a classical linear regression model on these latent variables while assuming normality of the error terms leads to the well-known generalised linear model with a probit link. A more general regression approach is considered instead, in which an I-prior is assumed on the regression function, which lies in some reproducing kernel Hilbert space. An I-prior distribution is Gaussian with mean chosen a priori, and covariance equal to the Fisher information for the regression function. By working with I-priors, the benefits of the methodology are brought over to the binary case, one of which is that it provides a unified model-fitting framework that includes additive models, multilevel models, and models with one or more functional covariates. The challenge is in the estimation, and a variational approximation is employed to overcome the intractable likelihood. Several real-world examples are presented from analyses conducted in R.
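The latent variable construction can be sketched as follows. The threshold at zero, the unit error variance, and the symbols $y_i^{*}$, $\alpha$, and $\Phi$ (the standard normal distribution function) are conventional choices assumed here for illustration; the abstract does not specify them:

```latex
\begin{align*}
y_i &= \mathbf{1}\!\left[\, y_i^{*} > 0 \,\right], \qquad
y_i^{*} = \alpha + f(x_i) + \varepsilon_i, \qquad
\varepsilon_i \overset{\text{iid}}{\sim} \mathrm{N}(0, 1), \\[4pt]
\Pr(y_i = 1)
  &= \Pr\!\left(\varepsilon_i > -\alpha - f(x_i)\right)
   = \Phi\!\left(\alpha + f(x_i)\right).
\end{align*}
```

Taking $f(x_i) = x_i^\top \beta$ recovers the usual probit regression model, while leaving $f$ as a general function in a reproducing kernel Hilbert space gives the setting in which an I-prior is placed on $f$.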

I-priors are a class of objective priors for regression functions which make use of the Fisher information for the regression function in a function space framework. Currently, I am exploring the use of I-priors in Bayesian variable selection. My talk is a collection of ideas and methods that I picked up along the way in my research, in the hope that they might be of interest and of some use in the areas you are working on: 1) Estimation of I-prior models using likelihood methods; 2) The R/iprior package for fitting I-prior models; 3) Shrinkage properties of I-priors and how they link to L2 penalties with individual shrinkage parameters (and equivalently, individual variance hyper-parameters in a Bayesian setting); 4) Estimation of I-prior models in a fully-Bayes setting, with particular interest in the scale parameters; 5) Using Hamiltonian Monte Carlo to obtain better quality MCMC chains for the Bayesian I-prior model. I will also share some information on useful tools and software for reproducible research that I came across during my work, including Shiny apps, GitHub, RStudio (for package development), knitr, and Stan.

In previous work, I showed that the use of I-priors in various linear models can be considered as a solution to the over-fitting problem. In that work, estimation was still done using maximum likelihood, so in a sense it was a kind of frequentist-Bayes approach. Switching over to a fully Bayesian framework, we now look at the problem of variable selection, specifically in an ordinary linear regression setting. The appeal of Bayesian methods is that they reduce the selection problem to one of estimation, rather than a true search of the variable space for the model that optimises a certain criterion. I will talk about several Bayesian variable selection methods in the literature, and how we can make use of I-priors to improve on results in the presence of multicollinearity.

The I-prior methodology is a new modelling technique which aims to improve on maximum likelihood estimation of linear models when the dimensionality is large relative to the sample size. By using a prior which is informed by the dataset (as opposed to a subjective prior), advantages such as model parsimony, fewer model assumptions, simpler estimation, and simpler hypothesis testing can be had. By way of introducing the I-prior methodology, we will give examples of linear models estimated using I-priors. These include multiple regression models, smoothing models, random effects models, and longitudinal models. Research in this area involves extending the I-prior methodology to generalised linear models (e.g. logistic regression), Structural Equation Models (SEM), and models with structured error covariances.
