Recent & Upcoming Talks

2026

Haziq Jamil

5 Feb 2026 3:50 PM — 4:08 PM University of Padua, Italy

Approximate Bayesian Inference for Structural Equation Models

Markov chain Monte Carlo (MCMC) methods are, of course, the mainstay of Bayesian estimation of structural equation models (SEM). In this talk, I present ongoing work on an approximate Bayesian approach to SEM, drawing on ideas from the integrated nested Laplace approximation (INLA; Rue et al., 2009, J. R. Stat. Soc. Series B Stat. Methodol.) framework. The method applies a Laplace approximation to the joint posterior density in a transformed parameter space using a Gaussian centered at the mode, followed by marginal approximations with either two-piece asymmetric Gaussians or skew-normal densities. Quantities such as factor scores and model-fit indices are then obtained through post-hoc sampling. When the posterior density can be evaluated efficiently—as in the case of normal-theory SEM—this INLA-based approach provides a computationally efficient and highly accurate alternative to sampling-based inference. The implementation achieves near-‘maximum likelihood’ speeds while retaining the accuracy of full Bayesian inference, offering a practical and scalable route to Bayesian SEM within standard statistical software environments.
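
As a rough illustration of the first step described above (and not the implementation presented in the talk), the sketch below applies a Laplace approximation in a transformed parameter space to a toy model. The Poisson-Gamma model, the data, and the helper name `laplace_approx` are all assumptions made for illustration: the rate is log-transformed, the posterior mode is found by Newton's method, and a Gaussian is fitted using the curvature at the mode.

```python
import math

def laplace_approx(grad, hess, theta0, tol=1e-10, max_iter=100):
    """Hypothetical helper: find the posterior mode by Newton's method,
    then return the mean and sd of the Gaussian fitted at that mode."""
    theta = theta0
    for _ in range(max_iter):
        step = grad(theta) / hess(theta)
        theta -= step
        if abs(step) < tol:
            break
    # Gaussian variance is the negative inverse Hessian of the log-posterior
    sd = math.sqrt(-1.0 / hess(theta))
    return theta, sd

# Toy model (assumed for illustration): y_i ~ Poisson(lam), lam ~ Gamma(a, b)
y = [3, 5, 4, 6, 2]
a, b = 2.0, 1.0
n, s = len(y), sum(y)

# Log-posterior of theta = log(lam), including the Jacobian term:
#   (a + s) * theta - (b + n) * exp(theta) + const
grad = lambda t: (a + s) - (b + n) * math.exp(t)
hess = lambda t: -(b + n) * math.exp(t)

mode, sd = laplace_approx(grad, hess, theta0=0.0)
print(f"theta | y is approximately N({mode:.4f}, {sd:.4f}^2)")
```

For this toy model the mode and curvature are available in closed form, log((a + s)/(b + n)) and 1/sqrt(a + s), which makes the sketch easy to check by hand.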

Haziq Jamil

2 Feb 2026 12:30 PM — 2:00 PM Leverhulme Library, COL.6.15, LSE

Approximate Bayesian Inference for Structural Equation Models

Markov chain Monte Carlo (MCMC) methods remain the mainstay of Bayesian estimation of structural equation models (SEM); however, they often incur a high computational cost. In this talk, I present a bespoke approximate Bayesian approach to SEM, drawing on ideas from the integrated nested Laplace approximation (INLA; Rue et al., 2009, J. R. Stat. Soc. Series B Stat. Methodol.) framework. We implement a simplified Laplace approximation that efficiently profiles the posterior density in each parameter direction while correcting for asymmetry, allowing for parametric skew-normal estimation of the marginals. Furthermore, we apply a variational Bayes correction to shift the marginal locations, thereby better capturing the posterior mass. Essential quantities, including factor scores and model-fit indices, are obtained via an efficient Gaussian copula sampling scheme. For normal-theory SEM, this approach offers a highly accurate alternative to sampling-based inference, achieving near-‘maximum likelihood’ speeds while retaining the precision of full Bayesian inference.

2025

Haziq Jamil

16 Oct 2025 2:00 PM — 3:00 PM KAUST, Saudi Arabia

Bias-Reduced Estimation of Structural Equation Models

Finite-sample bias is a pervasive challenge in the estimation of structural equation models (SEMs), especially when sample sizes are small or measurement reliability is low. A variety of methods have been proposed in the SEM literature to reduce finite-sample bias, from analytic bias corrections to resampling-based techniques, each carrying trade-offs in scope, computational burden, and statistical performance. This talk discusses the application of the reduced-bias M-estimation framework (RBM; Kosmidis & Lunardon, 2024, J. R. Stat. Soc. Series B Stat. Methodol.) to SEMs. The RBM framework requires only first- and second-order derivatives of the log-likelihood, which makes it both straightforward to implement and computationally more efficient than resampling-based alternatives such as the bootstrap and jackknife. It is also robust to departures from modelling assumptions. Through extensive simulation studies, we illustrate that RBM estimators consistently reduce mean bias in the estimation of SEMs without inflating mean squared error. They also deliver improvements in both median bias and inference relative to maximum likelihood estimators, while maintaining robustness under non-normality.

Haziq Jamil

14 Jun 2025 9:00 AM — 11:30 AM Universiti Teknologi MARA (UiTM) Sarawak Branch, Online

Calculus: Differentiation and its Applications

Applications of calculus from a statistical perspective.

Haziq Jamil

21 May 2025 9:00 AM — 11:30 AM Authority for Info-communications Technology Industry (AITI)

R for Data Science

Welcome to our introductory session on R software and its powerful ecosystem. Learn how R can be used to produce compelling data products, such as reports with rich tables and visualizations. We’ll take a hands-on approach and even explore how to recreate parts of the AITI ICT Survey Report 2022 using R and Quarto.

2024

Haziq Jamil

18 Oct 2024 3:00 PM — 4:00 PM Department of Statistics and Data Science, NUS, Singapore

Weighted pairwise likelihood goodness-of-fit tests for binary factor models

Limited information goodness-of-fit (LIGOF) tests are increasingly recognized for their application in high-dimensional multivariate categorical data analysis. LIGOF tests address sparsity in contingency tables by leveraging summary statistics derived from univariate and bivariate residuals, effectively circumventing the reliability concerns associated with traditional goodness-of-fit tests. Previous studies on binary factor models have predominantly utilised maximum likelihood estimation, which itself can be computationally intensive when fitting large and complex models. This work examines the efficacy of LIGOF tests when composite likelihood estimation, specifically pairwise likelihood estimation, is used instead. Pairwise likelihood estimation offers a beneficial trade-off between computational efficiency and modelling accuracy in factor models, and hence the performance of LIGOF tests under this framework is of significant interest. The tests under consideration are based on quadratic forms of the residuals, including the classical Wald and Pearson tests. Modifications of these tests are also proposed, with the aim of further reducing computational complexity. Moreover, the study is expanded to include scenarios that involve complex sampling procedures with known weights, thereby broadening the applicability of our findings.

Haziq Jamil

4 Sep 2024 2:00 PM — 3:00 PM Universiti Brunei Darussalam

Item Response Theory (IRT) models: Reducing bias in small samples

In this talk, we will explore the concept of Item Response Theory (IRT), a powerful method used to understand how different test items (questions) work in educational assessments. IRT helps educators and researchers measure students’ abilities more accurately by looking at how they respond to various test items. We will introduce the basics of IRT, explaining how it can be applied in the context of educational testing, and how it provides valuable insights into both the test questions and the students taking the test. The talk will demonstrate how to fit IRT models using R software, making it easier for attendees to apply these techniques in their own work. Additionally, we will discuss methods for reducing bias in IRT models, especially in situations where small sample sizes might otherwise lead to inaccurate results. This presentation introduces a method to correct this bias by using empirical-based adjustments. Our approach is simple and significantly improves the accuracy of IRT model results, making it valuable for both researchers and practitioners. The method can be easily applied and offers a straightforward alternative to more complex techniques. Simulation studies show that our method effectively reduces bias, leading to more precise and reliable measurements in psychometric assessments.

Haziq Jamil

19 Jul 2024 11:30 AM — 11:45 AM Vysoká škola ekonomická v Praze

Empirical bias-reducing adjustments for Item Response Theory (IRT) models

In the field of psychometrics, the accuracy and reliability of measurement tools are paramount, particularly when employing Item Response Theory (IRT) models for assessing latent psychological traits. A persistent challenge in this domain is the non-zero bias of order $O(1/n)$ in finite sample sizes, a problem aggravated by deviations from the latent normality assumption, such as excess zeroes or skewed distributions. This presentation introduces an empirical bias adjustment method designed to mitigate this problem. The method applies adjustments derived from the empirical approximation of bias through higher-order derivatives of the estimating functions. Our simple approach offers a promising avenue for enhancing the robustness of IRT model estimations, especially in samples that deviate from idealized assumptions. The method’s theoretical advantages include markedly improved accuracy of estimator recovery, rendering it an invaluable asset for both researchers and practitioners. The innovation lies in its straightforward adjustment process, which can be implemented via implicit (i.e. solving adjusted estimating equations) or explicit methods (i.e. adjusting original estimators), thus streamlining the adoption and offering an appealing alternative to existing, more complex bias-reduction techniques. Validation of our theoretical framework through simulation studies confirms the effectiveness of our empirical bias adjustment in reducing parameter bias, thereby enabling more precise and dependable psychometric measurements.

Haziq Jamil

15 Apr 2024 2:00 PM — 3:00 PM Data Analysis and Statistical Science, Ghent University, Belgium

Weighted pairwise likelihood goodness-of-fit tests for binary factor models

2023

Haziq Jamil

26 Oct 2023 1:00 PM — 2:00 PM Department of Statistics, LSE

Spatio-temporal modelling of property prices in Brunei Darussalam

Haziq Jamil

9 Aug 2023 2:00 PM — 3:00 PM FOS, UBD

Pairwise likelihood goodness of fit tests for binary factor models

Haziq Jamil

27 Jul 2023 4:10 PM — 5:25 PM University of Maryland

Pairwise likelihood goodness of fit tests for binary factor models

Limited information goodness of fit (GOF) tests have gained recognition in the literature for high-dimensional multivariate categorical data analysis. Sparsity issues in the ensuing contingency tables impair the dependability of GOF tests but can be circumvented by considering summary statistics involving univariate and bivariate residuals. Prior work in this area for factor models has focused mainly on maximum likelihood estimation, which itself can be computationally intensive when fitting large and complex models. The present work examines limited information GOF tests when composite likelihood estimation, specifically pairwise likelihood estimation, is used instead. Pairwise likelihood estimation offers a beneficial trade-off between computational efficiency and modelling accuracy in factor models, and hence we examine the performance of limited information GOF tests under this framework. The tests under consideration are based on the Pearson chi-squared test statistic and the Wald test statistic. We propose modifications to each of these tests with the aim of further reducing computational complexity. We then extend our findings beyond independent sampling to situations where complex sampling procedures (with known weights) are employed.

2022

Haziq Jamil

16 Nov 2022 3:00 PM — 4:00 PM Faculty of Science, NUS, Singapore

Regression modelling using I-priors

Regression analysis is undoubtedly an important tool for understanding the relationship between a response variable and one or more explanatory variables of interest. The problem of estimating a generic regression function in a model with normal errors is considered. For this purpose, a novel objective prior for the regression function is proposed, defined as the distribution maximizing entropy (subject to a suitable constraint) based on the Fisher information on the regression function. This prior is called the I-prior. The regression function is then estimated by its posterior mean under the I-prior, and accompanying hyperparameters are estimated via maximum marginal likelihood. Estimation of I-prior models is simple and inference straightforward, while predictive performance is comparable to, and often better than, that of similar leading state-of-the-art models, as will be illustrated by several data examples. Further plans for research in this area are also presented, including variable selection for interaction effects and extending the I-prior methodology to non-Gaussian errors.

Wicher Bergsma, Haziq Jamil

15 Jul 2022 9:15 AM — 10:15 AM University of Bologna, Italy

Selecting interaction effects in additive models using I-priors

Additive models with interactions have been considered extensively in the literature, using estimation methods such as splines or Gaussian process regression. We present an alternative empirical-Bayes approach to selecting interaction effects using the I-prior approach introduced by Bergsma (2020). Using a parsimonious formulation of hierarchical interaction spaces, model selection is simplified. Furthermore, we present an efficient EM algorithm for estimating key hyperparameters. Simulations for linear regressions indicate competitive performance with methods such as the lasso and Bayesian variable selection using spike and slab priors or g-priors. However, our methodology is more general and can also be used with interacting nonlinear regression functions.

2021

Haziq Jamil

14 Nov 2021 3:00 PM — 4:15 PM Online event

MINDEF Scholars Sharing Session: Life After MINDEF

Sharing experiences about life after leaving the defence sector.

2020

Haziq Jamil

19 Nov 2020 2:30 PM — 3:30 PM Lecture Room 2 (D2.8), Block D, Integrated Science Building, UBD

A latent variable model for maximal performance testing with dropouts for military applications

Soldiers are expected to perform complex and demanding tasks during operations, often while carrying a heavy load. It is therefore important for commanders to understand the relationship between load carriage and soldiers’ performance, as such knowledge helps inform decision-making on training policies, operational doctrines, and future soldier systems requirements. In order to investigate this, repeated experiments were conducted to capture key soldier performance parameters under controlled conditions. The data collected was found to contain missing values due to dropouts as well as non-measurement. We propose a Bayesian structural equation model to quantify a latent variable representing soldiers’ abilities, while taking into consideration the non-random nature of the dropouts and time-varying effects. This talk describes the modelling exercise conducted, emphasising the statistical model-building process as well as the practical reporting of the outputs of the model.

Haziq Jamil

28 May 2020 1:00 PM — 2:00 PM Online

Investigating the effect of load carriage on soldiers’ performances using structural equation models

Soldiers are required to perform tasks that call upon a complex combination of their physical and cognitive capabilities. For example, soldiers are expected to communicate effectively with each other, operate specialised equipment, and maintain overall situational awareness, often while carrying a heavy load. From a planning and doctrine perspective, it is important for commanders to understand the relationship between load carriage and soldiers’ performance. Such information could help inform future policies on training, operational safety, and future soldier systems requirements. To this end, the Royal Brunei Armed Forces (RBAF) conducted controlled experiments and collected numerous measurements intended to capture key soldier performance parameters. The structure of the data set provided several interesting challenges, namely 1) how do we define “performance”?; 2) how do we appropriately take into account the longitudinal nature of the data (repeated measurements)?; and 3) how do we handle non-ignorable dropouts? We propose a structural equation model to quantify a latent variable representing soldiers’ abilities, while taking into consideration the non-random nature of the dropouts and time-varying effects. The main output of the study is a quantification of the relationship between load carried and performance. Additionally, modelling the dropouts allows us to determine the “expected time to exhaustion” for a given load carried by a soldier.

2019

Haziq Jamil

13 Nov 2019 2:00 PM — 3:00 PM G.10, UBDSBE, Brunei

Bayesian Variable Selection for Linear Models

In statistical modelling, there is often a genuine interest in learning the most reasonable, parsimonious, and interpretable model that fits the data. This is especially true when faced with the oddly perplexing phenomenon of having “too much information” (data saturation). Model selection is indeed a vastly covered topic. In this talk, I will focus on the Bayesian approach to model selection, emphasising the selection of variables in a linear regression model. The outcome of the talk is three-fold: 1) to introduce the statistical framework for Bayesian variable selection; 2) to understand how we can use model probabilities as a basis for model selection; and 3) to demonstrate its application using real-world data (mortality and air pollution data). The hope is that the audience will gain an understanding of the method to possibly spur on further research and applications in their respective work.

Haziq Jamil

25 Jan 2019 9:00 AM — 10:00 AM Ministry of Defence, Brunei

Misconceptions in Demography

Inspired by the Gapminder project, let’s talk about common misconceptions about demography.

2018

Haziq Jamil

4 Dec 2018 2:30 PM — 3:00 PM Faculty of Science, UBD, Brunei

A Brief Guide to Variational Inference

The fitting of complex statistical models that consist of latent or nuisance variables, in addition to various parameters to be estimated, likely involves overcoming an intractable integral. For instance, calculating the likelihood of such models requires marginalising over the latent variables, and this may prove to be difficult computationally, either due to dimensionality or model design. Variational inference, or variational Bayes as it is also known, offers an efficient alternative to Markov chain Monte Carlo methods, the Laplace approximation, and quadrature methods. Rooted in Bayesian inference and popularised in machine learning, the main idea is to overcome these difficulties by working with “easy” density functions in lieu of the true posterior distribution. The approximating density function is chosen so as to minimise the (reverse) Kullback-Leibler divergence between them. The topics that will be discussed are mean-field distributions, the coordinate ascent algorithm, and approximation properties, with an example following. The hope is that the audience will gain a basic understanding of the method to possibly spur on further research and applications in their respective work.
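
To make the coordinate ascent idea concrete, here is a minimal sketch for a standard textbook case rather than the example used in the talk. The model is an assumption for illustration: x_i ~ N(mu, 1/tau) with priors mu | tau ~ N(mu0, 1/(lam0 * tau)) and tau ~ Gamma(a0, b0), approximated by a mean-field density q(mu) q(tau). The function name `cavi_normal` and the default hyperparameters are hypothetical.

```python
def cavi_normal(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, tol=1e-12, max_iter=1000):
    """Coordinate ascent for q(mu) = N(mu_n, 1/lam_n) and q(tau) = Gamma(a_n, b_n)."""
    n = len(x)
    xbar = sum(x) / n
    # The mean of q(mu) and the shape of q(tau) do not change across iterations
    mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)
    a_n = a0 + (n + 1) / 2
    ss = sum((xi - mu_n) ** 2 for xi in x)

    e_tau = a0 / b0  # initial guess for E_q[tau]
    for _ in range(max_iter):
        # Update q(mu) given the current E_q[tau]
        lam_n = (lam0 + n) * e_tau
        # Update q(tau) given the current q(mu), using E_q[(x_i - mu)^2] and E_q[(mu - mu0)^2]
        b_n = b0 + 0.5 * (ss + n / lam_n + lam0 * ((mu_n - mu0) ** 2 + 1 / lam_n))
        new_e_tau = a_n / b_n
        converged = abs(new_e_tau - e_tau) < tol
        e_tau = new_e_tau
        if converged:
            break
    lam_n = (lam0 + n) * e_tau  # keep q(mu) consistent with the final q(tau)
    return mu_n, lam_n, a_n, b_n

x = [1.2, 0.8, 1.9, 1.4, 0.7]
mu_n, lam_n, a_n, b_n = cavi_normal(x)
print(f"q(mu) = N({mu_n:.3f}, 1/{lam_n:.3f}), q(tau) = Gamma({a_n:.1f}, {b_n:.3f})")
```

Each pass updates one factor while holding the other fixed, which is the coordinate ascent scheme in its simplest form. In this conjugate setting only the rate of q(tau) and the precision of q(mu) need iterating.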

Haziq Jamil

27 Mar 2018 12:30 PM — 2:00 PM LSE, London, United Kingdom

Binary and Multinomial Regression using Fisher Information Covariance Kernels (I-priors)

In a regression setting, we define an I-prior as a Gaussian process prior on the regression function with covariance kernel equal to its Fisher information. We present some methodology and computational work on estimating regression functions by working in the appropriate reproducing kernel Hilbert space of functions and assuming an I-prior on the function of interest. In a regression model with normally distributed errors, estimation is simple: maximum likelihood and the EM algorithm are employed. In the classification models (categorical response models), estimation is performed using variational inference. I-prior models perform comparably to, and often better than, similar leading state-of-the-art models for use in prediction and inference. Applications are plentiful, including smoothing models, modelling multilevel data, longitudinal data, functional covariates, multi-class classification, and even spatiotemporal modelling.
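
For readers unfamiliar with the construction, the definition in the first sentence can be written out as follows. This is a sketch of the normal-errors case following the formulation in Bergsma (2020), with notation chosen here for illustration: $h$ is the reproducing kernel of the function space, $\Psi = (\psi_{ij})$ is the error precision matrix, and $f_0$ is a prior mean chosen a priori.

$$
\begin{aligned}
y_i &= f(x_i) + \epsilon_i, \qquad (\epsilon_1, \dots, \epsilon_n)^\top \sim \mathrm{N}_n(0, \Psi^{-1}), \\
\mathcal{I}\big(f(x), f(x')\big) &= \sum_{i=1}^n \sum_{j=1}^n \psi_{ij}\, h(x, x_i)\, h(x', x_j), \\
f(x) &= f_0(x) + \sum_{i=1}^n h(x, x_i)\, w_i, \qquad (w_1, \dots, w_n)^\top \sim \mathrm{N}_n(0, \Psi).
\end{aligned}
$$

The last line is the I-prior: its covariance function $\mathrm{Cov}\{f(x), f(x')\}$ equals the Fisher information in the second line.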

Haziq Jamil

1 Feb 2018 12:30 PM — 2:00 PM LSE, London, United Kingdom

A Beginner's Guide to Variational Inference

Estimation of complex models that consist of latent variables and various parameters, in addition to the data that is observed, might involve overcoming an intractable integral. For instance, calculating the likelihood of such models requires marginalising over the latent variables, and this may prove to be difficult computationally, either due to model design or dimensionality. Variational inference, or variational Bayes as it is also known, offers an efficient alternative to Markov chain Monte Carlo methods, the Laplace approximation, and quadrature methods. Rooted in Bayesian inference and popularised in machine learning, the main idea is to overcome these difficulties by working with “easy” density functions in lieu of the true posterior distribution. The approximating density function is chosen so as to minimise the (reverse) Kullback-Leibler divergence between them. The topics that will be discussed are mean-field distributions, the coordinate ascent algorithm, and its properties, with examples following. The hope is that the audience will gain a basic understanding of the method to possibly spur on further research and applications in their respective work.
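
The central identity behind the method can be stated briefly. The notation is chosen here for illustration: $y$ denotes the observed data, $z$ the latent variables, and $q$ the approximating density.

$$
\log p(y) = \underbrace{\mathbb{E}_q\big[\log p(y, z)\big] - \mathbb{E}_q\big[\log q(z)\big]}_{\mathcal{L}(q)} + \mathrm{KL}\big(q(z) \,\|\, p(z \mid y)\big).
$$

Since $\log p(y)$ does not depend on $q$, minimising the reverse Kullback-Leibler divergence is equivalent to maximising the lower bound $\mathcal{L}(q)$. Under a mean-field factorisation $q(z) = \prod_j q_j(z_j)$, the coordinate ascent update for each factor is

$$
\log q_j^*(z_j) = \mathbb{E}_{-j}\big[\log p(y, z)\big] + \text{const},
$$

where the expectation is taken over all factors other than $q_j$.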

2017

Wicher Bergsma, Haziq Jamil

18 Jul 2017 1:30 PM — 3:00 PM University of Zürich, Switzerland

Regression Modelling with I-Priors

This is an overview of a unified methodology for fitting parametric and nonparametric regression models, including additive models, multilevel models, and models with one or more functional covariates. We also discuss an associated R package called iprior. An I-prior is an objective prior for the regression function, and is based on its Fisher information. The regression function is estimated by its posterior mean under the I-prior, and scale parameters are estimated via maximum marginal likelihood using an Expectation-Maximization (EM) algorithm. Regression modelling using I-priors has several attractive features: it requires no assumptions other than those pertaining to the model of interest; estimation and inference are relatively straightforward; and small and large sample performance can be better than Tikhonov regularization. We illustrate the use of the iprior package by analysing three well-known data sets, in particular, a multilevel data set, a longitudinal data set, and a data set involving a functional covariate.

Haziq Jamil

8 May 2017 12:00 PM — 12:35 PM LSE

Binary probit regression with I-priors

An extension of the I-prior methodology to binary response data is explored. Starting from a latent variable approach, it is assumed that there exist continuous, auxiliary random variables which decide the outcome of the binary responses. Fitting a classical linear regression model on these latent variables while assuming normality of the error terms leads to the well-known generalised linear model with a probit link. A more general regression approach is considered instead, in which an I-prior on the regression function, which lies in some reproducing kernel Hilbert space, is assumed. An I-prior distribution is Gaussian with mean chosen a priori, and covariance equal to the Fisher information for the regression function. By working with I-priors, the benefits of the methodology are brought over to the binary case, one of which is that it provides a unified model-fitting framework that includes additive models, multilevel models and models with one or more functional covariates. The challenge is in the estimation, and a variational approximation is employed to overcome the intractable likelihood. Several real-world examples are presented from analyses conducted in R.

2016

Haziq Jamil

3 Nov 2016 12:30 PM — 2:00 PM LSE, London, United Kingdom

I-priors in Bayesian Variable Selection: From Reproducing Kernel Hilbert Spaces to Hamiltonian Monte Carlo

I-priors are a class of objective priors for regression functions which make use of the Fisher information for the regression function in a function space framework. Currently, I am exploring the use of I-priors in Bayesian variable selection. My talk is a collection of ideas and methods that I picked up along the way in researching my work, in the hope that it might be of interest and some use in the areas you are working on: 1) Estimation of I-prior models using likelihood methods; 2) The R/iprior package for fitting I-prior models; 3) Shrinkage properties of I-priors and how they link to L2 penalties with individual shrinkage parameters (and equivalently, individual variance hyper-parameters in a Bayesian setting); 4) Estimation of I-prior models in a fully-Bayes setting, with particular interest in the scale parameters; 5) Using Hamiltonian Monte Carlo to obtain better quality MCMC chains for the Bayesian I-prior model. I will also share some information on useful tools and software for reproducible research that I came across during my work, including Shiny apps, GitHub, RStudio (for package development), knitr, and Stan.

2015

Haziq Jamil

18 Nov 2015 12:30 PM — 2:00 PM LSE, London, United Kingdom

Two-stage Bayesian variable selection for linear models using I-priors

In previous work, I showed that the use of I-priors in various linear models can be considered as a solution to the over-fitting problem. In that work, estimation was still done using maximum likelihood, so in a sense it was a kind of frequentist-Bayes approach. Switching over to a fully Bayesian framework, we now look at the problem of variable selection, specifically in an ordinary linear regression setting. The appeal of Bayesian methods is that they reduce the selection problem to one of estimation, rather than a true search of the variable space for the model that optimises a certain criterion. I will talk about several Bayesian variable selection methods in the literature, and how we can make use of I-priors to improve on results in the presence of multicollinearity.

Haziq Jamil

19 May 2015 12:00 PM — 12:35 PM LSE, London, United Kingdom

Regression Modelling using I-Priors

The I-prior methodology is a new modelling technique which aims to improve on maximum likelihood estimation of linear models when the dimensionality is large relative to the sample size. By using a prior which is informed by the dataset (as opposed to a subjective prior), advantages such as model parsimony, fewer model assumptions, simpler estimation, and simpler hypothesis testing can be had. By way of introducing the I-prior methodology, we will give examples of linear models estimated using I-priors. These include multiple regression models, smoothing models, random effects models, and longitudinal models. Research in this area involves extending the I-prior methodology to generalised linear models (e.g. logistic regression), Structural Equation Models (SEM), and models with structured error covariances.