Statistics Seminar

Spring 2019

Seminars are held from 4:00 p.m. to 5:00 p.m. in Griffin-Floyd Hall 100 unless otherwise noted. Refreshments are available before the seminars from 3:30 p.m. to 4:00 p.m. in Griffin-Floyd Hall 103.

Date    Speaker                                 Title
Mar 14  Lingzhou Xue (Penn State University)    Fisher’s Combined Probability Tests for Complex, High Dimensional Data
Mar 21  Anindya Bhadra (Purdue University)      Horseshoe Regularization for Prediction and Inverse Covariance Estimation
Apr 16  Joseph W Hogan (Brown University)       Using Electronic Health Records Data for Predictive and Causal Inference About the HIV Care Cascade



Fisher’s Combined Probability Tests for Complex, High Dimensional Data
Lingzhou Xue, Penn State University


In the past decade, two sets of test statistics have been widely used for high-dimensional hypothesis tests: 1) extreme-value form statistics to test against sparse alternatives, and 2) quadratic form statistics to test against dense alternatives with small disturbances. However, quadratic form statistics suffer from low power against sparse alternatives, and extreme-value form statistics suffer from low power against dense alternatives. For various real-world applications, it is important to develop power enhancement testing procedures. In this talk, we provide a completely new perspective by studying the asymptotic joint distribution of quadratic form statistics and extreme-value form statistics. Based on their explicit joint limiting laws, we follow the philosophy of Fisher’s method to develop power enhancement tests for high-dimensional means, banded covariances, spiked covariances, and multi-factor pricing models. We prove that Fisher’s combined probability tests boost the power against more general alternatives while retaining the correct asymptotic size. We demonstrate the finite-sample performance of our proposed testing procedures in both simulation studies and real applications.
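
For readers unfamiliar with Fisher's method, the following is a minimal sketch of the classical combination rule, not the speaker's procedure. It assumes the p-values being combined are independent under the null hypothesis, in which case -2 times the sum of their logarithms follows a chi-square distribution with 2k degrees of freedom. The function name and the two example p-values (standing in for a quadratic form test and an extreme-value form test) are hypothetical.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's classical method.

    Assumption of this sketch: the p-values are independent under the null,
    so -2 * sum(log p_i) is chi-square distributed with 2k degrees of freedom.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    # Half of the Fisher statistic: -sum(log p_i) = (-2 * sum(log p_i)) / 2.
    half_stat = -sum(math.log(p) for p in p_values)

    # For an even number of degrees of freedom 2k, the chi-square survival
    # function has the closed form exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


# Hypothetical p-values: one from a quadratic form test, one from an
# extreme-value form test.
p_quadratic = 0.04
p_extreme = 0.30
print(fisher_combined_pvalue([p_quadratic, p_extreme]))
```

With two p-values the combined p-value reduces to p1 * p2 * (1 - log(p1 * p2)), so a small p-value from either test can drive the combined result, which is the intuition behind combining a test that is powerful against sparse alternatives with one that is powerful against dense alternatives.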

Horseshoe Regularization for Prediction and Inverse Covariance Estimation
Anindya Bhadra, Purdue University

Since the advent of the horseshoe priors for regularization, global-local shrinkage methods have proved to be a fertile ground for the development of Bayesian theory and methodology in high-dimensional problems. They have achieved remarkable success in computation, and enjoy strong theoretical support. Much of the existing literature, however, has focused on estimation and variable selection results in the sparse normal means model. The purpose of the current talk is to demonstrate that the horseshoe priors are useful more broadly, by considering two different directions. First, we consider finite-sample (finite n, finite p>n) prediction risk results in linear regression and explicitly point out under what circumstances horseshoe regression can be expected to outperform global shrinkage methods, such as ridge regression, in prediction. Second, we develop a multivariate extension of the horseshoe for estimating the precision matrix in graphical models that we term the graphical horseshoe estimator.
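
As background on the contrast between global-local and purely global shrinkage, here is a minimal sketch in the sparse normal means setting, not the regression or graphical models discussed in the talk. It assumes unit noise variance, y_i ~ N(theta_i, 1), with theta_i ~ N(0, lambda_i^2 tau^2) and half-Cauchy local scales lambda_i, so that the conditional posterior mean is (1 - kappa_i) * y_i with kappa_i = 1 / (1 + tau^2 lambda_i^2). The value of tau, the sample size, and all function names are hypothetical choices for illustration.

```python
import math
import random


def shrinkage_weight(local_scale, global_scale):
    """Return kappa = 1 / (1 + tau^2 * lambda^2), the fraction shrunk toward zero."""
    return 1.0 / (1.0 + (global_scale * local_scale) ** 2)


def half_cauchy_draw(rng):
    """Draw from a standard half-Cauchy as the absolute value of a Cauchy draw."""
    return abs(math.tan(math.pi * (rng.random() - 0.5)))


def conditional_posterior_mean(y, local_scale, global_scale):
    """Posterior mean of theta given y and the scales, assuming unit noise variance."""
    return (1.0 - shrinkage_weight(local_scale, global_scale)) * y


rng = random.Random(0)
tau = 0.5  # hypothetical global scale, chosen only for illustration

# Horseshoe: each coordinate gets its own half-Cauchy local scale.
horseshoe_kappas = [shrinkage_weight(half_cauchy_draw(rng), tau) for _ in range(10000)]

# Purely global (ridge-like) shrinkage: every local scale is fixed at 1.
ridge_kappa = shrinkage_weight(1.0, tau)

near_zero = sum(k < 0.1 for k in horseshoe_kappas) / len(horseshoe_kappas)
near_one = sum(k > 0.9 for k in horseshoe_kappas) / len(horseshoe_kappas)
print("ridge-like weight for every coordinate:", ridge_kappa)
print("horseshoe weights below 0.1 (little shrinkage):", near_zero)
print("horseshoe weights above 0.9 (strong shrinkage):", near_one)
```

Under these assumptions the ridge-like prior shrinks every coordinate by the same factor, while the horseshoe places noticeable prior mass both near kappa = 0 (signals left almost untouched) and near kappa = 1 (noise shrunk almost to zero).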

Using Electronic Health Records Data for Predictive and Causal Inference About the HIV Care Cascade
Joseph W Hogan, Brown University

The HIV care cascade is a conceptual model describing essential steps in the continuum of HIV care. The cascade framework has been widely applied to define population-level metrics and milestones for monitoring and assessing strategies designed to identify new HIV cases, link individuals to care, initiate antiviral treatment, and ultimately suppress viral load.

Comprehensive modeling of the entire cascade is challenging because data on key stages of the cascade are sparse. Many approaches rely on simulations of assumed dynamical systems, frequently using data from disparate sources as inputs. However, the growing availability of large-scale longitudinal cohorts of individuals in HIV care affords an opportunity to develop and fit coherent statistical models using single sources of data, and to use these models for both predictive and causal inferences.

Using data from 90,000 individuals in HIV care in Kenya, we model progression through the cascade with a multistate transition model fitted using Bayesian Additive Regression Trees (BART), which allows considerable flexibility for the predictive component of the model. We show how to use the fitted model for predictive inference about important milestones and causal inference for comparing treatment policies. Connections to agent-based mathematical modeling are made.

This is joint work with Yizhen Xu, Tao Liu, Rami Kantor, and Ann Mwangi.