Nominate a speaker »

**Seminars will be held In Person unless otherwise noted**

Spring 2025

Seminar One

February 27, 2025
Griffin Floyd Hall, Room 100

Peijun Sang
University of Waterloo

Title:

Functional principal component analysis with informative observation times

Abstract:

Functional principal component analysis has been shown to be invaluable for revealing variation modes of longitudinal outcomes, which serves as important building blocks for forecasting and model building. Decades of research have advanced methods for functional principal component analysis often assuming independence between the observation times and longitudinal outcomes. Yet such assumptions are fragile in real-world settings where observation times may be driven by outcome-related reasons. Rather than ignoring the informative observation time process, we explicitly model the observational times by a general counting process dependent on time-varying prognostic factors. Identification of the mean, covariance function, and functional principal components ensues via inverse intensity weighting. We propose using weighted penalized splines for estimation and establish consistency and convergence rates for the weighted estimators. Simulation studies demonstrate that the proposed estimators are substantially more accurate than the existing ones in the presence of a correlation between the observation time process and the longitudinal outcome process. We further examine the finite-sample performance of the proposed method using the Acute Infection and Early Disease Research Program study.

Seminar Two

March 6, 2025
Griffin-Floyd Hall, Room 100

Alex Peterson
Brigham Young University

Title:

Modeling and Regularized Estimation of the Covariance Operator of Multivariate Functional Data

Abstract:

Functional MRI scans result in a set of regional BOLD signals for each subject, which can be modeled as multivariate functional data. The covariance operator of multivariate functional data is a complex object that can be difficult to estimate, especially if the multivariate dimension is large, making extensions of statistical methods for standard multivariate data to the functional data setting challenging. Compared with multivariate data, a key difficulty is that the covariance operator is compact and thus does not have a bounded inverse. This talk will address covariance modelling for multivariate functional data utilizing nested structures of separability, as well as regularized estimation of the functional precision operators.

Seminar Three

March 13, 2025
Zoom link: https://ufl.zoom.us/j/98708991173

Thomas Bartlett
University College, London

Title:

Using stochastic network theory to inform unsupervised learning from genomic count data.

Abstract:

Important tasks in the study of genomic data include the identification of groups of similar cells (for example by clustering), and visualization of data summaries (for example by dimensional reduction). In this talk, I will present a novel view of these tasks in the context of single-cell genomic data. To do so, I propose modelling the observed count-matrices of genomic data by representing these measurements as a bipartite network with multi-edges. Starting with this first-principles network model of the raw data, I will show improvements in clustering single cells via a suitably-identified d-dimensional Laplacian Eigenspace (LE) using a Gaussian mixture model (GMM-LE), and apply UMAP to non-linearly project the LE to two dimensions for visualization (UMAP-LE). From this first-principles viewpoint, the LE representation of the data-points estimates transformed latent positions (of genes and cells), under a latent position statistical model of nodes in a bipartite stochastic network. By applying this proposed methodology to data from three recent genomics studies in different biological contexts, I will show how clusters of cells independently learned by this proposed methodology are found to correspond to cells expressing specific marker genes that were independently defined by domain experts, with an accuracy that is competitive with the industry-standard for these data. I will then show how this novel view of these data can provide unique insights, leading to the identification of a LE breast-cancer biomarker that significantly predicts long-term patient survival outcome in two independent validation cohorts with data from 1904 and 1091 individuals.

Seminar Four

March 27, 2025
Griffin-Floyd Hall, Room 100

Rajarshi Guhaniyogi
Texas A & M University

Title:

Developing Functional Surrogates for Large Structured Data: Robust Distributed Bayesian Learning and Amortized Inference

Abstract:

The growing availability of large structured datasets presents both opportunities and challenges for statistical inference. While traditional Bayesian methods provide a principled framework for uncertainty quantification, they often struggle with the computational and scalability demands of such data. In this presentation, we explore two complementary approaches to overcoming these challenges: robust distributed Bayesian inference and amortized inference. The first approach introduces a theoretically grounded divide-and-conquer framework that ensures principled Bayesian inference while remaining robust to the choice of data partitioning. Additionally, this method enables privacy-preserving Bayesian inference in scenarios where data are stored across multiple centers without direct data sharing. The second approach integrates Bayesian inference with deep neural networks, enabling principled uncertainty quantification while contributing to the broader field of interpretable deep learning. Both approaches are illustrated with large-scale simulation from the Sea, Lake and Overland Surge from Hurricanes (SLOSH) simulator in collaboration with scientists from the Los Alamos National Laboratories.

Seminar Five

April 3, 2025
Griffin-Floyd Hall, Room 100

Victor Patrangenaru
Florida State University

Title:

On the origins of COVID-19, Vladimir Putin and Data Analysis on Stratified Spaces

Abstract:

This is a motivational talk for introducing the audience to two upcoming books: one on data analysis on stratified spaces (coauthored with Daniel E. Osborne-FAMU) and the other on statistical image analysis (coauthored with Robert L. Paige -Missouri S&T). We gratefully acknowledge support from NSF-DMS:2311059.

Seminar Six

April 17, 20215
Griffin-Floyd Hall, Room 100

Abhirup Dutta
Johns Hopkins University

Title:

Machine learning for geospatial data with explicit modeling of spatial correlations

Abstract:

Traditionally geospatial analysis has relied on statistical models that explicitly model spatial correlations in the data. Recently, machine learning algorithms, such as neural networks and random forests, are increasingly used in geospatial analysis. However, most machine learning algorithms do not possess the functionality to directly encode spatial correlations. There is limited understanding of the consequences of ignoring spatial correlations in machine learning algorithms applied to geospatial data, despite this practice becoming increasingly common. We show empirically and theoretically, that ignoring spatial correlations reduces the accuracy of machine learning algorithms for geospatial data.

We then propose well-principled machine learning algorithms for geospatial data that explicitly model the spatial correlation as in traditional geostatistics. The basic principle is guided by how ordinary least squares (OLS) extend to generalized least squares (GLS) for linear models to explicitly account for data covariance. We demonstrate how the same extensions can be done for random forests and neural networks, presenting the RF-GLS and NN-GLS algorithms. We provide extensive theoretical and empirical support for the methods and show how they fare better than naïve or brute-force approaches to use machine learning algorithms for spatially correlated data. We present the software packages RandomForestsGLS and geospaNN implementing these methods.

Statistics

Statistics Seminar

Nominate a speaker »

Spring 2025

Seminar One

Seminar Two

Seminar Three

Seminar Four

Seminar Five

Seminar Six

Links

Resources

Websites