Statistics Seminar

Fall 2019

Seminars are held from 4:00 p.m. – 5:00 p.m. in Griffin-Floyd 100 unless otherwise noted. Refreshments are available before the seminars from 3:30 p.m. – 4:00 p.m. in Griffin-Floyd Hall 103.

Date    Speaker Title

Sep 5   Bikas Sinha (Indian Statistical Institute): Reliability Estimation in Exponential Populations

Sep 26  Yuexiao Dong (Temple University): On dual model-free variable selection with two groups of variables

Oct 17  Yiyuan She (Florida State University): Supervised Clustering with Low Rank

Nov 14  Matey Neykov (Carnegie Mellon University): Minimax Optimal Conditional Independence Testing

Nov 21  Robert Strawderman (University of Rochester): Robust Q-learning


Reliability Estimation in Exponential Populations
Bikas Sinha, Indian Statistical Institute
Abstract: We confine attention to the simple exponential population as the life distribution of a given product, and to point estimation of the reliability R(t) = Pr[X > t] at a specified time point t > 0. We adopt the unbiasedness criterion in the exact sense, E[R^(t)] = R(t). We discuss the theory underlying the development of the subject matter; the computations are routine and are illustrated by a few examples.
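As background for the talk, the exactly unbiased estimator of R(t) for an exponential sample is a classical result (due to Pugh): with T the sample total, the estimator (1 - t/T)^(n-1) for T > t (and 0 otherwise) has expectation exactly R(t). The sketch below is an illustration of that classical fact, not of the speaker's specific development; the function names and simulation settings are invented for the demo.

```python
import math
import random

def mle_reliability(sample, t):
    """Plug-in (MLE) estimate of R(t) = exp(-t/theta): exp(-t/xbar). Biased in finite samples."""
    xbar = sum(sample) / len(sample)
    return math.exp(-t / xbar)

def umvue_reliability(sample, t):
    """Classical exactly unbiased estimator of R(t): (1 - t/T)^(n-1) if T > t, else 0."""
    n, T = len(sample), sum(sample)
    return (1.0 - t / T) ** (n - 1) if T > t else 0.0

random.seed(0)
theta, t, n = 2.0, 1.0, 10            # true mean, time point, sample size (all illustrative)
true_R = math.exp(-t / theta)         # R(t) = Pr[X > t] = exp(-t/theta)

# Monte Carlo check of exact unbiasedness: the average of the UMVUE matches R(t)
reps = 20000
umvue_mean = sum(
    umvue_reliability([random.expovariate(1 / theta) for _ in range(n)], t)
    for _ in range(reps)
) / reps
print(true_R, umvue_mean)
```

Averaged over many samples, the unbiased estimator tracks the true R(t) closely, while the plug-in MLE carries a small finite-sample bias.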
On dual model-free variable selection with two groups of variables
Yuexiao Dong, Temple University


Abstract: In the presence of two groups of variables, existing model-free variable selection methods only reduce the dimensionality of the predictors. We extend the popular marginal coordinate hypotheses of Cook (2004) in the sufficient dimension reduction literature and consider the dual marginal coordinate hypotheses, in which the roles of the predictor and the response are interchangeable. Motivated by canonical correlation analysis (CCA), we propose a CCA-based test for the dual marginal coordinate hypotheses and devise a joint backward selection algorithm for dual model-free variable selection. The performance of the proposed test and the variable selection procedure is evaluated through synthetic examples and a real data analysis.
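The speaker's test statistic is not given in the abstract; as generic background only, canonical correlations themselves can be computed by the standard QR/SVD route (the singular values of Qx'Qy, where Qx and Qy are orthonormal bases of the centered column spaces). The function and toy data below are illustrative, not the proposed test.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations of two data blocks via QR then SVD
    (Bjorck-Golub): singular values of Qx' Qy after centering."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    qx, _ = np.linalg.qr(X)
    qy, _ = np.linalg.qr(Y)
    return np.clip(np.linalg.svd(qx.T @ qy, compute_uv=False), 0.0, 1.0)

# toy check: the first Y-column is nearly a copy of the first X-column,
# so the leading canonical correlation should be close to 1
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
Y = np.column_stack([X[:, 0] + 0.1 * rng.standard_normal(500),
                     rng.standard_normal((500, 2))])
rho = canonical_correlations(X, Y)
```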

Supervised Clustering with Low Rank
Yiyuan She, Florida State University

Abstract: Modern clustering applications are often faced with challenges from high dimensionality, nonconvexity, and parameter tuning. This paper gives a mathematical formulation of low-rank supervised clustering that can automatically group the predictors in building a multivariate predictive model. By use of linearization and block coordinate descent, a simple-to-implement algorithm is developed that performs subspace learning and clustering iteratively, with guaranteed convergence. We show a tight error bound for the proposed method, study its minimax optimality, and propose a new information criterion for parameter tuning, all with rates distinct from those in the large body of literature based on sparsity. Extensive simulations and real-data experiments demonstrate the excellent performance of rank-constrained inherent clustering.

Minimax Optimal Conditional Independence Testing
Matey Neykov, Carnegie Mellon University

Abstract: We consider the problem of conditional independence testing of $X$ and $Y$ given $Z$, where $X, Y, Z$ are three real random variables and $Z$ is continuous. We focus on two main cases: when $X$ and $Y$ are both discrete, and when $X$ and $Y$ are both continuous. In view of recent results on conditional independence testing (Shah and Peters, 2018), one cannot hope to design non-trivial tests that control the Type I error while still ensuring power against interesting alternatives for all absolutely continuous distributions. Consequently, we identify various natural smoothness assumptions on the conditional distributions of $X, Y | Z = z$ as $z$ varies over the support of $Z$, and study the hardness of conditional independence testing under these smoothness assumptions. We derive matching lower and upper bounds on the critical radius of separation between the null and alternative hypotheses in the total variation metric. The tests we consider are easily implementable and rely on binning the support of the continuous variable $Z$. To complement these results, we provide a new proof of the hardness result of Shah and Peters (2018) and show that conditional independence testing remains difficult even when $X$ and $Y$ are discrete variables with finite support (not scaling with the sample size).
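To make the binning idea concrete: one can slice the support of $Z$ into equal-count bins and sum per-bin Pearson chi-squared statistics for the discrete pair $(X, Y)$. This is only a rough sketch of the idea, with invented names and settings; the talk's tests use carefully calibrated bin widths and critical values to achieve the minimax rates.

```python
import numpy as np

def binned_ci_statistic(x, y, z, n_bins=5):
    """Sum of per-bin Pearson chi-squared statistics for discrete x and y,
    after slicing the continuous conditioning variable z into equal-count
    bins (a crude, uncalibrated sketch of the binning idea)."""
    edges = np.quantile(z, np.linspace(0.0, 1.0, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, z, side="right") - 1, 0, n_bins - 1)
    stat = 0.0
    for b in range(n_bins):
        xb, yb = x[bins == b], y[bins == b]
        if xb.size == 0:
            continue
        # contingency table of (x, y) within this z-bin
        tab = np.zeros((int(x.max()) + 1, int(y.max()) + 1))
        np.add.at(tab, (xb, yb), 1.0)
        expected = tab.sum(1, keepdims=True) * tab.sum(0, keepdims=True) / tab.sum()
        ok = expected > 0
        stat += (((tab - expected) ** 2)[ok] / expected[ok]).sum()
    return stat

# toy comparison: the statistic is large under conditional dependence,
# and small (roughly chi-squared) under conditional independence
rng = np.random.default_rng(0)
n = 2000
z = rng.standard_normal(n)
x = rng.integers(0, 2, n)
y_dep = (x + (rng.random(n) < 0.1)) % 2     # y strongly tied to x
y_ind = rng.integers(0, 2, n)               # y independent of x given z
stat_dep = binned_ci_statistic(x, y_dep, z)
stat_ind = binned_ci_statistic(x, y_ind, z)
```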



Robust Q-learning
Robert Strawderman, University of Rochester

Abstract: Q-learning is a regression-based approach that is widely used to formalize the development of an optimal dynamic treatment strategy. Finite-dimensional working models are typically used to estimate certain nuisance parameters, and misspecification of these working models can result in residual confounding and/or significant efficiency loss. We propose a robust Q-learning approach that allows such nuisance parameters to be estimated using data-adaptive techniques. Methodology, asymptotics, and simulations will be summarized, highlighting the utility of the proposed methods in practice. Data from the "Extending Treatment Effectiveness of Naltrexone" multistage randomized trial will be used to illustrate the proposed methods. This is joint work with Ashkan Ertefaie.
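For readers unfamiliar with the regression view of Q-learning: in the simplest single-stage case, one fits a working outcome model Q(h, a) by least squares and treats exactly when the fitted treatment contrast is positive. The sketch below shows that baseline idea on simulated data; the outcome model, names, and settings are invented, and the talk's robust, data-adaptive nuisance estimation is not attempted here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
h = rng.standard_normal(n)               # pre-treatment covariate (history)
a = rng.integers(0, 2, n)                # randomized binary treatment
# hypothetical outcome model: treatment helps when h > 0, hurts when h < 0
y = h + a * h + 0.5 * rng.standard_normal(n)

# single-stage Q-learning with a linear working model:
#   Q(h, a) = b0 + b1*h + b2*a + b3*a*h, fit by least squares
X = np.column_stack([np.ones(n), h, a, a * h])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

def estimated_rule(h_new):
    """Treat exactly when the fitted treatment contrast b2 + b3*h is positive."""
    return int(beta[2] + beta[3] * h_new > 0)
```

Here the true optimal rule is to treat when h > 0, and the fitted rule recovers it; misspecifying the working model is precisely the failure mode the talk's robust approach guards against.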