Fall 2019
Seminars are held from 4:00 p.m. – 5:00 p.m. in Griffin-Floyd Hall 100 unless otherwise noted. Refreshments are available before the seminars from 3:30 p.m. – 4:00 p.m. in Griffin-Floyd Hall 103.
Date    Speaker    Title

Sep 5   Bikas Sinha (Indian Statistical Institute)
        Reliability Estimation in Exponential Populations

Sep 26  Yuexiao Dong (Temple University)
        On dual model-free variable selection with two groups of variables

Oct 17  Yiyuan She (Florida State University)
        Supervised Clustering with Low Rank

Nov 14  Matey Neykov (Carnegie Mellon University)
        Minimax Optimal Conditional Independence Testing

Nov 21  Robert Strawderman (University of Rochester)
        Robust Q-learning
Abstracts
Reliability Estimation in Exponential Populations
Bikas Sinha, Indian Statistical Institute
Abstract: We confine attention to the simple exponential population as the life distribution of a given product, and to point estimation of the reliability R(t) = Pr[X > t] at a given/specified time point t > 0. We adopt the unbiasedness criterion in the exact sense, requiring E[R̂(t)] = R(t). We discuss the theory underlying the development of the subject matter. The computations are routine and are illustrated with a few examples.
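As an illustrative aside (not part of the talk): for i.i.d. exponential lifetimes with unknown mean θ, R(t) = exp(−t/θ), and the classical exactly unbiased (UMVUE) estimator based on the complete sufficient statistic T = ΣXᵢ is (1 − t/T)ⁿ⁻¹ when T > t and 0 otherwise. A minimal sketch:

```python
def umvue_reliability(data, t):
    """Exactly unbiased (UMVUE) estimator of R(t) = exp(-t/theta)
    for i.i.d. exponential lifetimes with unknown mean theta.
    Based on the complete sufficient statistic T = sum(data)."""
    n = len(data)
    T = sum(data)
    if T <= t:
        return 0.0
    return (1.0 - t / T) ** (n - 1)

# Example: n = 3 observations with total T = 10, time point t = 1
est = umvue_reliability([2.0, 3.0, 5.0], t=1.0)  # (1 - 0.1)^2 = 0.81
```

Unbiasedness here is exact, not asymptotic: averaging this estimator over repeated samples recovers R(t) for every fixed t and sample size.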


On dual model-free variable selection with two groups of variables
Yuexiao Dong, Temple University
Abstract: In the presence of two groups of variables, existing model-free variable selection methods only reduce the dimensionality of the predictors. We extend the popular marginal coordinate hypotheses of Cook (2004) in the sufficient dimension reduction literature and consider the dual marginal coordinate hypotheses, where the roles of the predictor and the response are interchangeable. Motivated by canonical correlation analysis (CCA), we propose a CCA-based test for the dual marginal coordinate hypotheses, and devise a joint backward selection algorithm for dual model-free variable selection. The performance of the proposed test and of the variable selection procedure is evaluated through synthetic examples and a real data analysis.
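As a hedged aside (this is not the authors' test, only the CCA building block the abstract mentions): the canonical correlations between two centered groups of variables can be read off the singular values of the product of their orthonormalized design matrices.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between two groups of variables.
    Standard computation: column-center each block, take thin QR
    factorizations, and read the correlations off the singular
    values of Qx' Qy. Assumes n > max(p, q) and full-rank blocks."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = X @ rng.normal(size=(3, 2))  # Y is an exact linear map of X
rho = canonical_correlations(X, Y)  # leading correlations equal 1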
Yiyuan She, Florida State University
Modern clustering applications are often faced with challenges from high dimensionality, nonconvexity and parameter tuning. This paper gives a mathematical formulation of lowrank supervised clustering and can automatically group the predictors in building a multivariate predictive model. By use of linearization and block coordinate descent, a simpletoimplement algorithm is developed, which performs subspace learning and clustering iteratively with guaranteed convergence. We show a tight error bound of the proposed method, study its minimax optimality, and propose a new information criterion for parameter tuning, all with distinctive rates from the large body of literature based on sparsity. Extensive simulations and realdata experiments demonstrate the excellent performance of rankconstrained inherent clustering. 
Matey Neykov, Carnegie Mellon University
We consider the problem of conditional independence testing of $X$ and $Y$ given $Z$ where $X,Y,Z$ are three real random variables and $Z$ is continuous. We focus on two main cases — when $X$ and $Y$ are both discrete, and when $X$ and $Y$ are both continuous. In view of recent results on conditional independence testing (Shah and Peters, 2018), one cannot hope to design nontrivial tests, which control the Type I error while still ensuring power against interesting alternatives, for all absolutely continuous distributions. Consequently, we identify various, natural smoothness assumptions on the conditional distributions of $X,YZ=z$ as $z$ varies in the support of $Z$, and study the hardness of conditional independence testing under these smoothness assumptions. We derive matching lower and upper bounds on the critical radius of separation between the null and alternative hypothesis in the total variation metric. The tests we consider are easily implementable and rely on binning the support of the continuous variable $Z$. To complement these results, we provide a new proof of the hardness result of (Shah and Peters, 2018) and show that conditional independence testing remains difficult even when $X,Y$ are discrete variables of finite (and not scaling with the samplesize) support.

Robert Strawderman, University of Rochester
Abstract: Qlearning is a regressionbased approach that is widely used to formalize the development of an optimal dynamic treatment strategy. Finite dimensional working models are typically used to estimate certain nuisance parameters, and misspecification of these working models can result in residual confounding and/or significant efficiency loss. We propose a robust Qlearning approach which allows estimating such nuisance parameters using dataadaptive techniques. Methodology, asymptotics and simulations will be summarized and highlight the utility of the proposed methods in practice. Data from the “Extending Treatment Effectiveness of Naltrexone” multistage randomized trial will be used to illustrate the proposed methods. This is joint work with Ashkan Ertefaie. 