Professor of Statistics
The Wharton School
University of Pennsylvania
Ed George is Universal Furniture Professor and Professor in the Department of Statistics at The Wharton School at the University of Pennsylvania. He received his PhD from Stanford University in 1981. He is a Fellow of the American Statistical Association (ASA, 1997) and a Fellow of the Institute of Mathematical Statistics (IMS, 1995). He received the CBA Foundation Award for Outstanding Research Contributions in 1998 and the CBA Foundation Award for Research Excellence in 1995. His teaching awards include the Excellence in Education Award (2001) and the Joe D. Beasley Award for Teaching Excellence (1996), The University of Texas at Austin; and the McKinsey Award for Excellence in Teaching (1987) and the Emory Williams Award for Excellence in Teaching (1987), The University of Chicago. He was the Executive Editor of Statistical Science from 2004 to 2007.
(November 29, 2012)
Discovering Regression Structure with a Bayesian Ensemble
A Bayesian ensemble can be used to discover and learn about the regression relationship between a variable of interest y and a vector of p potential predictor variables x. The basic idea is to model the conditional distribution of y given x by a sum of random basis elements plus a flexible noise distribution. In particular, I will focus on a Bayesian ensemble approach called BART (Bayesian Additive Regression Trees). Built on a basis of random regression trees, BART produces a predictive distribution for y at any x (in or out of sample) that automatically adjusts for the uncertainty at each such x. It can do this for nonlinear relationships, even those hidden within a large number of irrelevant predictors. Further, BART opens up a novel approach to model-free variable selection. Ultimately, the information provided by such a Bayesian ensemble may be seen as a valuable first step towards model building for high-dimensional data. (This is joint work with H. Chipman and R. McCulloch.)
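For orientation, the "sum of random basis elements plus noise" idea can be written in the sum-of-trees form commonly used for BART. The notation below (m trees, tree structures T_j, leaf parameters M_j, and Gaussian errors) is an assumption taken from the standard formulation, not from this abstract; the "flexible noise distribution" mentioned above may be more general than the normal errors shown here.

```latex
% Sum-of-trees sketch (notation assumed, not taken from the abstract)
y = \sum_{j=1}^{m} g(x;\, T_j, M_j) + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^2)
% g(x; T_j, M_j): the leaf value that regression tree T_j, with leaf
% parameters M_j, assigns to the predictor vector x.
```

Each tree contributes a small piecewise-constant piece of the fit, and the posterior over all (T_j, M_j) and sigma is what yields a predictive distribution at any x.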
(November 30, 2012)
EMVS: The EM Approach to Bayesian Variable Selection
Despite rapid developments in stochastic search algorithms, the practicality of Bayesian variable selection methods has continued to pose challenges. High-dimensional data are now routinely analyzed, typically with many more covariates than observations. To broaden the applicability of Bayesian variable selection for such high-dimensional linear regression contexts, we propose EMVS, a deterministic alternative to stochastic search based on an EM algorithm which exploits a conjugate mixture prior formulation to quickly find posterior modes. Combining a spike-and-slab regularization diagram for the discovery of active predictor sets with subsequent rigorous evaluation of posterior model probabilities, EMVS rapidly identifies promising sparse high posterior probability submodels. External structural information such as likely covariate groupings or network topologies is easily incorporated into the EMVS framework. Deterministic annealing variants are seen to improve the effectiveness of our algorithms by mitigating the posterior multi-modality associated with variable selection priors. The usefulness of the EMVS approach is demonstrated on real high-dimensional data, where computational complexity renders stochastic search less practical. (This is joint work with Veronika Rockova.)
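To make the EM idea concrete, here is a minimal Python sketch of an EM iteration for a continuous spike-and-slab linear model. It is an illustration under stated assumptions, not the authors' implementation: the function name `emvs_sketch`, the Gaussian spike and slab variances `v0` and `v1`, the Beta(`a`, `b`) prior on the inclusion weight, the inverse-gamma hyperparameters `nu` and `lam`, and all default values are choices made for this example. The update formulas are derived from those assumed priors and may differ in constants from the published algorithm. The sketch omits the regularization diagram, the evaluation of posterior model probabilities, structured priors, and deterministic annealing.

```python
import numpy as np


def emvs_sketch(X, y, v0=0.01, v1=10.0, a=1.0, b=1.0, nu=1.0, lam=1.0,
                n_iter=100, tol=1e-8):
    """EM search for a posterior mode under an assumed spike-and-slab prior.

    Assumed model (illustrative only):
      y ~ N(X beta, sigma2 I)
      beta_i | gamma_i ~ N(0, sigma2 * (v1 if gamma_i else v0))
      gamma_i ~ Bernoulli(theta), theta ~ Beta(a, b)
      sigma2 ~ Inverse-Gamma(nu / 2, nu * lam / 2)
    Requires a + b + p > 2 so the theta update is well defined.
    """
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    p_star = np.full(p, 0.5)   # initial inclusion probabilities
    beta = np.zeros(p)
    sigma2, theta = 1.0, 0.5
    for _ in range(n_iter):
        # M-step: ridge-type update with coefficient-specific penalties
        # d_star_i = E[1 / prior variance multiplier of beta_i].
        d_star = (1.0 - p_star) / v0 + p_star / v1
        beta_new = np.linalg.solve(XtX + np.diag(d_star), Xty)
        resid = y - X @ beta_new
        sigma2 = (resid @ resid + d_star @ beta_new**2 + nu * lam) / (n + p + nu + 2)
        theta = (p_star.sum() + a - 1.0) / (a + b + p - 2.0)
        theta = float(np.clip(theta, 1e-6, 1.0 - 1e-6))  # keep logs finite

        # E-step: posterior inclusion probability of each coefficient,
        # computed on the log-odds scale for numerical stability.
        log_odds = (np.log(theta) - np.log1p(-theta)
                    + 0.5 * np.log(v0 / v1)
                    + beta_new**2 / (2.0 * sigma2) * (1.0 / v0 - 1.0 / v1))
        p_star = 1.0 / (1.0 + np.exp(-log_odds))

        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, p_star, sigma2, theta
```

Calling `emvs_sketch(X, y)` on a design matrix `X` and response `y` returns the modal coefficient estimates together with conditional inclusion probabilities; coefficients with probability above 0.5 form a candidate submodel. Because each iteration is a single penalized least-squares solve, the search is deterministic, in contrast to stochastic search over models.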