Statistics Seminar - Non-sparse methods for out-of-sample prediction in high-dimensional linear models


TITLE: Non-sparse methods for out-of-sample prediction in high-dimensional linear models

SPEAKER: Dr. Lee Dicker

ABSTRACT: Motivated by questions about dense (non-sparse) signals in high-dimensional data analysis, we study the unconditional out-of-sample prediction error (predictive risk) associated with three classes of dense estimators for high-dimensional linear models: ridge regression estimators, scalar multiples of the ordinary least squares estimator (which we refer to as James-Stein estimators), and marginal regression estimators. Our results require no assumptions about sparsity and imply that in linear models where the number of predictors is roughly proportional to the number of observations: (i) if the population predictor covariance is known (or if a norm-consistent estimator is available), then the ridge estimator outperforms the James-Stein estimator; (ii) both the ridge and James-Stein estimators outperform the ordinary least squares estimator, and the improvements offered by these estimators are especially significant when the signal-to-noise ratio is small; and (iii) the marginal estimator has serious deficiencies for out-of-sample prediction. We derive new closed-form expressions for the asymptotic predictive risk of the estimators, which allow us to precisely quantify the previous claims. Additionally, minimax ridge and James-Stein estimators are identified. Finally, we argue that the ridge estimator is, in fact, asymptotically optimal among dense estimators for out-of-sample prediction in high-dimensional linear models.

CONTACT: Lee Dicker <ldicker@stat.rutgers.edu>
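The regime the abstract describes can be sketched with a small simulation: a dense signal, p roughly proportional to n, and out-of-sample error measured on fresh data. This is a minimal illustration, not the talk's analysis; the ridge penalty below is the Bayes-motivated choice for the particular dense prior used here, and the scalar shrinkage uses the true coefficient vector as an oracle (the James-Stein-type estimators in the talk do not), purely to illustrate why scaled OLS can beat OLS when the signal-to-noise ratio is small.

```python
import numpy as np

# Illustrative simulation (assumptions, not the talk's setup): compare
# out-of-sample prediction error of OLS, ridge, and a scalar-shrinkage
# ("James-Stein style") estimator with a dense signal and p/n = 0.5.
rng = np.random.default_rng(0)
n, p, n_test = 200, 100, 5000            # p proportional to n
beta = rng.normal(size=p) / np.sqrt(p)   # dense signal, ||beta||^2 ~ 1 (SNR ~ 1)
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)        # noise variance 1

# Ordinary least squares
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Ridge regression; lam = p is the Bayes-optimal penalty sigma^2/tau^2 for
# the prior beta_j ~ N(0, 1/p) used above -- an illustrative choice only
lam = p
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Oracle scalar shrinkage of OLS (uses the true beta, for illustration only)
c = (beta_ols @ beta) / (beta_ols @ beta_ols)
beta_js = c * beta_ols

# Unconditional out-of-sample (predictive) error on fresh draws
X_test = rng.normal(size=(n_test, p))
y_test = X_test @ beta + rng.normal(size=n_test)
def err(b):
    return np.mean((y_test - X_test @ b) ** 2)

err_ols, err_ridge, err_js = err(beta_ols), err(beta_ridge), err(beta_js)
print(f"OLS: {err_ols:.3f}  ridge: {err_ridge:.3f}  shrunk OLS: {err_js:.3f}")
```

With these settings both the ridge and the shrunken-OLS fits predict better out of sample than plain OLS, in line with claim (ii) of the abstract; the gap widens as the signal-to-noise ratio shrinks.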


  • Created By: Anita Race
  • Created: 01/09/2012
  • Modified By: Fletcher Moore
  • Modified: 10/07/2016


