CSE Seminar: "Kernel Nonparametric Tests of Homogeneity, Independence and Multi-Variable Interaction" By: Arthur Gretton

Title:

Kernel Nonparametric Tests of Homogeneity, Independence and Multi-Variable Interaction

Abstract:

We consider three nonparametric hypothesis testing problems: (1) given samples from distributions p and q, a homogeneity test determines whether to accept or reject p = q; (2) given a joint distribution P_xy over random variables x and y, an independence test investigates whether P_xy = P_x P_y; (3) given a joint distribution over several variables, we may test whether a factorization exists (e.g., P_xyz = P_xy P_z or, for the case of total independence, P_xyz = P_x P_y P_z). The final test (3) is of particular interest in fitting directed graphical models, as it may be used to detect cases where two independent causes individually have weak influence on a third dependent variable, but their combined effect has a strong influence, even when these variables have high dimension.

We present nonparametric tests for the three cases described. The test statistics are distances between embeddings of probability measures into reproducing kernel Hilbert spaces (RKHS); for independence, for example, the distance is between the embedding of the joint distribution and that of the product of the marginals. The tests benefit from decades of machine learning research on kernels for various domains, and thus apply to distributions on high dimensional vectors, images, strings, graphs, groups, and semigroups, among others. The energy distance and distance covariance statistics are particular instances of these RKHS statistics. Finally, the tests can be applied to time series data, using a wild bootstrap procedure to approximate the distribution of the test statistic under the null hypothesis.
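To make the homogeneity case (1) concrete, the following is a minimal sketch of a kernel two-sample test, not the speaker's implementation. It assumes a Gaussian kernel with a fixed bandwidth, uses the biased estimate of the squared RKHS distance between the two empirical embeddings (often called the maximum mean discrepancy, MMD), and calibrates the test by permuting the pooled sample. The function names, the bandwidth, and the permutation scheme are illustrative choices that do not come from the abstract; in particular, permutation is only appropriate for i.i.d. samples and is not the wild bootstrap mentioned for time series.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(k, n):
    """Biased squared MMD from a pooled kernel matrix.

    The first n rows/columns of k belong to the first sample,
    the remaining ones to the second sample.
    """
    kxx = k[:n, :n]
    kyy = k[n:, n:]
    kxy = k[:n, n:]
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()


def mmd_permutation_test(x, y, bandwidth=1.0, num_permutations=500, seed=0):
    """Return (statistic, p_value) for the null hypothesis p = q.

    Assumption: both samples are i.i.d., so relabelling the pooled
    points is a valid way to simulate the null distribution.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n = len(x)
    k = gaussian_kernel(pooled, pooled, bandwidth)
    observed = mmd2_biased(k, n)

    exceed = 0
    for _ in range(num_permutations):
        perm = rng.permutation(len(pooled))
        # Permuting rows and columns together relabels the pooled points.
        if mmd2_biased(k[np.ix_(perm, perm)], n) >= observed:
            exceed += 1

    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (num_permutations + 1)
    return observed, p_value
```

Calling `mmd_permutation_test(x, y)` on two arrays of shape `(n, d)` and `(m, d)` returns the statistic and a permutation p-value; a small p-value is evidence against p = q. In practice the bandwidth is usually chosen from the data (a median-distance heuristic is common), which this sketch leaves to the caller.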

Bio:

Arthur Gretton is a Reader (Associate Professor) with the Gatsby Computational Neuroscience Unit, CSML, UCL, which he joined in 2010. He received degrees in physics and systems engineering from the Australian National University, and a PhD with Microsoft Research and the Signal Processing and Communications Laboratory at the University of Cambridge. He worked from 2002 to 2012 at the MPI for Biological Cybernetics, and from 2009 to 2010 at the Machine Learning Department, Carnegie Mellon University. Arthur's research interests include machine learning, kernel methods, statistical learning theory, nonparametric hypothesis testing, blind source separation, Gaussian processes, and nonparametric techniques for neural data analysis. He was an associate editor at IEEE Transactions on Pattern Analysis and Machine Intelligence from 2009 to 2013, has been an Action Editor for JMLR since April 2013, and was a member of the NIPS Program Committee in 2008 and 2009, an Area Chair for ICML in 2011 and 2012, and a member of the COLT Program Committee in 2013.

Status

  • Workflow Status: Published
  • Created By: Birney Robert
  • Created: 12/02/2014
  • Modified By: Fletcher Moore
  • Modified: 10/07/2016

Target Audience