On a L1-Test Statistic of Homogeneity
This video was recorded at the NIPS Workshop on Representations and Inference on Probability Distributions, Whistler 2007.

My presentation will be divided into two parts. First, I will present two simple and explicit procedures for testing homogeneity of two independent multivariate samples of size $n$. The nonparametric tests are based on the statistic $T_n$, which is the $L_1$ distance between the two empirical distributions restricted to a finite partition. Both tests reject the null hypothesis of homogeneity if $T_n$ becomes large, i.e., if $T_n$ exceeds a threshold. I will first discuss Chernoff-type large deviation properties of $T_n$. This results in a distribution-free, strongly consistent test of homogeneity. Then the asymptotic null distribution of the test statistic is obtained, leading to an asymptotically $\alpha$-level test procedure.

In the second part, I will consider the problem of selecting an unknown multivariate density $f$ belonging to a set of densities $\mathcal F_{k^*}$ of finite associated Vapnik-Chervonenkis dimension, where the complexity $k^*$ is unknown, and $\mathcal F_k \subset \mathcal F_{k+1}$ for all $k$. Given an i.i.d. sample of size $n$ drawn from $f$, I will show how the statistic $T_n$ can be used to build an estimate $\hat f_{K_n}$ yielding almost sure convergence of the estimated complexity $K_n$ to the true but unknown $k^*$, and with the property $\mathbf E \{\int|\hat f_{K_n}-f|\}=O(1/\sqrt{n})$. The methodology covers a wide range of density models, such as mixture models and exponential families. This talk is a summary of two papers written with B. Cadre (ENS Cachan, France), L. Devroye (McGill University, Montreal) and L. Györfi (Technical University, Budapest).
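To make the definition of $T_n$ concrete, here is a minimal sketch of how the statistic might be computed. It assumes the finite partition is a regular grid of cubic cells on $[\text{low}, \text{high}]^d$, which is only one possible choice and is not specified in the abstract. The function name `l1_partition_statistic` and its parameters are hypothetical. Points falling outside the grid are ignored, and the rejection threshold is deliberately left out because the abstract does not state it.

```python
import numpy as np

def l1_partition_statistic(x, y, bins_per_dim=4, low=0.0, high=1.0):
    """L1 distance between two empirical distributions on a grid partition.

    x, y: arrays of shape (n, d) holding the two samples.
    Assumption: the partition is a regular grid with bins_per_dim cells
    per coordinate on [low, high]^d (an illustrative choice only).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    d = x.shape[1]
    cell_range = [(low, high)] * d

    # Cell counts for each sample over the same partition.
    counts_x, _ = np.histogramdd(x, bins=bins_per_dim, range=cell_range)
    counts_y, _ = np.histogramdd(y, bins=bins_per_dim, range=cell_range)

    # Empirical measures mu_n(A_j) and nu_n(A_j) of each cell A_j.
    mu = counts_x / len(x)
    nu = counts_y / len(y)

    # T_n = sum_j |mu_n(A_j) - nu_n(A_j)|
    return float(np.abs(mu - nu).sum())


# Usage: two samples from the same distribution versus two shifted samples.
rng = np.random.default_rng(0)
same_a = rng.uniform(0.0, 1.0, size=(500, 2))
same_b = rng.uniform(0.0, 1.0, size=(500, 2))
shifted = rng.uniform(0.0, 0.5, size=(500, 2))

t_null = l1_partition_statistic(same_a, same_b)
t_alt = l1_partition_statistic(same_a, shifted)
```

Under homogeneity the value stays small, while for two samples supported on disjoint cells it reaches its maximum of 2. A test in the spirit of the abstract would compare the value against a threshold supplied by the theory.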