Latent Variable Sparse Bayesian Models

This video was recorded at the Workshop on Sparsity in Machine Learning and Statistics, Cumberland Lodge 2009. A variety of practical approaches have recently been introduced for performing estimation and inference using linear models with sparse priors on the unknown coefficients, a process that can have wide-ranging implications in diverse areas such as model selection and compressive sensing. While not always derived or marketed as such, many of these methods can be viewed as arising from Bayesian models capitalizing on latent structure, expressible via hyperparameters, inherent in sparse distributions. Here we focus on four such strategies: (i) standard MAP estimation, (ii) hyperparameter MAP estimation, also called evidence maximization or empirical Bayes, (iii) variational Bayes using a factorial posterior, and (iv) local variational approximation using convex lower bounding. All of these approaches can be used to compute tractable posterior approximations to the underlying full distribution; however, the exact nature of these approximations is frequently unclear, so it is challenging to determine which strategy and sparse prior are appropriate. Rather than justifying such selections using the credibility of the full Bayesian model, as is sometimes done, we base evaluations on the actual underlying cost functions that emerge from each method. To this end we discuss a common, unifying objective function that encompasses all of the above and then assess its properties with respect to representative applications such as finding maximally sparse (i.e., minimal L0 quasi-norm) representations. This objective function can be expressed in either coefficient space or hyperparameter space, a duality that facilitates direct comparisons between seemingly disparate approaches and naturally leads to theoretical insights and useful optimization strategies such as reweighted L1 and L2 minimization.
This perspective also suggests extensions of the sparse linear model, including alternative likelihood functions (e.g., for classification) and more general sparse priors applicable to covariance component estimation, group selection, and the incorporation of explicit coefficient constraints (e.g., non-negativity). Several examples related to neuroimaging and compressive sensing will be considered.
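
As a rough illustration of the reweighted L2 idea mentioned in the abstract, here is a minimal sketch. It is a generic iteratively reweighted least-squares scheme for a smoothed L1 penalty under noiseless equality constraints, not the specific algorithm presented in the talk. The function name `reweighted_l2`, the smoothing constant `eps`, the iteration count, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def reweighted_l2(Phi, y, n_iter=50, eps=1e-8):
    """Hypothetical helper: approximately minimize sum_i sqrt(x_i^2 + eps)
    subject to Phi @ x = y, by repeatedly solving a weighted L2 problem."""
    x = np.linalg.pinv(Phi) @ y  # start from the minimum L2-norm solution
    for _ in range(n_iter):
        q = np.sqrt(x**2 + eps)      # inverse weights from the current estimate
        Phi_q = Phi * q              # same as Phi @ diag(q), via broadcasting
        # Closed-form weighted minimum-norm solution: Q Phi^T (Phi Q Phi^T)^{-1} y
        x = q * (Phi.T @ np.linalg.solve(Phi_q @ Phi.T, y))
    return x


# Synthetic, illustrative data: 20 measurements of a 50-dimensional sparse vector.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.5, -2.0, 0.8]
y = Phi @ x_true

x_hat = reweighted_l2(Phi, y)        # reweighted estimate
x_l2 = np.linalg.pinv(Phi) @ y       # plain minimum L2-norm baseline
print("smoothed L1, baseline:  ", np.sum(np.sqrt(x_l2**2 + 1e-8)))
print("smoothed L1, reweighted:", np.sum(np.sqrt(x_hat**2 + 1e-8)))
```

Each pass replaces the penalty with a quadratic upper bound built from the current estimate, so the smoothed L1 objective cannot increase from one iteration to the next. Coefficients that shrink receive larger weights and are pushed further toward zero, which is what favors sparse solutions. Whether the true support is recovered depends on the dictionary and the sparsity level, so the sketch makes no claim about that.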

