On the Stratification of Multi-Label Data

This video was recorded at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Athens 2011. Stratified sampling is a sampling method that takes into account the existence of disjoint groups within a population and produces samples in which the proportions of these groups are maintained. In single-label classification tasks, groups are differentiated based on the value of the target variable. In multi-label learning tasks, however, where there are multiple target variables, it is not clear how stratified sampling could or should be performed. This paper investigates stratification in the multi-label data context. It considers two stratification methods for multi-label data and empirically compares them, along with random sampling, on a number of datasets and against a number of evaluation criteria. The results reveal some interesting conclusions about the utility of each method for particular types of multi-label datasets.
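
To make the idea of maintaining group proportions concrete, here is a minimal Python sketch of stratified fold assignment. It is an illustration only, not the paper's method: the function name `stratified_folds` and its parameters are hypothetical, and treating each distinct label set as its own group is just one simple assumption about how the multi-label case might be handled.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign example indices to k folds while keeping group proportions.

    `labels` holds one hashable group key per example. For single-label
    data the key is the class. For multi-label data, one simple option
    (an assumption made here, not taken from the paper) is to use the
    whole label set, for example a frozenset, as the key.
    """
    # Collect the indices of the examples that belong to each group.
    groups = defaultdict(list)
    for index, label in enumerate(labels):
        groups[label].append(index)

    # Deal the members of each group out to the folds in round-robin order.
    # The counter carries over between groups so fold sizes stay balanced.
    folds = [[] for _ in range(k)]
    next_fold = 0
    for members in groups.values():
        for index in members:
            folds[next_fold].append(index)
            next_fold = (next_fold + 1) % k
    return folds


# Single-label example: six examples of class "a" and three of class "b".
single = stratified_folds(["a"] * 6 + ["b"] * 3, k=3)

# Multi-label example: each distinct label set is treated as one group.
multi = stratified_folds(
    [frozenset({"x"}), frozenset({"x", "y"}), frozenset({"x"}), frozenset({"x", "y"})],
    k=2,
)
```

In the single-label call, every fold receives two "a" examples and one "b" example, matching the 2:1 ratio of the full set. When a multi-label dataset contains many distinct label sets, the groups formed this way can become very small, which is one reason the multi-label case is less clear-cut than the single-label one.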
