Poster: Knowledge as a Constraint on Uncertainty for Unsupervised Classification: A Study in Part-of-Speech Tagging

This video was recorded at the 25th International Conference on Machine Learning (ICML), Helsinki 2008. This paper evaluates the use of prior knowledge to limit or bias the choices of a classifier during otherwise unsupervised training and classification. Focusing on effects in the uncertainty of the model's decisions, we quantify the contribution of a knowledge source as a reduction in the conditional entropy of the label distribution given the input corpus. This measure lets us compare different sets of knowledge without annotated data, and we find that label entropy is highly predictive of final performance for a standard Hidden Markov Model (HMM) on the task of part-of-speech tagging. Our results also show that even basic levels of knowledge, integrated as labeling constraints, have a considerable effect on classification accuracy and lead to more stable and efficient training convergence. Finally, for cases where the model's internal classes need to be interpreted and mapped to a desired label set, we find that constrained models greatly reduce the amount of annotated data required to make quality assignments.
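As a rough illustration of the entropy measure described above, the following toy sketch compares average per-token label entropy with and without labeling constraints. It is not the authors' implementation: the tag set, the three-word corpus, the `allowed` tag dictionary, and the use of uniform distributions in place of HMM posteriors are all assumptions made for the example.

```python
import math

def label_entropy(distributions):
    """Average per-token entropy, in bits, of a list of label distributions."""
    total = 0.0
    for dist in distributions:
        total -= sum(p * math.log2(p) for p in dist.values() if p > 0)
    return total / len(distributions)

def uniform_over(tags):
    """Uniform distribution over the given tags (stand-in for a model posterior)."""
    return {t: 1.0 / len(tags) for t in tags}

# Hypothetical tag set, corpus, and tag dictionary (assumptions for illustration).
TAGS = ["DET", "NOUN", "VERB", "ADJ"]
corpus = ["the", "dog", "runs"]
allowed = {"the": ["DET"], "dog": ["NOUN", "VERB"], "runs": ["NOUN", "VERB"]}

# Without knowledge, every tag is possible for every token.
unconstrained = [uniform_over(TAGS) for _ in corpus]
# With knowledge, each token is restricted to the tags its dictionary entry allows.
constrained = [uniform_over(allowed.get(w, TAGS)) for w in corpus]

h_free = label_entropy(unconstrained)
h_con = label_entropy(constrained)
print(f"unconstrained: {h_free:.3f} bits, constrained: {h_con:.3f} bits")
```

In this toy setting the constraints lower the average label entropy without any annotated data being consulted, which is the kind of comparison the abstract describes. A real study would take the distributions from the trained model rather than assuming them uniform.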
