Machine Learning in Health Informatics: Making Better Use of Domain Experts

This video was recorded at the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Chicago 2013.

We present novel machine learning and data mining methods that make real-world learning systems more efficient. We focus on clinical informatics, an archetypal example of a field overwhelmed with information. Because of properties inherent to clinical informatics tasks (and indeed to many tasks that require specialized domain knowledge), 'off-the-shelf' machine learning technologies generally perform poorly in this domain. If machine learning is to succeed in clinical science, novel methods must be developed to mitigate the effects of class imbalance during model induction, to exploit the wealth of domain knowledge that highly skilled experts bring to the task, and to induce better models with less effort (fewer labels).

We present new machine learning methods that address each of these issues and demonstrate their efficacy in the task of abstract screening. In particular, we develop new theoretical perspectives on class imbalance, novel methods for exploiting dual supervision (i.e., labels on both instances and features), and new active learning techniques that address issues inherent to real-world applications (e.g., exploiting multiple experts in tandem). Each of these contributions aims to squeeze better classification performance out of fewer labels, thereby making better use of domain experts' time and expertise. The immediate aim of this work is to reduce the workload involved in conducting systematic reviews. To this end, we demonstrate that the developed methods can reduce reviewer workload by more than half without sacrificing the comprehensiveness of reviews (i.e., without missing any relevant published evidence).

Abstract screening is only an example task, however. The approaches presented here apply to many real-world learning problems, namely those that require specialized expertise, exhibit class imbalance (and asymmetric costs), and have limited human resources available. We show that the methods we have developed bring substantial improvements over existing machine learning approaches in terms of inducing better models with less effort.
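The two quantities behind the workload claim above, the share of citations reviewers no longer have to read and the share of relevant citations still found, can be made concrete with a small calculation. The following is only a minimal sketch under stated assumptions: the function name `screening_summary`, its inputs, and the toy data are hypothetical and are not taken from the talk, which may define and measure these quantities differently.

```python
from typing import Dict, Sequence


def screening_summary(relevant: Sequence[bool],
                      read_by_reviewer: Sequence[bool]) -> Dict[str, float]:
    """Summarise a semi-automated screening run (hypothetical helper).

    relevant[i]         -- True if citation i is truly relevant (assumed gold label)
    read_by_reviewer[i] -- True if a human reviewer still had to read citation i
    """
    if len(relevant) != len(read_by_reviewer):
        raise ValueError("both sequences must describe the same citations")

    total = len(relevant)
    n_relevant = sum(relevant)
    n_read = sum(read_by_reviewer)
    # Relevant citations that a reviewer actually saw, i.e. that were not missed.
    n_found = sum(1 for r, s in zip(relevant, read_by_reviewer) if r and s)

    return {
        # Fraction of citations the reviewers were spared from reading.
        "workload_reduction": 1 - n_read / total if total else 0.0,
        # Fraction of relevant citations that were still found.
        "recall": n_found / n_relevant if n_relevant else 1.0,
    }


# Toy, made-up data: 10 citations, 2 of them relevant (class imbalance),
# and reviewers read only 4 of the 10.
relevant = [True, False, False, False, True, False, False, False, False, False]
read_by_reviewer = [True, True, False, False, True, True, False, False, False, False]

summary = screening_summary(relevant, read_by_reviewer)
print(summary)
```

In this toy run reviewers read four of ten citations and both relevant ones are among them, so the workload falls by more than half while recall stays complete. That is the kind of outcome the abstract describes, shown here on invented numbers only.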