Machine Learning in Health Informatics: Making Better Use of Domain Experts

This video was recorded at the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Chicago 2013. We present novel machine learning and data mining methods that make real-world learning systems more efficient. We focus on the domain of clinical informatics, an archetypical example of a field overwhelmed with information. Due to properties inherent to clinical informatics tasks – and indeed, to many tasks that require specialized domain knowledge – 'off-the-shelf' machine learning technologies generally perform poorly in this domain. If machine learning is to be successful in clinical science, novel methods must be developed to: mitigate the effects of class imbalance during model induction; exploit the wealth of domain knowledge that highly skilled domain experts bring to the task; and induce better models with less effort (fewer labels). We present new machine learning methods that address each of these issues, and demonstrate their efficacy in the task of abstract screening. In particular, we develop new theoretical perspectives on class imbalance, novel methods for exploiting dual supervision (i.e., labels on both instances and features), and new active learning techniques that address issues inherent to real-world applications (e.g., exploiting multiple experts in tandem). Each of these contributions aims to squeeze better classification performance out of fewer labels, thereby making better use of domain experts' time and expertise. The immediate aim in this work is to reduce the workload involved in conducting systematic reviews, and to this end we demonstrate that the developed methods can reduce reviewer workload by more than half, without sacrificing the comprehensiveness of reviews (i.e., without missing any relevant published evidence). But this is only an exemplary task; the approaches presented here have wider application to many real-world learning problems, i.e., those that require specialized expertise, exhibit class imbalance (and asymmetric costs), and for which limited human resources are available. We show that the methods we have developed bring substantial improvements over previously existing machine learning approaches in terms of inducing better models with less effort.
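To make two of the ideas named in the abstract concrete, here is a minimal Python sketch of generic textbook baselines: inverse-frequency class weighting for imbalanced labels, and uncertainty sampling for choosing which abstracts to send to an expert next. These are not the methods presented in the talk; the function names, labels, and scores are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes (e.g. relevant abstracts) receive larger weights.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def uncertainty_sample(probabilities, batch_size):
    """Return indices of the items whose P(relevant) is closest to 0.5."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:batch_size]


# Hypothetical screening labels: 1 = relevant abstract, 0 = irrelevant.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
weights = inverse_frequency_weights(labels)

# Hypothetical model scores P(relevant) for five unlabeled abstracts.
scores = [0.05, 0.48, 0.90, 0.55, 0.20]
to_label = uncertainty_sample(scores, 2)

print(weights)
print(to_label)
```

In this toy setup the rare "relevant" class is weighted more heavily than the common class, and the two abstracts the hypothetical model is least sure about are the ones queued for expert labeling.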

Keywords:: videolectures, ocwc, oec

Disciplines:

Science and Technology / Computer Science

More about this material

Material Type:: Presentation
Date Added to MERLOT:: February 10, 2015
Date Modified in MERLOT:: February 10, 2015
Author:: Byron C. Wallace, Brown Laboratory for Linguistic Information Processing, Brown University
Submitter:: The Open Education Consortium
Primary Audience:: College General Ed, College Lower Division, College Upper Division
Technical Format:: Video

Mobile Compatibility:: Not specified at this time
Language:: English
Cost Involved:: No
Source Code Available:: No
Creative Commons:: This work is licensed under an Attribution-NonCommercial-NoDerivs 3.0 United States license