Robust PCA and Collaborative Filtering: Rejecting Outliers, Identifying Manipulators
This video was recorded at NIPS Workshops, Whistler 2010. Principal Component Analysis (PCA) is one of the most widely used techniques for dimensionality reduction. Nevertheless, it is plagued by sensitivity to outliers; finding robust analogs, particularly for high-dimensional data, is critical. We discuss the challenges posed by the high-dimensional setting, where the dimensionality is of the same order as, or greater than, the number of samples. We detail why existing techniques fail in this setting (indeed, no existing algorithm provides provable bounds for any constant fraction of outliers) and then present two very different algorithms for High Dimensional Robust PCA. Our first algorithm achieves a breakdown point of 50%, the best possible for any algorithm and a stark improvement over the previous best-known result of 0%. Our second algorithm is based on ideas from convex optimization and, in addition to recovering the principal components, is also able to identify the corrupted points. We carry this over to the partially observed setting, significantly extending matrix completion results to handle corrupted rows or columns.
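To make the sensitivity claim concrete, here is a minimal sketch of how a single corrupted sample can redirect the leading principal component of ordinary (non-robust) PCA when the dimension exceeds the number of samples. It is not either of the algorithms presented in the talk. The data model, the sizes n and p, the outlier magnitude, and the helper names top_principal_direction and angle_deg are all assumptions chosen for illustration.

```python
import numpy as np


def top_principal_direction(X):
    """Leading principal direction (unit vector) of the rows of X, via SVD."""
    Xc = X - X.mean(axis=0)  # center each coordinate
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]


def angle_deg(u, v):
    """Angle in degrees between the lines spanned by u and v (sign-invariant)."""
    c = abs(float(u @ v)) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(min(1.0, c))))


rng = np.random.default_rng(0)
n, p = 50, 100  # assumed sizes: more dimensions than samples

# Assumed inlier model: strong variance along the first axis, small noise elsewhere.
true_dir = np.zeros(p)
true_dir[0] = 1.0
X = 10.0 * rng.standard_normal((n, 1)) * true_dir + 0.1 * rng.standard_normal((n, p))

clean_angle = angle_deg(top_principal_direction(X), true_dir)

# Corrupt exactly one sample with a large value along an unrelated axis.
X_bad = X.copy()
X_bad[0] = 0.0
X_bad[0, 1] = 1e4
corrupt_angle = angle_deg(top_principal_direction(X_bad), true_dir)

print(f"angle to true direction, clean data:  {clean_angle:.2f} degrees")
print(f"angle to true direction, one outlier: {corrupt_angle:.2f} degrees")
```

On the clean data the estimated direction should stay within a few degrees of the true one, while the single outlier pulls it nearly orthogonal. This is the sense in which ordinary PCA has a breakdown point of 0%.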