Material Detail
Bandit Algorithms for Online Linear Optimization
This video was recorded at 26th International Conference on Machine Learning (ICML), Montreal 2009. In the online linear optimization problem a forecaster chooses, at each time instance, a vector x from a certain given subset S of the D-dimensional Euclidean space and suffers a time-dependent loss that is linear in x. The goal of the forecaster is to achieve that, in the long run, the accumulated loss is not much larger than that of the best possible vector in S. In this talk we consider the "bandit" setting of the linear optimization problem, in which the forecaster has only access to the losses of the chosen vectors. We survey some recent algorithms that solve this problem. For the special case in which S is a subset of the d-dimensional Boolean hypercube, we describe a new forecaster whose performance, for a variety of concrete choices of S, is better than all previously known bounds, and not improvable in general. We also point out that computationally efficient implementations for various interesting choices of S exist. Joint work with Gabor Lugosi (Barcelona).
Quality
- User Rating
- Comments
- Learning Exercises
- Bookmark Collections
- Course ePortfolios
- Accessibility Info