Material Detail

Learning When to Stop Thinking and Do Something!

Learning When to Stop Thinking and Do Something!

This video was recorded at 26th International Conference on Machine Learning (ICML), Montreal 2009. An anytime algorithm is capable of returning a response to the given task at essentially any time; typically the quality of the response improves as the time increases. Here, we consider the challenge of learning when we should terminate such algorithms on each of a sequence of iid tasks, to optimize the expected average reward per unit time. We provide an algorithm for answering this question. We combine the global optimizer Cross Entropy method and the local gradient ascent, and theoretically investigate how far the estimated gradient is from the true gradient. We empirically demonstrate the applicability of the proposed algorithm on a toy problem, as well as on a real-world face detection task.

Quality

  • User Rating
  • Comments
  • Learning Exercises
  • Bookmark Collections
  • Course ePortfolios
  • Accessibility Info

More about this material

Comments

Log in to participate in the discussions or sign up if you are not already a MERLOT member.