Material Detail

BErMin: A Model Selection Algorithm for Reinforcement Learning Problems

BErMin: A Model Selection Algorithm for Reinforcement Learning Problems

This video was recorded at NIPS Workshops, Sierra Nevada 2011. We consider the problem of model selection in the batch (offline, non-interactive) reinforcement learning setting when the goal is to find an action-value function with the smallest Bellman error among a countable set of candidate functions. We propose a complexity regularization-based model selection algorithm, BErMin, and prove that it enjoys an oracle-like property: the estimator's error differs from that of an oracle, who selects the candidate with the minimum Bellman error, by only a constant factor and a small remainder term that vanishes at a parametric rate as the number of samples increases.

Quality

  • User Rating
  • Comments
  • Learning Exercises
  • Bookmark Collections
  • Course ePortfolios
  • Accessibility Info

More about this material

Comments

Log in to participate in the discussions or sign up if you are not already a MERLOT member.