Material Detail

Online Reinforcement Learning from Concurrent Customer Interaction Sequences

Online Reinforcement Learning from Concurrent Customer Interaction Sequences

This video was recorded at Large-scale Online Learning and Decision Making (LSOLDM) Workshop, Cumberland Lodge 2012. This talk explores applications in which a company interacts with many customers. The company has an objective function, such as maximising revenue, customer satisfaction, or customer loyalty, which depends primarily on the sequence of interactions between company and customer. A key aspect ofthis setting is that interactions with different customers occur asynchronously and in parallel. As a result, it is imperative to learn online from partial interaction sequences, so that information acquired from one customer is efficiently assimilated and applied in subsequent interactions with other customers. I will present the first framework for reinforcement learning in this setting, using an asynchronous variant of temporal-difference learning to learn efficiently from partial interaction sequences.

Quality

  • User Rating
  • Comments
  • Learning Exercises
  • Bookmark Collections
  • Course ePortfolios
  • Accessibility Info

More about this material

Comments

Log in to participate in the discussions or sign up if you are not already a MERLOT member.