Stumping along a summary

This video was recorded at the Workshop on On-line Trading of Exploration and Exploitation 2, Washington 2011. The methods we used to compete in the "Exploration & Exploitation" challenge are based on three layers. The first layer provides an online summary of the data stream for continuous and nominal data. Continuous data are handled with the Greenwald-Khanna online quantile summary, which provides error guarantees for a fixed memory size. Nominal data are summarized with a hash-based counting structure. With these techniques we built an accurate stream summary with a small memory footprint. The second layer uses the summary to build predictors. We explored several kinds of trees, from simple decision stumps to deep multivariate ones. The stumps proved to be remarkably stable and efficient. On the other hand, progressively unfolding the trees seemed to improve the model in the long run. For the last layer, we explored several combination strategies: online bagging, exponential weighting, a linear ranker, etc. We observed a tradeoff between the expressiveness of the predictors and the power of the combination strategy, but because most strategies were difficult to tune, we went back to simple averaging. Our experiments suggest that both the need for exploration and the scarcity of clicks sharpen the need for very stable models.
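To make the three layers concrete, here is a minimal Python sketch for nominal features only. It is not the authors' implementation. The class names (NominalStump, AveragedStumps), the bucket count, and the smoothing prior are hypothetical choices made for illustration. Python's built-in hash with a fixed number of buckets stands in for the hash-based counting structure, which the abstract does not specify, and the Greenwald-Khanna summary for continuous features is left out.

```python
class NominalStump:
    """Single-feature predictor: smoothed click rate per hashed feature value.

    The per-bucket counters act as the stream summary (layer 1); the
    click-rate estimate read from them is the stump (layer 2).
    """

    def __init__(self, feature, n_buckets=1024, prior_clicks=1.0, prior_views=20.0):
        self.feature = feature
        self.n_buckets = n_buckets
        # Assumed smoothing prior: keeps estimates stable when clicks are scarce.
        self.prior_clicks = prior_clicks
        self.prior_views = prior_views
        self.clicks = [0] * n_buckets
        self.views = [0] * n_buckets

    def _bucket(self, value):
        # Fixed memory: distinct values may collide in the same bucket.
        return hash(value) % self.n_buckets

    def update(self, x, clicked):
        b = self._bucket(x[self.feature])
        self.views[b] += 1
        self.clicks[b] += int(clicked)

    def predict(self, x):
        b = self._bucket(x[self.feature])
        return (self.clicks[b] + self.prior_clicks) / (self.views[b] + self.prior_views)


class AveragedStumps:
    """Layer 3: unweighted average of one stump per feature."""

    def __init__(self, features):
        self.stumps = [NominalStump(f) for f in features]

    def update(self, x, clicked):
        for stump in self.stumps:
            stump.update(x, clicked)

    def predict(self, x):
        return sum(s.predict(x) for s in self.stumps) / len(self.stumps)


# Toy stream: ad 1 is always clicked, ad 2 never is.
model = AveragedStumps(["ad", "hour"])
for _ in range(100):
    model.update({"ad": 1, "hour": 0}, clicked=True)
    model.update({"ad": 2, "hour": 0}, clicked=False)

score_ad1 = model.predict({"ad": 1, "hour": 0})
score_ad2 = model.predict({"ad": 2, "hour": 0})
print(score_ad1 > score_ad2)
```

Simple averaging has no weights to tune, which matches the abstract's remark that the more powerful combination strategies were hard to tune in this setting.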
