Material Detail

Query Clustering based on Bid Landscape for Sponsored Search Auction Optimization

This video was recorded at 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Chicago 2013. In sponsored search auctions, the auctioneer operates the marketplace by setting a number of auction parameters such as reserve prices for the task of auction optimization. The auction parameters may be set for each individual keyword, but the optimization problem becomes intractable since the number of keywords is in the millions. To reduce the dimensionality and generalize well, one wishes to cluster keywords or queries into meaningful groups, and set parameters at the keyword-cluster level. For auction optimization, keywords shall be deemed as interchangeable commodities with respect to their valuations from advertisers, represented as bid distributions or landscapes. Clustering keywords for auction optimization shall thus be based on their bid distributions. In this paper we present a formalism of clustering probability distributions, and its application to query clustering where each query is represented as a probability density of click-through rate (CTR) weighted bid and distortion is measured by KL divergence. We first derive a k-means variant for clustering Gaussian densities, which have a closed-form KL divergence. We then develop an algorithm for clustering Gaussian mixture densities, which generalize a single Gaussian and are typically a more realistic parametric assumption for real-world data. The KL divergence between Gaussian mixture densities is no longer analytically tractable; hence we derive a variational EM algorithm that minimizes an upper bound of the total within-cluster KL divergence. The clustering algorithm has been deployed successfully into production, yielding significant improvement in revenue and clicks over the existing production system. While motivated by the specific setting of query clustering, the proposed clustering method is generally applicable to many real-world applications where an example is better characterized by a distribution than a finite-dimensional feature vector in Euclidean space as in the classical k-means.

Keywords:: videolectures, ocwc, oec

Disciplines:

Science and Technology / Computer Science

More...

Go to Material

Bookmark / Add to Course ePortfolio

Create a Learning Exercise

Add Accessibility Information

Rate

Add a Comment

Quality

User Rating
Comments
Learning Exercises
Bookmark Collections
Course ePortfolios
Accessibility Info

Report Broken Link
Report as Inappropriate

More about this material

Material Type:: Presentation
Date Added to MERLOT:: February 10, 2015
Date Modified in MERLOT:: February 10, 2015
Author:: Ye Chen, Microsoft
Submitter:: The Open Education Consortium
Primary Audience:: College General Ed, College Lower Division, College Upper Division
Technical Format:: Video

Mobile Compatibility:: Not specified at this time
Language:: English
Cost Involved:: No
Source Code Available:: No
Creative Commons:: This work is licensed under a Attribution-NonCommercial-NoDerivs 3.0 United States