CSE 291 Winter 2011: Topics in Online Learning and Bandit Problems

Time and Venue:

Lectures: MW 5:00-6:20 pm, Center 204
Office Hours: Thursdays, 3-5pm, in EBU3B 4234


Kamalika Chaudhuri
Email: kamalika at cs dot ucsd dot edu


CSE 291 is a graduate seminar class, and the focus of this quarter is Online Learning and Bandit Problems. A prerequisite for this class is previous background on either machine learning or graduate-level algorithms. This is a four unit class, and the work consists of oral presentations.

The first few lectures will consist of an overview of some basic material on online learning. From January 24, each class meeting, a student will give a 60 minute presentation on a recent technical paper. The remaining 20 minutes will be spent on questions and discussions. The schedule of paper presentations along with dates is in the table below.

Presentation Procedure:

The deadlines below are all hard deadlines.

Lecture Schedule:

Date Topic Speaker Slides/Notes
Jan 3 Administrivia. Introduction to Online Learning Kamalika
Jan 5 The Hedge algorithm and its Analysis Kamalika
Jan 10 Online Convex Optimization. The Projected Gradient Descent Algorithm and its Analysis Kamalika Lecture Notes
Jan 12 The Online Primal-Dual Algorithm. Started Multiarmed Bandits. Kamalika
Jan 17 No Class. Martin Luther King Day.
Jan 19 Stochastic Multiarmed Bandits and the UCB Algorithm. The non-stochastic multiarmed bandit problem. Started the EXP3 Algorithm. Kamalika Lecture Notes
Jan 24 The EXP3 Algorithm and low expected regrets. Kamalika
Jan 26 The EXP3.P Algorithm and low high probability regrets. Kamalika
Jan 31 Online Learning and Classification. The Mistake Bound Model. Perceptron and Winnow. Kamalika
Feb 2 Online Learning and Generalization. Kamalika
Feb 7 No Class: ITA Workshop. I recommend you attend the statistics and machine learning sessions on Wednesday, Thursday and Friday.
Feb 9 Online Convex Optimization in the Bandit Setting: Gradient Descent Without a Gradient
A. Flaxman, A. Kalai, B. McMahan, SODA 2005.
Avinash Atreya Slides
Feb 14 Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization
J. Abernathy, E. Hazan, S. Rakhlin, COLT 2008.
Akshay Balsubramani Slides
Feb 16 Adaptive subgradient algorithms for online learning and stochastic optimization
J. Duchi, E. Hazan and Y. Singer, COLT 2010.
Vicente Malave Slides
Feb 21 No Class. President's day.
Feb 23 Efficient bandit algorithms for online multiclass prediction
S. Kakade, S. Shalev-Shwartz, and A. Tewari, ICML 2008.
Nakul Verma Slides
Feb 28 The epoch-greedy algorithm for multiarmed bandits with side information
J. Langford and T. Zhang, NIPS 2007.
Terry Lam Slides
Mar 2 Tracking the Best Expert
M. Herbster and M. Warmuth, Machine Learning, 1998.
Ohil K Manyam Slides
Mar 7 A New Parameter-Free Hedging Algorithm
K. Chaudhuri, Y. Freund, D. Hsu, NIPS 2009.
Gopi Tummala Slides
Mar 9 Blackwell's approachability and low regret learning are equivalent
J. Abernathy, P. Bartlett, E. Hazan, Arxiv, 2010.
Matus Telgarsky

Resources on Online Learning

Presentation Resources

Here are some guidelines on how to give a presentation, thanks to Prof. Charles Elkan .