The first few lectures will consist of an overview of some basic material on online learning. From January 24, each class meeting, a student will give a 60 minute presentation on a recent technical paper. The remaining 20 minutes will be spent on questions and discussions. The schedule of paper presentations along with dates is in the table below.
|Jan 3||Administrivia. Introduction to Online Learning||Kamalika|
|Jan 5||The Hedge algorithm and its Analysis||Kamalika|
|Jan 10||Online Convex Optimization. The Projected Gradient Descent Algorithm and its Analysis||Kamalika||Lecture Notes|
|Jan 12||The Online Primal-Dual Algorithm. Started Multiarmed Bandits.||Kamalika|
|Jan 17||No Class. Martin Luther King Day.|
|Jan 19||Stochastic Multiarmed Bandits and the UCB Algorithm. The non-stochastic multiarmed bandit problem. Started the EXP3 Algorithm.||Kamalika||Lecture Notes|
|Jan 24||The EXP3 Algorithm and low expected regrets.||Kamalika|
|Jan 26||The EXP3.P Algorithm and low high probability regrets.||Kamalika|
|Jan 31||Online Learning and Classification. The Mistake Bound Model. Perceptron and Winnow.||Kamalika|
|Feb 2||Online Learning and Generalization.||Kamalika|
|Feb 7||No Class: ITA Workshop. I recommend you attend the statistics and machine learning sessions on Wednesday, Thursday and Friday.|
|Feb 9|| Online Convex Optimization in the Bandit Setting: Gradient Descent Without a Gradient
A. Flaxman, A. Kalai, B. McMahan, SODA 2005.
|Feb 14|| Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization
J. Abernathy, E. Hazan, S. Rakhlin, COLT 2008.
Adaptive subgradient algorithms for online learning and stochastic optimization
J. Duchi, E. Hazan and Y. Singer, COLT 2010.
|Feb 21||No Class. President's day.|
Efficient bandit algorithms for online multiclass prediction
S. Kakade, S. Shalev-Shwartz, and A. Tewari, ICML 2008.
|Feb 28|| The epoch-greedy algorithm for multiarmed bandits with side information
J. Langford and T. Zhang, NIPS 2007.
Tracking the Best Expert
M. Herbster and M. Warmuth, Machine Learning, 1998.
|Ohil K Manyam||Slides|
|Mar 7|| A New Parameter-Free Hedging Algorithm
K. Chaudhuri, Y. Freund, D. Hsu, NIPS 2009.
|Mar 9|| Blackwell's approachability and low
regret learning are equivalent
J. Abernathy, P. Bartlett, E. Hazan, Arxiv, 2010.