DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
UNIVERSITY OF CALIFORNIA, SAN DIEGO
CSE 250B: Principles of Artificial Intelligence: Learning
Fall 2008
Please ask questions on this message board.
OVERVIEW
CSE 250B is a graduate course devoted to the basic concepts and
algorithms of supervised and unsupervised learning from data.
250B is open to Ph.D. and MS students in computer science,
engineering, cognitive science, and all related areas.
Other prospective participants, including undergraduates, should
contact the instructor at elkan@cs.ucsd.edu.
For registration, the section id of CSE 250B is 635253. In Fall 2008, both 250A (taught by Prof.
Lawrence Saul) and 250B will be offered. Students may take one or
both courses: neither is a prerequisite for the other, and there
will be little overlap. Students are also encouraged to attend the AI seminar every Monday.
The specific topics discussed in CSE 250B will include, not necessarily in this order,
- perceptron methods
- classification based on Bayes' rule
- nearest neighbor methods
- logistic regression and log-linear models
- gradient descent training
- ensemble methods: bagging and boosting
- kernel methods including support vector machines (SVMs)
- performance evaluation: precision, recall, cross-validation
- the problem of overfitting, Occam's razor, and regularization
- making optimal decisions given costs and probabilities
- unsupervised learning and clustering
- generative models, especially multivariate Gaussians
- training via expectation-maximization (EM)
- dimensionality reduction: principal component analysis
- learning to predict structured outputs, especially sequences
Two important topics that will not be covered are graphical models and reinforcement learning.
The instructor is Professor Charles Elkan. Office hours, to be announced,
will be held in room 4134 of the CSE building. If you are unable to attend
office hours, feel free to send email to arrange an appointment.
Some topics discussed in class will not be in any textbook,
and many will be explained differently, so coming to lectures and taking notes
carefully is important. Examinations will be
based mainly on the online lecture notes.
LECTURES
Lectures will be on Tuesdays and Thursdays from 2pm to 3:20pm in the
Warren Lecture Hall building, room 2206. For lecture notes from the Fall 2007 version of 250B,
see http://www.cs.ucsd.edu/users/elkan/250Bfall2007. The first lecture will be on Thursday September 25.
| September 25 | Nearest-neighbor classification. Lower bounds for the Bayes error rate. | Project 1 |
| September 30 | Linear classifiers. Equations for Euclidean hyperplanes. | |
| October 2 | Perceptron algorithm, theorem about perceptron convergence. | |
| October 7 | Proof of perceptron convergence. Voted and averaged perceptrons. Maximum likelihood (ML). | |
| October 9 | Max likelihood for a Bernoulli distribution, conditional max likelihood, logistic regression. | Project 2 |
| October 14 | Stochastic gradient ascent for logistic regression training. | |
| October 16 | Making optimal decisions, evaluating classifiers. | Hints for projects |
| October 21 | Log-linear models, feature functions. | Project 1 grading form |
| October 23 | Conditional random fields (CRFs). | |
| October 28 | In-class midterm exam | |
| October 30 | Midterm discussion. Viterbi and matrix-multiplication algorithms for CRFs. | Project 3 |
| November 4 | Review of CRF algorithms, perceptron training for CRFs. | |
| November 6 | Bag-of-words representation for documents. Multinomial model. | |
| November 11 | No lecture because of Veterans Day. | Midterm solutions |
| November 13 | Mixture models, expectation-maximization (EM), deterministic annealing. | |
| November 18 | Proof of EM correctness, variants of EM. | |
| November 20 | Topic models and latent Dirichlet allocation (LDA). | Project 4 |
| November 25 | | |
| November 27 | No lecture because of Thanksgiving. | |
TEXTBOOKS
The course will not be based on any single book. The
following textbooks are recommended as references:
For a price comparison among web booksellers use addall.com
with the ISBN numbers.
ASSIGNMENTS AND GRADING
There will be one in-class midterm exam (10% of your overall grade), a
final examination (30%), and four project assignments (15% each).
You should do each project with one partner, so individual
work will count for 40% of your grade and joint work for 60%. You
are free to change partners, or not, between projects.
Each project will last between two and three weeks and will
require coding, experimenting with data, and writing a report.
Using a high-level environment such as Matlab or R
is encouraged. Projects will be graded based exclusively on the written
report. Each pair of partners should hand in their joint report
at the start of class on the day that the report
is due. Each day that a report is late will cost 20% of the maximum
score available for the project. Reports will be evaluated
using grading criteria similar to those in this form. Complete academic honesty is always
required.
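The weighting and late penalty described above can be illustrated with a small calculation. The following Python sketch is not an official grading tool. It assumes that every score is expressed as a percentage of its maximum, that the late penalty is subtracted from the earned project score, and that a project score cannot fall below zero. None of these details is stated in the syllabus, and the function names are hypothetical.

```python
# Weights taken from the syllabus: midterm 10%, final 30%, four projects at 15% each.
MIDTERM_WEIGHT = 0.10
FINAL_WEIGHT = 0.30
PROJECT_WEIGHT = 0.15
LATE_PENALTY_PER_DAY = 0.20  # fraction of the project's maximum score lost per late day


def project_score(raw_score, days_late, max_score=100.0):
    """Subtract the late penalty from a project score.

    Assumption (not in the syllabus): the result is clamped at zero.
    """
    penalty = LATE_PENALTY_PER_DAY * max_score * days_late
    return max(0.0, raw_score - penalty)


def overall_score(midterm, final, projects):
    """Weighted total on a 0-100 scale; each input is a percentage of its maximum."""
    if len(projects) != 4:
        raise ValueError("expected exactly four project scores")
    return (MIDTERM_WEIGHT * midterm
            + FINAL_WEIGHT * final
            + PROJECT_WEIGHT * sum(projects))


if __name__ == "__main__":
    # A report that would have earned 90 but was handed in one day late.
    late_project = project_score(90.0, days_late=1)
    total = overall_score(80.0, 75.0, [late_project, 85.0, 70.0, 95.0])
    print(f"late project: {late_project:.1f}, overall: {total:.1f}")
```

Under these assumptions, a single late day on one project lowers the overall numerical score by 3 points (20% of a 15% component). As stated below, numerical scores do not map to letter grades in any fixed way.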
The due dates for the four projects will be Thursday October 9,
Thursday October 23, Tuesday November 18, and Thursday December 4.
The midterm will be in class on Tuesday October 28. The day and room
of the final exam will be announced. The last lecture will be on Thursday December 4.
There is no a priori correspondence between
letter grades and numerical scores on the assignments or on the exam.
You can evaluate your performance in the class by comparing your scores
with the means and standard deviations, which will be announced.
However, there is also no fixed correspondence between letter grades and
standard deviations above or below the mean. If all students do well
in absolute terms, then all students will get a good grade.
You should not drop CSE 250B just because you are unhappy with the score
that you receive on a project. Instead, you should make an appointment
to discuss with the instructor how you can do better on subsequent projects.
Most recently updated on November 20, 2008 by Charles Elkan, elkan@cs.ucsd.edu.