DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
UNIVERSITY OF CALIFORNIA, SAN DIEGO
CSE 254: Seminar on Learning Algorithms
Spring 2005
CSE 254 is a graduate seminar devoted to recent research on AI
learning methods and applications. This is not an
introductory course, so the prerequisite
is at least one graduate-level course (at UCSD or elsewhere) in machine
learning or a closely related area such as statistics or pattern
recognition. Appropriate courses at UCSD include CSE 291
(Statistical Learning), CSE 253 (Neural Networks for Pattern
Recognition), and Cognitive
Science 260 (Pattern Recognition).
The room for CSE 254 is Center Hall 201. The class meets on
Tuesdays and Thursdays from 12:30 to 1:50. The first meeting will be on
Tuesday, March 29.
In each class meeting, a student will give a talk lasting about 60
minutes presenting a recent technical paper in detail. Through
questions during the talk, and in the final 20 minutes, all seminar
participants will discuss the paper and the issues raised by it.
| date | presenter | paper | author(s) | slides |
| March 29 | organizational meeting | | | |
| March 31 | discussion of project guidelines | | | |
| April 5 | discussion on technical writing | Clear and Simple as the Truth (extracts) | Turner, Thomas | |
| April 7 | Daniel Hsu | Experiments with Random Projections for Machine Learning | Fradkin, Madigan | here |
| April 12 | Shankar Shivappa | Theoretical Views of Boosting and Applications | Schapire | here |
| April 14 | Paul Hammon | Feature selection, L1 vs. L2 regularization, and rotational invariance | Ng | here |
| April 19 | Evan Ettinger | Grouping and dimensionality reduction by locally linear embedding | Perona, Polito | here |
| April 21 | Charles Elkan | A new probabilistic model for documents | | |
| April 26 | Shankar Shivappa | Recognition of Visual Speech Elements Using Adaptively Boosted Hidden Markov Models | Foo, Lian, Dong | here |
| April 28 | Jan Voung | Online and Batch Learning of Pseudo-Metrics | Shalev-Shwartz, Singer, Ng | here |
| May 3 | Nakul Verma | Algorithms for Large Scale Markov Blanket Discovery; HITON, A Novel Markov Blanket Algorithm for Optimal Variable Selection | Tsamardinos, Aliferis, Statnikov | here |
| May 5 | Mohsen Azarbayejani | Improving Text Classification by Shrinkage in a Hierarchy of Classes | McCallum, Rosenfeld, Mitchell, Ng | here |
| May 10 | Jan Voung | Efficient Exact k-NN and Nonparametric Classification in High Dimensions | Liu, Moore, Gray | here |
| May 12 | Evan Ettinger | Global versus local methods in nonlinear dimensionality reduction | de Silva, Tenenbaum | here |
| May 17 | Charles Elkan | Yet another new probabilistic model for documents | | |
| May 19 | Mohsen Azarbayejani | A Hierarchical Model for Clustering and Categorising Documents | Gaussier, Goutte, Popat, Chen | |
| May 24 | Doug Turnbull | Automatic music annotation | Turnbull | here |
| May 26 | Shankar Shivappa | Exploiting generative models in discriminative classifiers; An Information-Geometric Approach to Document Retrieval and Categorization | Jaakkola, Haussler; Hofmann | |
| May 31 | Daniel Hsu | A random walks perspective on maximizing satisfaction and profit | Matthew Brand | here |
| June 2 | | Project presentations | | |
Note: The seminar will run in parallel with a data mining contest sponsored by Fair Isaac, with cash
prizes.
Each student will do one term project following specific guidelines.
The project should be at the frontier of current research, and
preferably closely inspired by at least one of the papers discussed in
the class. Project reports will be evaluated using these grading
criteria. There is a schedule for handing in a detailed
project proposal, a draft project report, and then the final report.
The seminar will have no final exam. Letter grades will be
based mostly on the final project report, but the presentations,
participation in class and in the web-based discussions, and the
intermediate project deliverables are also important.
The instructor is Charles
Elkan, Professor, whose office is AP&M room
4856. Feel free to send email
to arrange an appointment, or telephone (858) 534-8897.
REGISTRATION
Students may take the seminar for a letter grade for four units, or for
one or two units S/U:
- For one unit, a student will present one research paper and
participate in all class meetings.
- For two units, a student will make two presentations.
- Four units will require a presentation, participation in all
class activities, and a project.
For four units, a student should register for CSE 254, section id
527966 for a letter grade. For
one or two units, a student should
register for the instructor's CSE 293, section id 527993.
Students who took a previous version of CSE 254 may take it again.
Papers will be different this year.
PAPERS AND TOPICS
In the first week, we will make a schedule of papers and presentations
for the whole quarter. Papers will be recent technical articles,
often from NIPS and ICML. Each paper will be made available on the
web as the quarter progresses. Students will choose papers
in consultation with the instructor. Relevant topics may include:
- supervised learning with many classes
- regularization when observed counts are zero or small
- discriminative versus generative modeling of data
- semi-supervised learning from labeled and unlabeled data
- transductive inference
- new boosting algorithms
- feature selection from very large feature spaces
- modeling human heuristics for learning
- reinforcement learning algorithms and applications
- applications to text categorization
- applications to image retrieval
- applications in computational biology
- financial applications
Some papers will be theoretical, and some will be applied. Each
presentation will cover a single conference paper, to ensure that it is
explained and discussed in sufficient depth.
Various textbooks are useful as background reading.
Students are encouraged to use other books and papers as well.
PRESENTATIONS
The procedure for each student presentation is as follows:
- One week in advance: Finish a draft of about 40 slides that
clearly present the work in the paper. Make an appointment with
the instructor to discuss the draft slides. Email the slides to elkan@cs.ucsd.edu.
- Several days in advance: Meet for about one hour to discuss
improving the slides, and how to give a good presentation.
- Day of presentation: Give a good presentation with confidence,
enthusiasm, and clarity.
- Within three days afterwards: Make changes to the slides
suggested by the class discussion, and email the slides in PDF, two
slides per page, to the instructor for publishing. Try to make
your PDF file less than one megabyte.
Please read, reflect upon, and follow these presentation
guidelines. Presentations will be evaluated, in a friendly
way but with high standards, using this feedback
form.
Each presentation should be prepared using LaTeX or
PowerPoint, and should consist of about 40 slides. You must copy
all important equations, diagrams, charts, and tables from the paper
into your slides.
For each paper, we will have a web-based discussion area. Each
student is expected to contribute at least one message to the
discussion, before the presentation. A message may ask an
interesting question, point out a strength or weakness of the paper, or
answer a question asked by someone else. Messages should be
thoughtful!
The schedule of presentations will be determined as much as possible
on Tuesday, March 29. Students should choose a date first, and then
agree with the instructor about a paper to present. To find ideas,
students can look at this list of possible papers and contact the
instructor.
If you want to change your presentation date, please arrange a swap
with another student and notify the instructor at least two weeks in
advance.
Most recently updated on May 31, 2005 by Charles Elkan, elkan@cs.ucsd.edu