CSE 291D. Unsupervised Learning

Assignments Syllabus Readings

Spring 2011
Professor: Lawrence Saul
Lectures: Tue/Thu 11:00 am - 12:20 pm
Location: CogSci Building, Room 4
Office hours: after class and/or by appointment.
Units: 4

Course Description

The lectures in this course will survey leading algorithms for unsupervised learning and high dimensional data analysis. The first part of the course will cover probabilistic/generative models of high dimensional data, such as Gaussian mixture models, factor analysis, nonnegative matrix factorization, exponential family PCA, probabilistic latent semantic analysis, latent Dirichlet allocation, independent component analysis, and deep neural networks. The second part of the course will cover spectral methods for dimensionality reduction, including multidimensional scaling, Isomap, maximum variance unfolding, locally linear embedding, graph Laplacian methods, spectral clustering, and kernel PCA.

Prerequisites

The course is aimed at graduate students in machine learning and related fields. Students should have earned a high grade in a previous, related course, such as CSE 250A, CSE 250B, ECE 271A, or ECE 271B. The course will be taught by lecture in the same style as CSE 250A, though at a more advanced mathematical level. Enrollment is by permission of the instructor.

Grading

There will be three homework assignments (60-75%) and a final course project (25-40%). Students may also be required to submit "scribe notes" (handwritten or typeset) on a small subset of lectures.

Tentative Syllabus

Tue Mar 29 Course overview. Review of clustering: k-means algorithm, Gaussian mixture modeling.
Thu Mar 31 Review of linear dimensionality reduction: principal component analysis, factor analysis.
Tue Apr 05 EM algorithms for factor analysis, principal component analysis, and mixture of factor analyzers.
Thu Apr 07 Nonnegative matrix factorization: cost functions and multiplicative updates.
Tue Apr 09 Nonnegative matrix factorization: auxiliary functions and proofs of convergence.
Thu Apr 14 Exponential family PCA.
Tue Apr 19 Singular value decomposition, low-rank matrix approximations, multidimensional scaling.
Thu Apr 21 Manifold learning, Isomap algorithm.
Tue Apr 26 Nystrom approximation; maximum variance unfolding (MVU).
Thu Apr 28 Spectral clustering, normalized cuts, graph partitioning.
Tue May 03 Laplacian eigenmaps, locally linear embedding (LLE).
Thu May 05 Low rank factorizations for MVU, kernel PCA, class evaluations.
Tue May 10 Document modeling: bag-of-words representation, probabilistic latent semantic indexing, Dirichlet models.
Thu May 12 Latent Dirichlet allocation.
Tue May 17 Variational approximations for inference.
Thu May 19 Independent component analysis: maximum likelihood, contrast functions.
Tue May 24 Fixed point methods; blind source separation.
Thu May 26 Student presentations: Andrew Gross, Edward O'Brien and Chris DeBoever, Rohan Anil, Moahammed Saberian.
Tue May 31 Student presentations: Vineet Kumar, Baris Aksanli, Samyeul Noh and Sunghee Woo, Katherine Ellis, Daryl Lim, He Huang.
Thu Jun 02 Student presentations: Matt Der, Vivek Ramavajjala, Akshay Balsubramani, Ashish Venkat, Elkin Dario Gutierrez.

Readings

Probabilistic PCA

Nonnegative matrix factorization

Exponential family PCA

Document modeling

Independent component analysis

Deep architectures

Multidimensional scaling and Nystrom approximation

Isomap and extensions

Maximum variance unfolding

Spectral clustering

Graph Laplacian methods

Locally linear embedding and related work

Kernel PCA