CSE 291: Topics in unsupervised learning

TuTh 3.30-4.50 in University Center 413A room 1

Sanjoy Dasgupta
Office hours Mon 3-5 in EBU3B 4138

Lectures and readings

Lecture 1: Clustering in metric spaces [4/1]
Gonzalez. Clustering to minimize the maximum intercluster distance.

Lecture 2: The k-means problem [4/3, 4/8]
Arthur and Vassilvitskii. Kmeans++: the advantages of careful seeding.
Kanungo, Mount, Netanyahu, Piatko, Silverman, and Wu. A local search approximation algorithm for k-means clustering.

Lecture 3: The k-medoid problem [4/10]
Lin and Vitter. Approximation algorithms for geometric median problems.

Lecture 4: Hierarchical clustering [4/15]
Hartigan. Clustering algorithms.
Dasgupta and Long. Performance guarantees for hierarchical clustering.

Lecture 5: Finding meaningful clusters in data [4/17, 4/22]
Jardine and Sibson. Mathematical taxonomy.
Kleinberg. An impossibility theorem for clustering.
Balcan, Blum, and Vempala. A discriminative framework for clustering via similarity functions.

Review: Concentration of measure [4/24, 4/29]

Lecture 6: Clustering in an online/streaming setting [5/1, 5/6, 5/8]
Charikar, Chekuri, Feder, and Motwani. Incremental clustering and dynamic information retrieval.
Beygelzimer, Kakade, and Langford. Cover trees for nearest neighbor search.
Guha, Meyerson, Mishra, Motwani, and O'Callaghan. Clustering data streams: theory and practice.
Indyk. Sublinear time algorithms for metric space problems.

Lecture 7: Random projection [5/13]
Dasgupta and Gupta. An elementary proof of the Johnson-Lindenstrauss lemma.

Lecture 8: Spectral methods [5/15, 5/20, 5/22]
Strang. Linear algebra and its applications: appendix A on SVD.
Drineas, Kannan, Frieze, Vempala, and Vinay. Clustering large graphs via the singular value decomposition.

Lecture 9: Multidimensional scaling; Frechet and Bourgain embeddings [5/27, 5/29]
Schoenberg. Metric spaces and positive definite functions.
Kruskal and Wish. Multidimensional scaling.
Matousek. Lectures on discrete geometry, chapter 15: Embedding finite metric spaces into normed spaces.

Lecture 10: Embedding spaces of low intrinsic dimension [6/3, 6/5]
Tenenbaum, de Silva, and Langford. A global geometric framework for nonlinear dimensionality reduction.
Roweis and Saul. Nonlinear dimensionality reduction by locally linear embedding.
Clarkson. Tighter bounds for random projections of manifolds.
Indyk and Naor. Nearest neighbor preserving embeddings.