I. Nonparametric methods

Nearest neighbor

Decision trees

II. Classification using parametrized models

Generative models: naive Bayes, multivariate Gaussian, Fisher linear discriminant

Discriminative models: logistic regression

More linear classifiers: Perceptron, support vector machines

Kernels

Richer output spaces: multiclass classification and structured output prediction

III. Combining classifiers

Mixtures of experts and multiplicative updates

Boosting, bagging, and random forests

IV. Representation learning

Clustering

Linear projections: PCA and SVD

Embeddings and manifold learning

Metric learning

Autoencoders

Deep nets

Xinan: Mon 7-8 in WLH 2204

Sharad: Wed 7-8 in WLH 2204

Dev: Wed 8-9 in WLH 2204

1. Ability to write simple programs in Python: functions, control structures, string handling, arrays and dictionaries

2. Familiarity with basic probability

3. Familiarity with basic linear algebra

1. Programming exercises should be done in Python. I recommend trying out IPython notebooks.

2. There is no required text for the course, but here are some useful references. The first is available as an e-book through the library website; the rest are on reserve at Geisel:

Trevor Hastie, Robert Tibshirani, and Jerome Friedman, *The elements of statistical learning* (2nd edition).

Gilbert Strang, *Linear algebra and its applications*.

Kevin Murphy, *Machine learning: a probabilistic perspective*.

Richard Duda, Peter Hart, and David Stork, *Pattern classification* (2nd edition).

There will be weekly homeworks, to be turned in (typed and in PDF format) on Gradescope. No late homeworks will be accepted; however, the lowest homework score will be dropped.

Midterms: TBA, in class

Homeworks: 50% (lowest score will be dropped)

Midterms: 25% each
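
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes that all scores are on a 0-100 scale, that the homeworks are weighted equally once the lowest is dropped, and that there are exactly two midterms; none of these details are stated above, and the function name `course_grade` is hypothetical.

```python
def course_grade(homework_scores, midterm_scores):
    """Weighted course grade on a 0-100 scale (illustrative sketch only)."""
    if len(homework_scores) < 2:
        raise ValueError("need at least two homework scores to drop the lowest")
    if len(midterm_scores) != 2:
        raise ValueError("expected exactly two midterm scores")
    # Drop the single lowest homework score, then average the rest
    # (assumption: equal weight per homework).
    kept = sorted(homework_scores)[1:]
    homework_avg = sum(kept) / len(kept)
    # Homeworks count for 50%, each midterm for 25%.
    return 0.5 * homework_avg + 0.25 * midterm_scores[0] + 0.25 * midterm_scores[1]


# Homeworks 90, 80, 100, 60 (the 60 is dropped); midterms 70 and 90.
print(course_grade([90, 80, 100, 60], [70, 90]))  # prints 85.0
```

In this example the homework average after dropping the 60 is 90, so the total is 0.5 × 90 + 0.25 × 70 + 0.25 × 90 = 85.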