CSE 250B: Machine Learning

Syllabus

I. Nonparametric methods
        Nearest neighbor
        Decision trees

II. Classification using parametrized models
        Generative models: naive Bayes, multivariate Gaussian, Fisher linear discriminant
        Discriminative models: logistic regression
        More linear classifiers: perceptron, support vector machines
        Kernels
        Richer output spaces: multiclass classification and structured output prediction

III. Combining classifiers
        Mixtures of experts and multiplicative updates
        Boosting, bagging, and random forests

IV. Representation learning
        Clustering
        Linear projections: PCA and SVD
        Embeddings and manifold learning
        Metric learning
        Autoencoders
        Deep nets

Discussion sections

Xinan: Mon 7-8 in WLH 2204

Sharad: Wed 7-8 in WLH 2204

Dev: Wed 8-9 in WLH 2204

Prerequisites

1. Ability to write simple programs in Python: functions, control structures, string handling, arrays and dictionaries

2. Familiarity with basic probability

3. Familiarity with basic linear algebra

Course materials

1. Programming exercises should be done in Python. I recommend trying out IPython notebooks.

2. There is no required text for the course, but here are some useful references. The first is available as an e-book through the library website; the rest are on reserve at Geisel:
Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The elements of statistical learning (2nd edition).
Gilbert Strang, Linear algebra and its applications.
Kevin Murphy, Machine learning: a probabilistic perspective.
Richard Duda, Peter Hart, and David Stork, Pattern classification (2nd edition).

Homeworks and evaluations

There will be weekly homeworks, to be turned in (typed and in PDF format) on Gradescope. No late homeworks will be accepted; however, the lowest homework score will be dropped.

Midterms: TBA, in class

Grading

Homeworks: 50% (lowest score will be dropped)
Midterms: 25% each
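
As a rough illustration of how these weights combine, here is a minimal Python sketch. It assumes two midterms (the weights above sum to 100% only in that case), equally weighted homeworks, and all scores on a common 0-100 scale. The function name course_score is hypothetical and is not part of any official grading tool.

```python
def course_score(homeworks, midterms):
    """Weighted course score on the same scale as the inputs (assumed 0-100)."""
    if len(homeworks) < 2:
        raise ValueError("need at least two homework scores to drop the lowest")
    if len(midterms) != 2:
        raise ValueError("this sketch assumes exactly two midterms")

    # Drop the single lowest homework score, then average the rest
    # (assumption: every homework carries equal weight).
    kept = sorted(homeworks)[1:]
    hw_avg = sum(kept) / len(kept)

    # Homeworks count 50%; each midterm counts 25%.
    return 0.50 * hw_avg + 0.25 * midterms[0] + 0.25 * midterms[1]


# Made-up scores, for illustration only.
example = course_score([90, 80, 100, 60], [70, 90])
print(example)
```

With these made-up scores, the lowest homework (60) is dropped, the remaining three average to 90, and the weighted total is 0.5 * 90 + 0.25 * 70 + 0.25 * 90 = 85.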