CSE 158 is an undergraduate course devoted to current methods for recommender systems, data mining, and predictive analytics. No previous background in machine learning is required, but all participants should be comfortable with programming (all example code will be in Python), and with basic optimization and linear algebra.

The course meets twice a week on Monday/Wednesday evenings, starting September 30. Meetings are in Galbraith Hall.

There is no textbook for the course, though chapter references will be provided from Pattern Recognition and Machine Learning (Bishop), and from Charles Elkan's 2013 course notes. Links are also provided to our Coursera Specialization, which covers similar material.

## Basic Info

I'll hold office hours on **Tuesdays 9:30-13:00** in CSE 4102. The course TAs will hold additional office hours as follows:

- **Monday** 15:00-17:00: CSE B270A
- **Tuesday** 14:00-15:00: CSE B215
- **Tuesday** 15:00-16:00: CSE B270A
- **Friday** 11:00-13:00: CSE B215

- **Homework 1:** due Oct 14
- **Homework 2:** due Oct 28
- **Midterm:** Nov 6
- **Homework 3:** due Nov 11
- **Assignment 1:** due Nov 18
- **Homework 4:** due Nov 25
- **Assignment 2:** due Dec 2

- Each **Homework** is worth 8%. Your lowest homework grade (of four) is dropped (or one homework can be skipped).
- The **Midterm** is worth 26%.
- Each **Assignment** is worth 25%. **Assignment 2** is a **group assignment**. All other assessments must be completed individually.
- All assessments are due **before** the Monday lecture on the due date. Late submissions are not accepted.
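
The weights above add up to 100% (three counted homeworks at 8%, the midterm at 26%, and two assignments at 25% each). As a rough sketch of how they combine, the snippet below computes a weighted total. The function name `final_grade` and the assumption that every score is on a 0-100 scale are illustrative only and are not part of the official grading procedure.

```python
def final_grade(homeworks, midterm, assignments):
    """Weighted course total on a 0-100 scale (illustrative sketch only).

    homeworks:   up to four scores in [0, 100]; a skipped homework can be
                 left out or entered as 0 (assumption: it then counts as
                 the dropped grade)
    midterm:     one score in [0, 100]
    assignments: exactly two scores in [0, 100]
    """
    if len(homeworks) > 4:
        raise ValueError("at most four homework scores are expected")
    if len(assignments) != 2:
        raise ValueError("exactly two assignment scores are expected")

    # Treat any missing homework as a zero, then drop the lowest of the four.
    padded = list(homeworks) + [0.0] * (4 - len(homeworks))
    best_three = sorted(padded, reverse=True)[:3]

    return 0.08 * sum(best_three) + 0.26 * midterm + 0.25 * sum(assignments)


# Hypothetical scores: one homework skipped (entered as 0).
example = final_grade([90, 80, 0, 100], midterm=75, assignments=[85, 95])
print(round(example, 1))
```

With these made-up scores, the three counted homeworks contribute 21.6 points, the midterm 19.5, and the assignments 45, for a total of about 86.1.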

- piazza page
- gradescope page
- last year's course webpage

course outline |

1 | ## Supervised Learning: Regression |
---|

- Least-squares regression
- Overfitting and regularization
- Training, validation, and testing

- Bishop ch.3
- Elkan ch.3,6
- Instructions to access videos on Coursera

- CSV and JSON files
- Reading CSV and JSON into Python
- Processing structured data in Python
- Extracting simple statistics from datasets
- Data filtering and cleaning
- Text and string processing in Python
- Time and date data
- Matrix processing and numpy
- Regression in Python
- Features from categorical data
- Features from temporal data
- Feature transformations
- Missing values
- Motivation behind the MSE
- Over and underfitting
- Setting up a codebase for evaluation and validation
- Evaluating a regularized model
- Evaluating classifiers for ranking
- Introduction to Training and Testing
- Validation
- Implementing a regularization pipeline in Python
- Guidelines on the implementation of predictive pipelines

- Workbook 1: CSV/TSV/JSON; extracting simple statistics; pandas; plotting
- Notebook from lecture

Files | week1.py | 50k beer reviews | non-alcoholic beer reviews |
---|

Lecture 1 | slides | + annotations | podcast |
---|

Lecture 2 | slides | + annotations | podcast |
---|

Homework | Homework 1 (due October 14) |
---|

2 | ## Supervised Learning: Classification |
---|

- Logistic regression
- SVMs
- Multiclass and multilabel classification
- How to evaluate classifiers

- Bishop ch.4
- Elkan ch.5,8
- More detailed derivation of the SVM (2018)
- Case study: Reddit popularity

- Workbook 2: Classification; diagnostics; training/testing; gradient descent
- Notebook from lecture

Files | week2.py | 50k book descriptions | 5k book cover images |
---|

Lecture 3 | slides | + annotations | podcast |
---|

Lecture 4 | slides | + annotations | podcast |
---|

3 | ## Dimensionality Reduction and Clustering |
---|

- Principal Component Analysis
- K-means & hierarchical clustering
- Community detection

- Bishop ch.9
- Elkan ch.13
- More detailed derivation of PCA (2018)

Files | week3.py | Facebook ego network |
---|

Lecture 5 | slides | + annotations | podcast |
---|

Lecture 6 | slides | + annotations | podcast |
---|

Homework | Homework 2 (due October 28) |
---|

4 | ## Recommender Systems |
---|

Assignment | Assignment 1 (due November 18) |
---|

5 | ## Text Mining |
---|

Homework | Homework 3 (due November 11) |
---|

6 | ## Midterm |
---|

Midterm prep | Nov 4 |
---|

Midterm | Nov 6 |
---|

sp15 midterm (CSE190) | Solutions | Solution video |

fa15 midterm (CSE190) | Solutions | Solution video |

fa15 midterm (CSE255) | Solutions | Solution video |

wi17 midterm (CSE158) | Solutions | Solution video |

wi17 midterm (CSE258) | Solutions | Solution video |

fa17 midterm (CSE158) | Solutions | Solution video |

fa17 midterm (CSE258) | Solutions | Solution video |

fa18 midterm (CSE158) |

fa18 midterm (CSE258) |

Assignment | Assignment 2 (due December 2) |
---|

7-10 | ## TBD |
---|

Homework | Homework 4 (due November 25) |
---|

No lecture | November 11 (Veterans Day) |
---|

No lecture | November 27 (Thanksgiving) |
---|