This webpage is for an old version of the course; content may be out of date!
CSE 258: Web Mining and Recommender Systems
Autumn 2019, Monday/Wednesday 18:30-19:50, Galbraith Hall
CSE 258 is a graduate course devoted to current methods for recommender systems, data mining, and predictive analytics. No previous background in machine learning is required, but all participants should be comfortable with programming (all example code will be in Python), and with basic optimization and linear algebra.
The course meets twice a week on Monday/Wednesday evenings, starting September 30. Meetings are in Galbraith Hall.
There is no textbook for the course, though chapter references will be provided from Pattern Recognition and Machine Learning (Bishop), and from Charles Elkan's 2013 course notes. Links are also provided to our Coursera Specialization, which covers similar material.
Office hours:
I'll hold office hours on Tuesdays 9:30-13:00 in CSE 4102. The course TAs will hold additional office hours as follows:
- Monday 10:00-12:00: CSE B250A
- Thursday 11:00-12:00: CSE B270A
- Friday 10:30-12:30: CSE B240A
Assessment:
- Homework 1: due Oct 14
- Homework 2: due Oct 28
- Midterm: Nov 6
- Homework 3: due Nov 13
- Assignment 1: due Nov 18
- Homework 4: due Nov 25
- Assignment 2: due Dec 3
Grading:
- Each Homework is worth 8%. Your lowest homework grade (of four) is dropped, so one homework can be skipped.
- The Midterm is worth 26%.
- Each Assignment is worth 25%.
- Assignment 2 is a group assignment. All other assessment must be completed individually.
- All assessments are due before the Monday lecture on the due date. Late submissions are not accepted.
1 | Supervised Learning: Regression |
---|
Monday September 30 / Wednesday October 2:
- Least-squares regression
- Overfitting and regularization
- Training, validation, and testing
Other resources:
Coursera slides (introductory):
Code examples:
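A minimal sketch of least-squares regression with an optional L2 regularizer, restricted to a single feature so that the closed-form solution fits in a few lines. This is not the course's own example code; the function names and the toy data are assumptions made for illustration, and the penalty is applied to the slope only (not the intercept).

```python
def fit_ridge_1d(xs, ys, lam=0.0):
    """Fit y ~ theta0 + theta1 * x by minimizing
    sum((y - theta0 - theta1 * x)^2) + lam * theta1^2.
    lam = 0 gives ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums: covariance-like and variance-like terms.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta1 = sxy / (sxx + lam)          # the penalty shrinks the slope toward 0
    theta0 = mean_y - theta1 * mean_x   # unpenalized intercept
    return theta0, theta1


def mse(xs, ys, theta0, theta1):
    """Mean squared error of the fitted line on (xs, ys)."""
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = fit_ridge_1d(xs, ys)
theta0_reg, theta1_reg = fit_ridge_1d(xs, ys, lam=5.0)
```

In practice `lam` would be chosen by comparing `mse` on a held-out validation set, never on the test set.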
2 | Supervised Learning: Classification |
---|
Monday October 7 / Wednesday October 9:
- Logistic regression
- SVMs
- Multiclass and multilabel classification
- How to evaluate classifiers
Other resources:
Coursera slides:
Code examples:
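A short sketch of how a binary classifier might be evaluated once its predictions are available. The function name, the returned dictionary keys, and the toy labels are assumptions for illustration; the balanced error rate is computed here as one minus the average of the true positive and true negative rates.

```python
def evaluate_binary(y_true, y_pred):
    """Return accuracy, precision, recall, F1 and balanced error rate
    for binary labels (truthy = positive, falsy = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)

    def ratio(num, den):
        # Report 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # true positive rate
    tnr = ratio(tn, tn + fp)             # true negative rate
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "ber": 1.0 - 0.5 * (recall + tnr),
    }


# Hypothetical labels: 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = evaluate_binary(y_true, y_pred)
```

Accuracy alone can look good on imbalanced data, which is why the balanced error rate is reported alongside it.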
3 | Dimensionality Reduction and Clustering |
---|
Monday October 14 / Wednesday October 16:
- Principal Component Analysis
- K-means & hierarchical clustering
- Community detection
Other resources:
Code examples:
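A minimal sketch of K-means (Lloyd's algorithm) in plain Python. To keep the result reproducible, the centroids are initialized to the first `k` points, which is an assumption of this sketch rather than a recommended practice; the function names and toy points are likewise illustrative.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def kmeans(points, k, max_iters=100):
    # Deterministic initialization: use the first k points as centroids.
    centroids = [tuple(float(v) for v in p) for p in points[:k]]
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                updated.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(dim)))
            else:
                updated.append(centroids[j])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    assignments = [nearest(p, centroids) for p in points]
    return centroids, assignments


# Hypothetical toy data: two well-separated groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, assignments = kmeans(points, k=2)
```

K-means only finds a local optimum, so real use typically reruns it from several random initializations and keeps the best result.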
Monday October 21 / Wednesday October 23:
- Collaborative Filtering
- Latent Factor Models
Other resources:
Coursera slides:
Code examples:
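A small sketch of similarity-based collaborative filtering, using the Jaccard similarity between the sets of users who interacted with each item. The data layout (`users_per_item`), the function names, and the toy interactions are assumptions for illustration; latent factor models are not covered by this sketch.

```python
def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| between two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar_items(target, users_per_item, n=2):
    """Return the n items most similar to `target`, as (item, similarity)."""
    scored = [
        (jaccard(users_per_item[target], users), item)
        for item, users in users_per_item.items()
        if item != target
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(item, sim) for sim, item in scored[:n]]


# Hypothetical interactions: item -> set of users who purchased it.
users_per_item = {
    "A": {"u1", "u2", "u3"},
    "B": {"u2", "u3"},
    "C": {"u4"},
    "D": {"u1", "u2", "u3", "u4"},
}
recommendations = most_similar_items("A", users_per_item)
```

This is the "people who bought X also bought Y" style of recommendation: it needs no model training, only set overlaps.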
Kaggle pages (Assignment 1):
Monday October 28 / Wednesday October 30:
- Sentiment analysis
- Bags-of-words
- TF-IDF
- Stopwords, stemming, and low-dimensional representations of text
Other resources:
Code examples:
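A minimal sketch of bag-of-words features weighted by TF-IDF, written in plain Python. The tokenizer (lowercasing and stripping punctuation), the base-10 logarithm, and the toy documents are assumptions of this sketch; it does no stopword removal or stemming.

```python
import math
import string
from collections import Counter


def tokenize(text):
    """Lowercase, drop punctuation, and split on whitespace."""
    cleaned = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    return cleaned.split()


def tfidf(docs):
    """Return one {word: tf * idf} dictionary per document, where
    tf is the raw count in the document and idf = log10(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each word.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({w: tf[w] * math.log10(n_docs / df[w]) for w in tf})
    return vectors


# Hypothetical toy reviews.
docs = [
    "The beer is great, great beer!",
    "The wine is fine.",
    "The cider is sweet.",
]
vectors = tfidf(docs)
```

Words that occur in every document (here "the" and "is") get weight zero, which is the effect that stopword lists try to achieve by hand.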
No lecture | November 11 (Veterans Day) |
---|
Wednesday November 13:
- Crawling and parsing data from the Web
- Manipulating time and date data
- Simple plotting with Matplotlib
- General-purpose gradient descent in TensorFlow
Code examples:
8 | Data Mining in Social Networks |
---|
Monday November 18 / Wednesday November 20:
- Power-laws and small-worlds
- Random graph models
- Triads and weak ties
- HITS and PageRank
Other resources:
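A minimal sketch of PageRank computed by power iteration on a small directed graph. The graph representation (a dictionary from each node to its list of out-links, with every node present as a key), the damping factor of 0.85, the fixed iteration count, and the toy graph are all assumptions of this sketch.

```python
def pagerank(graph, damping=0.85, iters=100):
    """Power-iteration PageRank.
    graph: dict mapping each node to a list of nodes it links to.
    Every link target must also appear as a key of `graph`."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Teleportation term: every node receives the same base score.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out_links = graph[v]
            if out_links:
                # Split this node's score evenly among its out-links.
                share = damping * rank[v] / len(out_links)
                for w in out_links:
                    new_rank[w] += share
            else:
                # Dangling node: spread its score over all nodes.
                share = damping * rank[v] / n
                for w in nodes:
                    new_rank[w] += share
        rank = new_rank
    return rank


# Hypothetical toy web graph.
graph = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
rank = pagerank(graph)
```

Node `d` has no incoming links, so its score stays at the teleportation baseline `(1 - damping) / n`, while `c`, which everything links to, ranks highest.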
9 | State-of-the-art Recommender Systems |
---|
No lecture | November 27 (Thanksgiving) |
---|
Monday November 25:
- State-of-the-art Recommender Systems
  - Bayesian Personalized Ranking
  - Factorizing Personalized Markov Chains for Next-Basket Recommendation
  - Personalized Ranking Metric Embedding for Next New POI Recommendation
- Real-world Applications
  - Recommending product sizes to customers
  - Playlist prediction via Metric Embedding
10 | Modeling Temporal and Sequence Data |
---|
Monday December 2 / Wednesday December 4:
- Sliding windows and autoregression
- Temporal dynamics in recommender systems
- Temporal dynamics in text and social networks
Code examples:
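A short sketch of a sliding-window average over a time series, together with the simplest possible autoregressive predictor (the mean of the last `k` observations, i.e. equal weights on each lag). The function names and the toy series are assumptions for illustration; a fitted autoregressive model would instead learn one weight per lag.

```python
def sliding_window_average(series, k):
    """Average of every length-k window, in O(len(series)) time
    by updating a running sum instead of re-summing each window."""
    if k <= 0 or k > len(series):
        raise ValueError("window size must be between 1 and len(series)")
    window_sum = sum(series[:k])
    averages = [window_sum / k]
    for i in range(k, len(series)):
        # Slide the window: add the new value, drop the oldest one.
        window_sum += series[i] - series[i - k]
        averages.append(window_sum / k)
    return averages


def predict_next(series, k):
    """Predict the next value as the mean of the last k observations."""
    return sliding_window_average(series, k)[-1]


# Hypothetical toy series.
series = [1, 2, 3, 4, 5]
smoothed = sliding_window_average(series, 3)
forecast = predict_next(series, 3)
```

The running-sum trick matters on long series: recomputing each window from scratch would cost O(k) per step.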