• This course schedule and reading list are tentative and might change as the quarter progresses.

  • The URL of the online review form and the review deadline are listed alongside each paper; reviews are due by 8:59 AM PST on the deadline date.

  • If no URL is provided, you do not need to submit a review for that paper, but it is still required reading and will be discussed during the lectures.

  • Papers without URLs that are also marked “(Optional)” are not required reading, but they might come up in the lectures, so at least skimming them would be helpful.

  • Please try to attend all the lectures and participate in the discussions.

Lecture Date | Topic / Paper | Review Form URL | Review Deadline
01/08 | Introduction and Overview | |
" | Data Management in Machine Learning: Challenges, Techniques, and Systems (Slide deck and Video) | |
Scalable ML Analytics Frameworks
01/10 | Towards a Unified Architecture for in-RDBMS Analytics | T1P1 | 01/10
" | The MADlib Analytics Library or MAD Skills, the SQL | |
01/15 | Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing | T1P2 | 01/15
" | MLlib: Scalable Machine Learning on Spark | |
01/17 | Scaling Distributed Machine Learning with the Parameter Server | T1P3 | 01/17
" | Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud | |
" | Scalability! But at what COST? (Optional) | |
Systems for Scalable Linear Algebra and Feature Engineering
01/22 | SystemML: Declarative machine learning on MapReduce | |
" | SystemML: Declarative Machine Learning on Spark | T2P1 | 01/22
" | Towards Linear Algebra over Normalized Data (From questionnaire) | |
01/24 | Materialization Optimizations for Feature Selection Workloads | |
" | To Join or Not to Join? Thinking Twice about Joins before Feature Selection | T2P2 | 01/24
Systems for Model Selection
01/29 | Model Selection Management Systems: The Next Frontier of Advanced Analytics | T3P1 |
" | Automating Model Search for Large Scale Machine Learning | | 01/29
Statistical Relational Learning Systems
01/31 | Extracting Databases from Dark Data with DeepDive | T4P1 | 01/31
" | Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS | |
02/05 | No class | |
Deep Learning Systems
02/07 | Densely Connected Convolutional Networks | T5P1 | 02/07
" | Understanding Neural Networks Through Deep Visualization (Video) | |
" | Visualizing and Understanding Recurrent Networks (Optional) | |
02/12 | TensorFlow: A System for Large-Scale Machine Learning (Talk Slides) | T5P2 | 02/12
02/19 | TVM: An Automated End-to-End Optimizing Compiler for Deep Learning | T5P3 | 02/19
" | Towards Unified Data and Lifecycle Management for Deep Learning | |
Hardware-conscious ML Systems
02/19 | From High-Level Deep Neural Models to FPGAs | T6P1 | 02/19
" | Communication-Efficient Learning of Deep Networks from Decentralized Data | |
Reinforcement Learning Systems
02/21 | Ray: A Distributed Framework for Emerging AI Applications | T7P1 | 02/21
" | Neural Architecture Search with Reinforcement Learning | |
Model Serving Systems
02/26 | Clipper: A Low-Latency Online Prediction Serving System | T8P1 | 02/26
" | MacroBase: Prioritizing Attention in Fast Data | |
Data Sourcing for ML
02/28 | Data Management Challenges in Production Machine Learning (Slide deck) | T9P1 | 02/28
" | The Data Linter: Lightweight, Automated Sanity Checking for ML Data Sets | |
03/05 | Snorkel: Rapid Training Data Creation with Weak Supervision | T9P2 | 03/05
" | Revolt: Collaborative Crowdsourcing for Labeling Machine Learning Datasets | |
ML Systems in Production
03/07 | Machine Learning: The High-Interest Credit Card of Technical Debt | T10P1 | 03/07
" | Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective | |
" | TFX: A TensorFlow-Based Production-Scale Machine Learning Platform (Slide Deck) | |
03/12 | Project Presentations | |
03/14 | Project Presentations | |