• This course schedule and reading list are tentative and might change as the quarter progresses.

  • The review form URLs and the review deadlines are listed alongside each paper; reviews are due by 11:59 PM PST on the deadline date.

  • If no URL is provided, you do not need to submit a review for that paper, but it is still required reading and will be discussed during the lectures.

  • Papers without URLs that are also marked as “(Optional)” are not required reading, but they might come up in the lectures, so at least skimming them would be helpful.

  • Please try to attend all the lectures and participate in the discussions.

Lecture Date | Topic / Paper | Review Form URL | Review Deadline
01/09 | Introduction and Overview | |
" | Data Management in Machine Learning: Challenges, Techniques, and Systems (Slide deck and Video) | |
" | Data Management Challenges in Production Machine Learning (Slide deck) | |
Scalable ML Analytics Frameworks
01/11 | Towards a Unified Architecture for in-RDBMS Analytics | T1P1 | 01/10
" | The MADlib Analytics Library or MAD Skills, the SQL | |
01/16 | Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing | T1P2 | 01/15
" | MLlib: Scalable Machine Learning on Spark | |
01/18 | Scaling Distributed Machine Learning with the Parameter Server | T1P3 | 01/17
" | Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud | |
" | Scalability! But at what COST? (Optional) | |
Systems for Scalable Linear Algebra and Feature Engineering
01/23 | SystemML: Declarative machine learning on MapReduce | |
" | SystemML: Declarative Machine Learning on Spark | T2P1 | 01/22
" | Towards Linear Algebra over Normalized Data (From questionnaire) | |
01/25 | Materialization Optimizations for Feature Selection Workloads | |
" | To Join or Not to Join? Thinking Twice about Joins before Feature Selection | T2P2 | 01/24
Systems for Model Selection
01/30 | Model Selection Management Systems: The Next Frontier of Advanced Analytics | |
" | Automating Model Search for Large Scale Machine Learning | T3P1 | 01/29
Statistical Relational Learning Systems
02/01 | Extracting Databases from Dark Data with DeepDive | T4P1 | 01/31
" | Incremental Knowledge Base Construction Using DeepDive | |
" | Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS (Optional) | |
Deep Learning Systems
02/06 | Densely Connected Convolutional Networks | T5P1 | 02/05
" | Understanding Neural Networks Through Deep Visualization (Video) | |
02/08 | No class | |
02/13 | Sequence to Sequence Learning with Neural Networks | |
" | Neural Turing Machines | T5P2 | 02/12
" | Visualizing and Understanding Recurrent Networks (Optional) | |
02/15 | TensorFlow: A System for Large-Scale Machine Learning (Talk Slides) | T5P3 | 02/14
" | Towards Unified Data and Lifecycle Management for Deep Learning | |
Reinforcement Learning Systems
02/20 | Ray: A Distributed Framework for Emerging AI Applications | T6P1 | 02/19
" | Neural Architecture Search with Reinforcement Learning | |
Hardware-conscious ML Systems
02/22 | From High-Level Deep Neural Models to FPGAs | T7P1 | 02/21
" | Communication-Efficient Learning of Deep Networks from Decentralized Data | |
Model Serving Systems
02/27 | Clipper: A Low-Latency Online Prediction Serving System | T8P1 | 02/26
" | MacroBase: Prioritizing Attention in Fast Data | |
Data Sourcing for ML
03/01 | ActiveClean: Interactive Data Cleaning While Learning Convex Loss Models | T9P1 | 02/28
" | The Data Linter: Lightweight, Automated Sanity Checking for ML Data Sets | |
03/06 | Snorkel: Rapid Training Data Creation with Weak Supervision | T9P2 | 03/05
" | Revolt: Collaborative Crowdsourcing for Labeling Machine Learning Datasets | |
ML Systems in Production
03/08 | Machine Learning: The High-Interest Credit Card of Technical Debt | T10P1 |
" | Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective | |
" | TFX: A TensorFlow-Based Production-Scale Machine Learning Platform | |
03/13 | Project Presentations | |
03/15 | Project Presentations | |