Reading and Schedule

Below are the tentative schedule and reading list for this course.

Date Reading Lead
9/27 Datacenter Overview
The Datacenter as a Computer -- An Introduction to the Design of Warehouse-Scale Machines (Ch 1, 2, 6, 7; briefly Ch 3, 4, 5)
Questions

No need to submit anything for this reading

Additional Readings

  1. Building Large-Scale Internet Services (Google)

Yiying
Slides
10/2 Cloud Overview
Above the Clouds: A Berkeley View of Cloud Computing
Questions

  1. Name three pros and three cons of cloud computing.
  2. Despite the obstacles listed in the paper, cloud computing has happened and is now almost everywhere in our lives. What do you think are the fundamental reasons behind its success?
  3. What do you think is the future of cloud computing?

Additional Readings

  1. Amazon AWS
  2. Microsoft Azure
  3. Google Cloud Platform (GCP)
  4. XaaS article 1
  5. XaaS article 2

Yiying
Slides
10/4 Virtualization
Comet Book Chapter on Virtual Machine Monitors
Questions

  1. During normal application run time (when the application does not cause any traps), does running the application in a VM have any performance overhead?
  2. How can the VMM know when to install a shadow page table entry? What exactly happens when a VM wants to create a new page table entry on a hardware-managed-TLB platform? (A small sketch of shadow-page-table maintenance follows this list.)
  3. Is there any way to reduce the overhead of the return path of a trap (steps 3, 4, 5 in Figure B.3)?
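
For question 2, here is a minimal, simulated sketch of how a VMM might keep a shadow page table in sync with the guest's page table. All names (guest_pt, p2m, shadow_pt, trap_on_guest_pt_write) are hypothetical; a real VMM traps the guest's page-table writes by write-protecting the guest's page-table pages, which is only modeled here.

```python
# A toy model of shadow paging, not any real VMM's implementation.
guest_pt = {}    # guest virtual page -> guest "physical" frame (guest-visible)
p2m = {}         # guest "physical" frame -> host physical frame (VMM-private)
shadow_pt = {}   # guest virtual page -> host physical frame (used by the MMU)

def host_frame_for(gpa):
    # Lazily back a guest frame with a host frame; a stand-in for a real allocator.
    return p2m.setdefault(gpa, "host_frame_%d" % len(p2m))

def trap_on_guest_pt_write(gva, gpa):
    """Invoked when the guest's attempt to install gva -> gpa in its own page
    table faults (because the VMM write-protected the guest's page-table pages).
    The VMM emulates the write and installs the matching shadow entry."""
    guest_pt[gva] = gpa                    # emulate the guest's intended update
    shadow_pt[gva] = host_frame_for(gpa)   # keep the shadow table consistent

# Example: the guest maps virtual page 0x4 to its "physical" frame 0x9.
trap_on_guest_pt_write(0x4, 0x9)
print(shadow_pt[0x4])   # the host frame the hardware MMU will actually use
```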

Additional Readings

  1. Memory Resource Management in VMware ESX Server
  2. Disco: Running Commodity Operating Systems on Scalable Multiprocessors (TOCS'97)
  3. Scale and Performance in the Denali Isolation Kernel
  4. Xen and the Art of Virtualization
  5. Difference Engine: Harnessing Memory Redundancy in Virtual Machines
  6. The Turtles Project: Design and Implementation of Nested Virtualization
  7. vIC: Interrupt Coalescing for Virtual Machine Storage Device IO
  8. ELI: Bare-Metal Performance for I/O Virtualization
  9. A Comparison of Software and Hardware Techniques for x86 Virtualization
  10. Software Techniques for Avoiding Hardware Virtualization Exits
  11. Live Migration of Virtual Machines
  12. Remus: High Availability via Asynchronous Virtual Machine Replication

Yiying Slides
10/9 Container
Understanding and Hardening Linux Containers (mainly Ch 2 to Ch 5; you can ignore many of the details in these chapters. Read Ch 1 for more background on virtualization. Read other chapters if you are interested in security.)
Questions

  1. What types of isolation do Linux containers achieve?
  2. Can one Linux container affect the performance of another Linux container on the same machine (i.e., is there performance isolation)? Why or why not? (See the cgroup sketch after this list.)
  3. Why do you think containers are less "secure" than virtual machines?
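
For question 2, here is a hedged sketch of one mechanism behind performance isolation: cgroup v2 CPU and memory limits. It assumes root privileges, a cgroup v2 hierarchy mounted at /sys/fs/cgroup, and an arbitrary group name "demo"; it is illustrative, not a hardened container runtime.

```python
import os

CG = "/sys/fs/cgroup/demo"   # hypothetical group name; requires cgroup v2 and root
# (The cpu and memory controllers must be enabled in the parent's
#  cgroup.subtree_control for these limit files to exist.)

def setup_cgroup(cpu_quota_us=20000, cpu_period_us=100000, mem_bytes=256 * 2**20):
    os.makedirs(CG, exist_ok=True)
    # cpu.max = "<quota> <period>": at most 20 ms of CPU every 100 ms (~20% of a core).
    with open(os.path.join(CG, "cpu.max"), "w") as f:
        f.write(f"{cpu_quota_us} {cpu_period_us}")
    # memory.max: hard memory cap; the kernel reclaims or OOM-kills beyond it.
    with open(os.path.join(CG, "memory.max"), "w") as f:
        f.write(str(mem_bytes))

def enter_cgroup(pid=None):
    # Writing a PID to cgroup.procs moves that process (and its future children)
    # under the group's limits.
    with open(os.path.join(CG, "cgroup.procs"), "w") as f:
        f.write(str(pid if pid is not None else os.getpid()))

if __name__ == "__main__":
    setup_cgroup()
    enter_cgroup()
    # ... run the workload to be isolated here ...
```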

Additional Readings

  1. LXC/LXD
  2. Docker
  3. Kubernetes
  4. Unikernels: Library Operating Systems for the Cloud
  5. My VM is Lighter (and Safer) than your Container
  6. Borg, Omega, and Kubernetes (Google)
  7. Slacker: Fast Distribution with Lazy Docker Containers
  8. Amazon Fargate
  9. Kata Containers

Yiying Slides
10/11 Serverless
Cloud Programming Simplified: A Berkeley View on Serverless Computing (alternative link)
Questions

  1. Current datacenters use containers as the hosts to run serverless functions. Do you think that is a good approach? Why or why not?
  2. Today's serverless functions are stateless. How do you think different functions can share data and communicate? (See the sketch after this list.)
  3. Can you think of any security threats to serverless computing? Bonus points if you can outline a real threat/attack.
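
For question 2, here is a minimal sketch of why stateless functions must communicate through external services. The handler follows the common Lambda-style def handler(event, context) shape, but FakeStore is only an in-process stand-in for a real shared store (object storage, a key-value service, a queue, etc.).

```python
import json

class FakeStore:
    """Stand-in for an external shared store; in production this would be an
    object store or key-value service, since local function state is not
    guaranteed to survive across invocations."""
    def __init__(self):
        self._blobs = {}
    def get(self, key, default=None):
        return self._blobs.get(key, default)
    def put(self, key, value):
        self._blobs[key] = value

store = FakeStore()   # stands in for the only place state can safely live

def handler(event, context):
    # Anything a later invocation (or a different function) needs must be
    # written to shared storage, not kept in process memory.
    count = store.get("invocations", 0) + 1
    store.put("invocations", count)
    return {"statusCode": 200, "body": json.dumps({"count": count})}

# Local test: two invocations "communicate" only through the store.
print(handler({}, None))
print(handler({}, None))
```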

Additional Readings

  1. Amazon Lambda
  2. Google Cloud Functions
  3. Azure Functions
  4. Amazon Firecracker
  5. Pocket: Elastic Ephemeral Storage for Serverless Analytics (OSDI'18)
  6. Occupy the Cloud: Distributed Computing for the 99% (PyWren)
  7. SAND: Towards High-Performance Serverless Computing
  8. Taking the Cloud-Native Approach with Microservices
  9. Microservices by James Lewis and Martin Fowler
  10. Introduction to Microservices by Nginx

Lihao, Zhipeng, Yihan
10/16 Resource Disaggregation
LegoOS: A Disseminated, Distributed OS for Hardware Resource Disaggregation (OSDI'18)
Questions

  1. What are the major benefits and weaknesses of resource disaggregation?
  2. List the steps that happen in LegoOS when an application allocates new virtual memory (e.g., by calling malloc) and the steps that happen when it first accesses that memory. (A generic sketch of this access path follows this list.)
  3. Do you think it is a good idea to build serverless systems on top of a resource-disaggregated datacenter? Why or why not? (Bonus points for answering "how to build one?")
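
For question 2, here is a generic model (not LegoOS's actual protocol) of the access path on a compute node that keeps a small local cache and fetches missing pages from a remote memory node: allocation only reserves the virtual range, and the first touch pays a network round trip.

```python
PAGE = 4096
allocated = set()     # virtual pages reserved by malloc/mmap (no backing yet)
local_cache = {}      # virtual page -> data cached at the compute node
memory_node = {}      # virtual page -> data held at the remote memory node

def alloc(vaddr, size):
    # Allocation only reserves virtual pages; no memory is committed anywhere.
    for page in range(vaddr // PAGE, (vaddr + size - 1) // PAGE + 1):
        allocated.add(page)

def access(vaddr):
    page = vaddr // PAGE
    assert page in allocated, "fault: unallocated address"
    if page not in local_cache:
        # Local miss: one round trip to the memory node, which allocates the
        # page on first touch and returns it to be cached locally.
        memory_node.setdefault(page, bytearray(PAGE))
        local_cache[page] = memory_node[page]
    return local_cache[page]

alloc(0x10000, 8192)      # "malloc": reserve two virtual pages
access(0x10004)           # first touch: remote allocate + fetch
access(0x10008)           # later touches to the same page hit the local cache
```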

Additional Readings

  1. Disaggregated Memory for Expansion and Sharing in Blade Servers
  2. Scale-Out NUMA
  3. Shoal: A Lossless Network for High-density and Disaggregated Racks
  4. Flash Storage Disaggregation
  5. Understanding Rack-Scale Disaggregated Storage
  6. R2C2: A Network Stack for Rack-scale Computers

Zhiyuan
10/18 Historical
The Amoeba Distributed Operating System - A Status Report

Security
A Systematic Evaluation of Transient Execution Attacks and Defenses
Towards Trusted Cloud Computing
Questions

  1. What are the targeted use cases of Amoeba?
  2. Name at least two advantages and one disadvantage of Amoeba's processor pool + specialized servers model.
  3. Why does Amoeba choose to use "immutable" files? What are the advantages and what type of workloads can benefit from this design?
  1. Choose one of the transient execution attack scenarios listed in the first paper and develop a realistic attack scenario in the cloud on top of it. Specifically, your victim and attacker should both be cloud users. Describe who the victim and attacker are, how the attacker does harm with transient execution, and what the outcome of the attack is. You do not need to develop any real attack; just think of a story.
  2. Choose two defenses discussed in the two papers and discuss their implications for application performance (compared to no defense).
  3. After reading these two papers, do you trust cloud more? Do you think security will remain a key challenge in cloud computing?

Additional Readings

  1. The Sprite Network Operating System
  2. A Comparison of Two Distributed Systems: Amoeba and Sprite
  3. Distributed Shared Memory: A Survey of Issues and Algorithms
  4. Distributed Shared Memory: Concepts and Systems
  5. Shasta: A Low Overhead, Software-Only Approach for Supporting Fine-Grain Shared Memory
  1. Amazon Web Services: Overview of Security Processes (Choose any topics you find interesting to read)
  2. Meltdown: Reading Kernel Memory from User Space
  3. Spectre Attacks: Exploiting Speculative Execution
  4. On the Meltdown & Spectre Design Flaws (by Mark Hill)
  5. Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds
  6. Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors
  7. Intel SGX Explained
  8. Flush+Reload: a High Resolution, Low Noise, L3 Cache Side-Channel Attack
  9. Flush+Flush: A Fast and Stealthy Cache Attack
  10. Cache Template Attacks: Automating Attacks on Inclusive Last-Level Caches
  11. Cache attacks and countermeasures: the case of AES
  12. Last-Level Cache Side-Channel Attacks are Practical
  13. Throwhammer: Rowhammer Attacks over the Network and Defenses
  14. Pythia: Remote Oracles for the Masses
  15. ObliviStore: High Performance Oblivious Cloud Storage
  16. Shroud: Ensuring Private Access to Large-Scale Data in the Data Center
  17. TaoStore: Overcoming Asynchronicity in Oblivious Data Storage

Audrey, Zhipeng
10/23 Consensus
ZooKeeper: Wait-free coordination for Internet-scale systems (ATC'10)
Questions

  1. What is the goal of ZooKeeper? Why does ZooKeeper want to support asynchronous (or wait-free) requests? (A sketch of the paper's lock recipe follows this list.)
  2. Why is ordering of wait-free requests important in ZooKeeper?
  3. Why does ZooKeeper use replicated databases with snapshots and write-ahead logs? Is it enough to just store everything in memory?
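
For question 1, here is a sketch of the paper's "Simple Locks without Herd Effect" recipe, written against a hypothetical client object zk; the method names (create, get_children, wait_for_delete, delete) are illustrative, not a real client library's API.

```python
def acquire_lock(zk, lock_path="/lock"):
    # 1. Create an ephemeral, sequential znode under the lock node; the server
    #    appends a monotonically increasing sequence number to the name.
    me = zk.create(lock_path + "/lock-", ephemeral=True, sequential=True)
    my_name = me.split("/")[-1]
    while True:
        children = sorted(zk.get_children(lock_path))
        if my_name == children[0]:
            return me                      # lowest sequence number: lock acquired
        # 2. Watch only the znode immediately preceding ours; when it goes away
        #    we re-check, which avoids waking every waiter (no herd effect).
        prev = children[children.index(my_name) - 1]
        zk.wait_for_delete(lock_path + "/" + prev)

def release_lock(zk, me):
    zk.delete(me)   # or let the session expire: ephemeral znodes are removed
```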

Additional Readings

  1. The Chubby Lock Service for Loosely-Coupled Distributed Systems (OSDI'06)
  2. In Search of an Understandable Consensus Algorithm (ATC'14)
  3. Paxos Made Simple
  4. Chain Replication for Supporting High Throughput and Availability (OSDI'04)
  5. The Dangers of Replication and a Solution (SIGMOD'96)
  6. Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System (SOSP'95)
  7. The Byzantine Generals Problem
  8. Byzantine Generals in Action: Implementing Fail-Stop Processors
  9. Weighted Voting for Replicated Data
  10. Time, Clocks, and the Ordering of Events in a Distributed System
  11. Viewstamped Replication: A New Primary Copy Method to Support Highly-Available Distributed Systems
  12. Consensus on Transaction Commit

Wenquan, Chih-hung
10/25 Storage
The Google File System (SOSP'03)
Dynamo: Amazon’s Highly Available Key-value Store
Questions

  1. What design decisions in GFS still make sense after a decade? What do you think were bad decisions?
  2. Why does GFS map full path names to metadata instead of maintaining directories the way traditional file systems do?
  3. What is eventual consistency, and why did Amazon choose this consistency level?
  4. How and when does Dynamo resolve conflicts? Why do you think Amazon and Google chose their respective consistency levels and storage models? (A small vector-clock sketch follows this list.)
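
For question 4, here is a small, self-contained sketch of the vector-clock reasoning Dynamo uses to detect conflicting versions; the replica names and merge policy are generic, not Dynamo's exact implementation.

```python
def descends(a, b):
    """True if clock a is a (possibly equal) descendant of clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def reconcile(versions):
    """Keep only versions not strictly dominated by another; anything left is
    a true conflict the application must merge (e.g., union of two carts)."""
    return [v for v in versions
            if not any(descends(w, v) and not descends(v, w) for w in versions)]

# Two replicas accepted writes concurrently: neither clock descends from the other.
v1 = {"replica_A": 2, "replica_B": 1}
v2 = {"replica_A": 1, "replica_B": 2}
print(descends(v1, v2), descends(v2, v1))   # False False -> concurrent versions
print(reconcile([v1, v2]))                  # both survive; the app must merge them
```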

Additional Readings

  1. GFS: Evolution on Fast-forward
  2. Finding a Needle in Haystack: Facebook's Photo Storage
  3. Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency
  4. Fast Crash Recovery in RAMCloud
  5. Cassandra - A Decentralized Structured Storage System
  6. TAO: Facebook’s Distributed Data Store for the Social Graph

Kaiqi, Qianqian, Zhanghan
10/30 Database
Choosing A Cloud DBMS: Architectures and Tradeoffs (VLDB 2019)
Questions

  1. Think of two metrics that you think are valuable to measure but the paper did not measure.
  2. Which database system among the ones tested in this paper do you think fits the serverless computing model best? Why?
  3. If you are building a website and have your web service running on AWS, which database system from the paper would you choose to store customer account information? Which would you choose for shopping cart data? Why?

Additional Readings

  1. Spanner: Google’s Globally-Distributed Database
  2. Bigtable: A Distributed Storage System for Structured Data
  3. Spark SQL: Relational Data Processing in Spark
  4. The Snowflake Elastic Data Warehouse
  5. Transaction Management in the R* Distributed Database Management System

Saurabh
11/1 Networking
A Scalable, Commodity Data Center Network Architecture (SIGCOMM'08)
A Clean Slate 4D Approach to Network Control and Management
Questions

  1. FatTree has many benefits and is thus widely deployed in many datacenters. What do you think is the main reason for its success? Can you think of a disadvantage/limitation of FatTree?
  2. FatTree's two-level routing is largely static (i.e., the path is decided mostly by the destination host IP) and FatTree does not do any congestion control. Do you think this can work well in real datacenters? (A small routing-table sketch follows this list.)
  3. In a way, the SDN approach resembles classical distributed systems (e.g., think of GFS). Which distributed-systems problems does SDN also face? Which distributed-systems problems does SDN not have? And which problems does SDN have that classical distributed systems do not? Give one or two examples for each.
  4. Software-defined datacenters and software-defined storage (and other software-defined things) have been hot topics in recent years (ref: the first two papers in the recommended readings). Do you think "software-defined" is just a buzzword, or do we really have a similar problem to solve and a similar approach to apply in other parts of the datacenter? Give a reason why you do or do not believe there should be "software-defined storage" (some call it SDS).
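
For question 2, here is a sketch of the two-level lookup behind FatTree's mostly static routing: a terminating prefix match keeps intra-pod traffic local, and a suffix match on the host ID deterministically spreads upward traffic across uplinks. The addresses and port numbers are made up for a small k=4 example and are not the paper's exact tables.

```python
import ipaddress

# (terminating prefix, output port): traffic to subnets in this pod stays local.
prefix_table = [
    (ipaddress.ip_network("10.2.0.0/24"), 0),
    (ipaddress.ip_network("10.2.1.0/24"), 1),
]
# (host-ID suffix, output port): everything else goes up, spread by the last
# octet of the destination (hosts are numbered .2 and .3 in a k=4 fat tree).
suffix_table = {2: 2, 3: 3}

def route(dst):
    dst = ipaddress.ip_address(dst)
    for net, port in prefix_table:
        if dst in net:
            return port                    # prefix hit: forward within the pod
    return suffix_table[int(dst) & 0xFF]   # prefix miss: pick an uplink by host ID

print(route("10.2.1.3"))   # intra-pod destination -> port 1
print(route("10.3.0.3"))   # inter-pod destination -> uplink chosen by suffix .3
```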

Additional Readings

  1. PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric
  2. Data Center TCP (DCTCP)
  3. Understanding Lifecycle Management Complexity of Datacenter Topologies
  4. OpenFlow Enabling Innovation in Campus Networks
  5. Onix: A Distributed Control Platform for Large-scale Production Networks
  6. FBOSS: Building Switch Software at Scale (Facebook)
  7. A Large Scale Study of Data Center Network Reliability (Facebook)
  8. TIMELY: RTT-based Congestion Control for the Datacenter
  9. Technology-Driven, Highly-Scalable Dragonfly Topology
  10. Chronos: Predictable Low Latency for Data Center Applications
  11. Jellyfish: Networking Data Centers Randomly
  12. Flattened Butterfly: A Cost-Efficient Topology for High-Radix Networks
  13. U-Net: A User-Level Network Interface for Parallel and Distributed Computing
  14. Helios: A Hybrid Electrical/Optical Switch Architecture for Modular Data Centers
  15. Leveraging Endpoint Flexibility in Data-Intensive Clusters
  16. Software Defined Batteries
  17. Reading list of SDN

Rui, Jiaxiang
11/6 Remote Memory
FaRM: Fast Remote Memory (NSDI'14)
Questions

  1. Why does FaRM use large 2GB pages?
  2. How many network round trips does a (not read-only) transaction take in FaRM?
  3. What are epochs used for in FaRM?

Additional Readings

  1. LITE Kernel RDMA Support for Datacenter Applications
  2. Remote Regions: a Simple Abstraction for Remote Memory
  3. Efficient Memory Disaggregation with Infiniswap
  4. HPE Memory-Driven Computing
  5. Using RDMA Efficiently for Key-value Services
  6. FaSST: Fast, Scalable and Simple Distributed Transactions with Two-Sided (RDMA) Datagram RPCs
  7. Using Onesided RDMA Reads to Build a Fast, CPU-efficient Key-value Store
  8. Datacenter RPCs can be General and Fast (NSDI'19)
  9. A Double-Edged Sword: Security Threats and Opportunities in One-Sided Network Communication
  10. Deconstructing RDMA-enabled Distributed Transactions: Hybrid is Better!

Jie, Yi, Hao
11/8 Resource Management
Large-scale cluster management at Google with Borg (EuroSys'15)
Questions

  1. How does Google use quotas to apply different policies to high- and low-priority jobs?
  2. Would the Borgmaster be a scalability bottleneck?
  3. In general, do you think the resource management problem is hard? Do you think "smarter" mechanisms like machine learning would be a better solution?

Additional Readings

  1. Borg, Omega, and Kubernetes (Google)
  2. Resource Central: Understanding and Predicting Workloads for Improved Resource Management in Large Cloud Platforms (SOSP'17)
  3. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
  4. Apache Hadoop YARN: Yet Another Resource Negotiator
  5. Resource Control @ FB

Haolan, Anmol
11/13 Dataflow
MapReduce: Simplified Data Processing on Large Clusters (OSDI'04)
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing (NSDI'12)
Questions

  1. MapReduce uses a very simple programming model with just two types of functions: map and reduce. Do you think this abstraction is enough to express and implement most datacenter applications? Can you give an example that is difficult to write with MapReduce? (A minimal word-count sketch follows this list.)
  2. MapReduce is probably the most widely adopted idea from a systems paper in the past decade. What do you think are the reasons behind this?
  3. Spark gained a lot of attention and usage in the last few years. Compared to MapReduce, what technical advantages does the Spark system have? What about broader reasons behind its success?
  4. MapReduce was implemented in C++. Hadoop (open source version of MapReduce) was implemented in Java. Spark was implemented in Scala. Why do you think they made the decisions to use these languages?
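
For question 1, here is the paper's canonical word-count example condensed into plain Python, with the shuffle/grouping a real MapReduce runtime performs between the two phases simulated by a dictionary. It is meant to make the two-function model concrete, not to reflect any real framework's API.

```python
from collections import defaultdict

def map_fn(_doc_name, contents):
    # Map: emit an intermediate (word, 1) pair for every word in the document.
    for word in contents.split():
        yield word, 1

def reduce_fn(word, counts):
    # Reduce: sum all counts emitted for the same word.
    yield word, sum(counts)

def run_mapreduce(inputs):
    # "Shuffle": group intermediate values by key, as the framework would do
    # across machines before invoking the reducers.
    groups = defaultdict(list)
    for doc_name, contents in inputs:
        for word, count in map_fn(doc_name, contents):
            groups[word].append(count)
    return dict(pair for word, counts in groups.items()
                for pair in reduce_fn(word, counts))

print(run_mapreduce([("doc1", "the cloud the datacenter"), ("doc2", "the cloud")]))
# -> {'the': 3, 'cloud': 2, 'datacenter': 1}
```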

Additional Readings

  1. Dryad: Distributed Data-parallel Programs from Sequential Building Blocks (Microsoft)
  2. DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language

Minxiang, Li-An
11/15 Systems and Machine Learning
A Berkeley View of Systems Challenges for AI
Questions

  1. Can you think of a way to run composable AI on a serverless computing platform? What would you put into a serverless function? What data (state) needs to be communicated/stored?
  2. Could dataflow systems like Hadoop and Spark be used to implement ML training/inference? Do you think that is a good idea?
  3. Other than the challenges mentioned in the paper, could you list two other challenges of AI/ML?

Additional Readings

  1. TensorFlow: A System for Large-Scale Machine Learning (OSDI'16)
  2. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning (OSDI'18)
  3. Ray: A Distributed Framework for Emerging AI Applications
  4. Project Adam: Building an Efficient and Scalable Deep Learning Training System
  5. Large Scale Distributed Deep Networks
  6. Scaling Distributed Machine Learning with the Parameter Server
  7. Mastering the game of Go with deep neural networks and tree search
  8. Deepmind Publications
  9. Playing Atari with Deep Reinforcement Learning

Side, Siman, Palash
11/20 Streaming / Video
SVE: Distributed Video Processing at Facebook Scale
Questions

  1. The paper does not have any consistency discussion. Can there be any consistency issues (e.g., with concurrent data accesses or parallel processing)? Why or why not?
  2. During encoding, there is one step that cannot be parallelized. What is it?
  3. Why is livestreaming a mismatch for SVE?
  4. Do you think video processing is a good use case of serverless computing? How would you design a video processing system on a serverless computing platform?

Additional Readings

  1. Popularity Prediction of Facebook Videos for Higher Quality Streaming
  2. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads
  3. Discretized Streams: Fault-Tolerant Streaming Computation at Scale (Spark Streaming)
  4. Taiji: Managing Global User Traffic for Large-Scale Internet Services at the Edge (Facebook)
  5. StreamScope: Continuous Reliable Distributed Processing of Big Data Streams (Microsoft)

Ryan, Jingwen, Weiwei
11/22 Hardware
A Cloud-Scale Acceleration Architecture (Microsoft FPGA)
Amazon Nitro (esp. the video talk on that page)
Questions

  1. One of the unique designs of Catapult V2 (this paper) is to place the FPGA as a "bump in the wire". Discuss the pros and cons of this design vs. 1) a design that swaps the locations of the NIC and the FPGA, and 2) a design without a NIC.
  2. Microsoft uses FPGAs for its own datacenter services but does not offer them directly to cloud customers. On the other hand, Amazon and Alibaba both offer FPGAs as a cloud service (called F1 and F3, respectively). Do you think FPGAs should be used only for datacenter-internal purposes (like Microsoft), only as a cloud offering, or both? Choose one of these three options to advocate for and briefly discuss its potential challenges and benefits.
  3. With Amazon Nitro, virtualization functions are mostly offloaded to hardware. Do we still need a hypervisor (or an OS)? Can everything just run in user space and interact with the Nitro cards directly?
  4. Instead of building different ASIC cards for different functionalities (the approach Amazon is taking with Nitro), one could also use the same FPGA cards configured differently for different functionalities (the Microsoft approach). Discuss the pros and cons of the two approaches (e.g., performance, dollar cost, etc.).

Additional Readings

  1. A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services (Microsoft Catapult V1)
  2. In-Datacenter Performance Analysis of a Tensor Processing Unit (ISCA'17)
  3. KV-Direct: High-Performance In-Memory Key-Value Store with Programmable NIC
  4. Azure Accelerated Networking: SmartNICs in the Public Cloud
  5. FPGAs in the Cloud: Should you Rent or Buy FPGAs for Development and Deployment?

Shu-Ting, Yizhou, Xuhao
12/4 Case Study: Databricks and an Interview with Ali Ghodsi (Databricks CEO)
Course Summary
Hints for Computer System Design -- Butler Lampson
Questions

Read the "Hints for Computer System Design" paper and summarize what you have learned over the course. Feel free to write about anything else you want to comment on the course.

Additional Readings

  1. Databricks Blog

Yiying
12/6 Project Presentations