Reading and Schedule

Below are the schedule and reading list for this course.

Date Reading Lead
9/26 Course Introduction and the History of Virtualization
9/28 Background and Virtualization Overview
Comet Book Chapter on Virtual Machine Monitors
Additional Readings

  1. Formal Requirements for Virtualizable Third Generation Architectures (Comm ACM 1974)
  2. Disco: Running Commodity Operating Systems on Scalable Multiprocessors (TOCS'97)
  3. Scale and Performance in the Denali Isolation Kernel

10/3 Virtualizing CPU
A Comparison of Software and Hardware Techniques for x86 Virtualization (ASPLOS'06)

  1. Why is x86 un-virtualizable with trap-and-emulate? Give one example.
  2. How are jump instructions translated?
  3. With hardware virtualization extensions (e.g., Intel VT), do we still need binary translation? Why or why not?

Additional Readings

  1. The Evolution of an x86 Virtual Machine Monitor
  2. Software Techniques for Avoiding Hardware Virtualization Exits
  3. Embra: Fast and Flexible Machine Simulation
  4. Fast Dynamic Binary Translation for the Kernel
  5. Enabling Intel Virtualization Technology Features and Benefits

10/5 Virtualizing CPU (cont'd) and Virtualizing Memory - 1
Performance Evaluation of Intel EPT Hardware Assist

  1. List at least one pro and one con for software MMU
  2. List at least one pro and one con for hardware MMU

Additional Readings

10/10 Virtualizing Memory - 2
Memory Resource Management in VMware ESX Server (OSDI'02)

  1. What is the double paging problem and what caused it?
  2. Would a malicious guest OS (or a buggy one) be able to access memory that it has swapped out during ballooning? Why/why not?
  3. What is the benefit of keeping a "hint" entry for each scanned (but unshared) page (as compared to not maintaining anything for the page)?

Additional Readings

  1. Difference Engine: Harnessing Memory Redundancy in Virtual Machines

10/12 Virtualizing I/O
First three sections of virtio: Towards a De-Facto Standard For Virtual I/O Devices and
first three sections of High Performance Network Virtualization with SR-IOV and
Network Virtualization Overview

  1. Is virtio a full virtualization or a paravirtualization technique? What's its main benefit?
  2. List at least one limitation of SR-IOV
  3. What are the similarities and differences between network virtualization and traditional server virtualization?

Additional Readings

  1. vIC: Interrupt Coalescing for Virtual Machine Storage Device IO
  2. ELI: Bare-Metal Performance for I/O Virtualization
  3. Virtualizing I/O Devices on VMware Workstation's Hosted Virtual Machine Monitor (ATC'01)
  4. Network Virtualization in Multi-tenant Datacenters (NSDI'14)
  5. The Design and Implementation of Open vSwitch (NSDI'15)

10/17 Cloud Computing
Above the Clouds: A Berkeley View of Cloud Computing
Quiz 1

  1. Why do you think cloud computing has been a huge success and gained a majority of the IT market?
  2. What challenges mentioned in the Berkeley cloud paper do you think still remain today?
  3. If you could change one thing about the cloud with a magic wand, what would you change?

Additional Readings

10/19 Container Basics
Understanding and Hardening Linux Containers (mainly Ch 2 to Ch 5; you can ignore many of the details in these chapters. Read Ch 1 for more background on virtualization. Read other chapters if you are interested in security.)
Quiz 1

  1. What types of isolation do Linux containers achieve?
  2. Can one Linux container affect the performance of another Linux container on the same machine (i.e., performance isolation)? Why or why not?
  3. Why do you think containers are less "secure" than virtual machines?

Additional Readings

  1. LXC/LXD
  2. Docker
  3. Understanding Security Implications of Using Containers in the Cloud
  4. Container Security: Issues, Challenges, and the Road Ahead
  5. Slacker: Fast Distribution with Lazy Docker Containers

10/24 Serverless Computing - 1
Pages 3 to 8 of Cloud Programming Simplified: A Berkeley View on Serverless Computing, and a brief read of Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider (ATC'20)

  1. Today's serverless functions are stateless. How do you think different functions can share data and communicate?
  2. Can you think of any security threats of serverless computing? Bonus points if you can outline a real threat/attack.
  3. Can you think of any other ways to reduce or avoid cold starts in serverless computing (other than what the ATC'20 paper talks about)?

Additional Readings

  1. Amazon Lambda
  2. Google Cloud Functions
  3. Azure Functions
  4. Serverless Computing: Current Trends and Open Problems
  5. Serverless Workflows with Durable Functions and Netherite
  6. Serverless Computing: One Step Forward, Two Steps Back

10/26 Serverless Computing - 2
First three sections of Pocket: Elastic Ephemeral Storage for Serverless Analytics (OSDI'18) and the first three sections of Resource-Centric Serverless Computing

  1. Why isn't using existing in-memory key-value stores such as Redis and Memcached a good option for storing ephemeral data in serverless computing?
  2. List at least two main problems of Function as a Service.
  3. What applications do you think fit FaaS, stateful serverless (like Pocket), and Resource-based serverless (Scad)? List at least one for each.

Additional Readings

  1. ORION and the Three Rights: Sizing, Bundling, and Prewarming for Serverless DAGs
  2. Occupy the Cloud: Distributed Computing for the 99% (PyWren)
  3. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads
  4. SAND: Towards High-Performance Serverless Computing
  5. A Case for Serverless Machine Learning
  6. Archipelago: A Scalable Low-Latency Serverless Platform
  7. Cloudburst: Stateful Functions-as-a-Service

10/31 Kubernetes

  1. What is a Kubernetes Pod? How do you think it is useful in container orchestration?
  2. What does Kubernetes use etcd for? Why is having a consistent, atomic key-value store important for Kubernetes' control plane?

Additional Readings

  1. Borg, Omega, and Kubernetes (Google)
  2. The True Cost of Containing: A gVisor Case Study
  3. Container Isolation at Scale (Introducing gVisor) - Dawn Chen & Zhengyu He, Google
  4. Nabla Containers

Adyanth Hosavalike
11/2 gVisor and Unikernel
gVisor and Unikernels: Library Operating Systems for the Cloud (ASPLOS'13)

  1. Vulnerabilities in the Linux kernel make it unsafe for containers to call Linux system calls. How does gVisor solve this problem?
  2. Name one benefit and one drawback of compiling a single-image VM.
  3. Comparing gVisor and Unikernels, which one do you think is more secure and which is more lightweight?

Additional Readings

  1. Unikernels as Processes
  2. Unikernels are unfit for production
  3. Rethinking the Library OS from the Top-Down
  4. Mirage OS
  5. Nabla Containers
  6. ClickOS and the Art of Network Function Virtualization
  7. Libra: a library operating system for a JVM in a virtualized execution environment
  8. Exokernel: an operating system architecture for application-level resource management
  9. Dune: Safe User-level Access to Privileged CPU Features (OSDI'12)

Xuyang Cao
11/7 Amazon Firecracker
Firecracker: Lightweight Virtualization for Serverless Applications (NSDI'20)

  1. What is the benefit of Firecracker over gVisor in terms of the specific goals Amazon has for their cloud production environments?
  2. What mechanism(s) allow Firecracker to run thousands of MicroVMs on the same machine (with 10x-20x oversubscription rate)?
  3. Why do you think Firecracker (when deployed to power AWS Lambda) runs one process (one slot) in one MicroVM?

Additional Readings

  1. Amazon Firecracker Git repo
  2. Kata Containers

Chirag Dasannacharya and Anze Xie
11/9 Para-Virtualization
Xen and the Art of Virtualization (SOSP'03)
Quiz 2

  1. Why can Xen allow guest OS system call handlers to be accessed directly (without any ring-0 Xen involvement) but not the guest page fault handler?
  2. What's the benefit of using asynchronous event notifications from Xen to a VM?
  3. What goals of Xen are not valid or less valid in today's cloud environments?

Additional Readings

  1. Understanding Full Virtualization, Paravirtualization, and Hardware Assist
  2. Safe Hardware Access with the Xen Virtual Machine Monitor
  3. Optimizing Network Virtualization in Xen
  4. Measuring CPU Overhead for I/O Processing in the Xen Virtual Machine Monitor
  5. Breaking Up is Hard to Do: Security and Functionality in a Commodity Hypervisor (SOSP'11)

Twinkle Choudhary
11/14 KVM and QEMU
kvm: the Linux Virtual Machine Monitor,
and QEMU, a Fast and Portable Dynamic Translator (It's OK to not fully understand Section 2)

  1. What is the implication of KVM forwarding I/O requests to the user space?
  2. What is the benefit of QEMU first translating the source instructions (guest) into micro-operations implemented in C and their compiled object files and then translating the object files into the target instructions (host)?
  3. Can you think of some good use cases for QEMU+KVM?

Additional Readings

  1. KVM Documentation

Annotated Slides
11/16 Security
When Virtual is Harder than Real: Security Challenges in Virtual Machine Based Computing Environments (HotOS'05)
and Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds (CCS'09)

  1. Can you think of some drawbacks of enforcing security mechanisms at the hypervisor level (compared to at the guest OS or above)?
  2. When a zone and/or an instance type is used more frequently (i.e., having higher loads from more tenants), do you think the co-location attack would become easier or harder? Why?
  3. Do you think a similar co-location attack exists with serverless computing (i.e., one function attacking another function on the same physical machine)? Does serverless computing make such attacks harder or easier, and why?

Additional Readings

  1. Secure Container Isolation: Problem Statement & Solution Space
  2. When Virtual Is Better Than Real (HotOS'01)
  3. Secure Pods: Sandboxing workloads in Kubernetes
  4. TrustVisor: Efficient TCB Reduction and Attestation
  5. SecVisor: A Tiny Hypervisor to Provide Lifetime Kernel Code Integrity for Commodity OSes (SOSP'07)
  6. Breaking Up is Hard to Do: Security and Functionality in a Commodity Hypervisor (SOSP'11)
  7. InkTag: Secure Applications on an Untrusted Operating System (ASPLOS'13)
  8. Overshadow: A Virtualization-Based Approach to Retrofitting Protection in Commodity Operating Systems (ASPLOS'08)
  9. VirtuOS: An Operating System with Kernel Virtualization
  10. SCONE: Secure Linux Containers with Intel SGX
  11. Understanding Security Implications of Using Containers in the Cloud (ATC'17)
  12. Container Security: Issues, Challenges, and the Road Ahead

Raghav Prasad and Priyanka Haresh Bhatia
11/21 New Cloud Infrastructure
Amazon Nitro (esp. the video talk on that page)

  1. With Amazon Nitro, virtualization functions are mostly offloaded to hardware. Do we still need a hypervisor (or an OS)? Can everything just run in user space and interact with Nitro cards directly?
  2. Can you think of a drawback of offloading tasks to hardware (i.e., Nitro's approach)?

Additional Readings

  1. Intel Unveils Infrastructure Processing Unit

Lavanya Karthikeyan
11/23 Next-Generation Cloud
User-Defined Cloud (HotOS'21) and From Cloud Computing to Sky Computing (HotOS'21)

  1. Other than the examples given in the UDC paper, can you think of another cloud use case that could benefit from UDC? How exactly would it benefit?
  2. What do you think is the biggest obstacle for cloud users to adopt sky computing? What about for cloud providers?

Additional Readings

11/28 Course Summary
Hints for Computer System Design - Butler Lampson
Quiz 3

Read the "Hints for Computer System Design" paper and summarize what you have learned over the course. Feel free to comment on anything else about the course.

Yiying, Shivani Hariprasad, Shreyas Anantha-ramaprasad
11/30 Project Presentations