CS 190D – Topics in Database System Implementation
Lectures: MWF 6:00-6:50pm @ PCYNH 106
Instructor: Arun Kumar
Teaching Assistant: TBD
Class Mailing List: TBD
Course Goals and Content
This is a hands-on, systems-focused course on the implementation of a database management system (DBMS), in particular a relational DBMS (RDBMS). RDBMSs are the cornerstone of large-scale data management in numerous application domains that define our modern world, including finance, insurance, retail, logistics, telecommunications, healthcare, governance, and education. Furthermore, concepts developed in the context of RDBMSs underpin the so-called Big Data and NoSQL systems that were developed for newer applications such as Web search, e-commerce, social media, and advanced analytics.
This course will cover key systems topics in implementing an RDBMS: data storage, buffer management, indexing, sorting, relational operator implementations, and a brief treatment of query optimization, transaction management, and concurrency control. It will also cover the implementation of newer Big Data systems such as Spark and MapReduce/Hadoop, as well as distributed NoSQL/key-value stores, in-memory RDBMSs, and streaming DBMSs.
A major component of this course is hands-on C++ programming to implement two key components of an RDBMS, a buffer manager and a B+-Tree index, on top of a basic RDBMS skeleton that will be provided.