CSE 190D – Topics in Database System Implementation

!!! This website is archived. Please see the website of the latest edition of this course among the links listed here. !!!

Lectures: MWF 6:00-6:50pm @ PCYNH 106

Instructor: Arun Kumar

  • Email: arunkk [at] eng.ucsd.edu

  • Office: Room 3218 EBU3b (CSE building)

  • Office Hours: Thu 2:00-3:00pm

Teaching Assistant: Digvijay Karamchandani

  • Email: dkaramch [at] eng.ucsd.edu

  • Office: Room b260a (CSE building basement)

  • Office Hours: Fri 4:00-5:00pm

Course Goals and Content

This is a hands-on, systems-focused course on the implementation of database management systems (DBMSs), especially relational DBMSs (RDBMSs). RDBMSs are the cornerstone of large-scale data management in numerous application domains that define our modern world, including finance, insurance, retail, logistics, telecommunications, healthcare, governance, and education. Furthermore, concepts developed in the context of RDBMSs are indispensable underpinnings of the so-called Big Data and NoSQL systems that were developed for new applications such as Web search, e-commerce, social media, and advanced analytics.

This course will cover key systems topics in implementing an RDBMS: data storage, buffer management, indexing, sorting, relational operator implementations, a bit of query optimization, and a bit of transaction management and concurrency control. The implementation of newer Big Data systems such as Spark and MapReduce/Hadoop, as well as distributed NoSQL/key-value stores, in-memory RDBMSs, and streaming DBMSs will likely be covered too.

A major component of this course is hands-on C++ programming to implement two key components of an RDBMS, a buffer manager and a B+-Tree index, on top of a basic RDBMS skeleton that will be provided.

Course Format

  • The class meets 3 times a week for 50-minute lectures.

  • A midterm exam and a final exam.

  • 2 programming projects. Students can work on projects either individually or in teams of 2. Students should email their team decisions to the TA and instructor before 11:59pm Monday 04/17. All remaining students will be assigned to teams randomly by the TA.

  • A few short ungraded quizzes.

Prerequisites

  • CSE 132A is absolutely essential. CSE 120 and 132B will likely be helpful.

  • You should know, or be willing to learn quickly by yourself, the programming language C++ for the projects. Here is a good C++ tutorial.

Textbooks

  • Recommended: Database Management Systems (3rd edition), by Raghu Ramakrishnan and Johannes Gehrke (aka the "cow book").

  • Additional (optional): Database Systems: The Complete Book (2nd edition), by Hector Garcia-Molina, Jennifer Widom, and Jeffrey Ullman.

Grading

  • Project 1: 20%

  • Project 2: 30%

  • Midterm Exam: 20%

  • Final Exam: 30%

Cutoffs

These cutoffs on the total score are a minimum guarantee on the grade: a total at or above a cutoff earns at least the corresponding grade. The instructor might lower the thresholds later but will not raise them.

Cutoff (>= x)   Grade
96              A+
90              A
86              A-
82              B+
80              B
76              B-
72              C+
70              C
65              C-
55              D
< 55            F
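
As a rough illustration of how the grading weights and the cutoffs combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that no rounding is applied; neither detail is stated on this page. The function names and the example student are hypothetical.

```python
# Weights from the Grading section (percent of the total score).
WEIGHTS = {"project1": 20, "project2": 30, "midterm": 20, "final": 30}

# (cutoff, grade) pairs from the Cutoffs table, highest first.
CUTOFFS = [(96, "A+"), (90, "A"), (86, "A-"), (82, "B+"), (80, "B"),
           (76, "B-"), (72, "C+"), (70, "C"), (65, "C-"), (55, "D")]


def total_score(scores):
    """Weighted total on a 0-100 scale (assumes each component is scored 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def guaranteed_grade(total):
    """Minimum letter grade guaranteed by the published cutoffs."""
    for cutoff, grade in CUTOFFS:
        if total >= cutoff:
            return grade
    return "F"  # below the lowest cutoff (55)


# Hypothetical student, for illustration only.
example = {"project1": 90, "project2": 85, "midterm": 78, "final": 88}
total = total_score(example)
print(total, guaranteed_grade(total))  # prints: 85.5 B+
```

Because the thresholds can only be lowered, the grade returned here is a lower bound on the final grade, not a prediction of it.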

Exam Dates

  • Final Exam: Wednesday, 06/14 in class from 7:00pm to 9:59pm

Classroom Rules

  • No late days for submitting the programming projects.

  • If plagiarism is detected in the project code or the exams, University authorities will be notified so that appropriate disciplinary action can be taken.