CSE 132C – Database System Implementation (Hybrid Modality Edition)

Lectures: MWF 4:00-4:50pm PT at WLH 2204 and on Zoom (link on Piazza)

Instructor: Arun Kumar

  • Email: arunkk [at] eng.ucsd.edu

  • Office Hours: Wed 2:00-3:00pm PT; hybrid: in-person at CSE/EBU3b 3218 and on Zoom (link on Piazza)

Discussions: Wed 6:00-6:50pm PT on Zoom only (link on Piazza)

Teaching Assistants:

  • Tanay Karve

    • Email: tkarve [at] ucsd.edu

    • Office Hours: Wed 12:30-1:30pm PT; on Zoom only (link on Piazza).

    • Extra office hours: 3:00-4:00pm PT on Apr 15 and 19; May 20, 23, and 25.

    • Handles questions/doubts regarding the programming projects.

  • Liangde Li

    • Email: lil009 [at] ucsd.edu

    • Office Hours: Thu 3:00-3:30pm PT; in-person only near CSE/EBU3b 3232.

    • Handles questions/doubts regarding the lecture materials/quizzes/exams.

Piazza: CSE 132C

Course Goals and Content

This is a hands-on, systems-focused course on the implementation of a database management system (DBMS), in particular a relational DBMS (RDBMS). RDBMSs are the cornerstone of large-scale data management in numerous application domains that define our modern world, including finance, insurance, retail, logistics, telecommunications, healthcare, governance, and education. Furthermore, concepts developed in the context of RDBMSs underpin so-called "Big Data" and "NoSQL" systems built for newer applications such as Web search, e-commerce, and social media analytics, as well as emerging systems for scalable ML/AI and data science.

This course will cover key systems topics in implementing an RDBMS: data storage, buffer management, indexing, sorting, relational operator implementations, a bit of query optimization, and the implementation of so-called "Big Data" systems such as MapReduce/Hadoop and Spark. Cutting-edge topics such as cloud-native RDBMSs and ML for RDBMSs will also be covered.

A major component of this course is hands-on C++ programming to implement two key components of an RDBMS, a buffer manager and a B+ Tree index, on top of a basic RDBMS skeleton that will be provided.

Course Format and Hybrid Modality Instructions

  • The class meets 3 times a week for 50-minute lectures.

    • All lectures will be in hybrid mode: in-person and on Zoom simultaneously. The lectures will be podcast automatically afterward and can be viewed asynchronously.

    • Attendance at live lectures is not mandatory. However, students are highly encouraged to join the lectures live to participate in the discussions, surprise quizzes, and other in-class interactive activities.

    • Students are NOT required to have webcams but microphones are encouraged. The Zoom calls can be joined via phone as well.

    • We will use Piazza for asynchronous discussions and questions. Canvas Discussions is okay too.

    • See the schedule page for the schedule of lecture topics and downloadable slide decks.

  • 2 C++ programming projects.

    • Students can work on the projects either in teams of 2 or individually.

    • Submit this Google Form on your team preferences before 11:59pm PT Mon, Apr 4. All remaining students will be randomly paired up by the TA.

    • See the projects page for more details, including all dates/deadlines.

    • There are no late days for the programming assignments; plan your work accordingly.

    • Your (team's) code submission must be entirely your (team's) own. The projects page offers more guidance on what level of discussion outside your team is allowed.

  • Midterm exam and cumulative final exam.

    • These will be held in person only. The dates and time slots are listed below.

    • The exams will have a mix of multiple-choice questions (MCQ), quantitative problems, and essay questions. Some questions will have partial credit.

    • The guideline for time per question is a maximum of 1 minute per point. The points of each question will be calibrated accordingly.

    • If you miss an exam, you will get no credit for it, unless you notify the instructor in advance with a university-approved reason and receive a makeup exam slot.

    • The exams are closed books/notes/electronics.

  • 6 extra-credit surprise quizzes in class.

    • These optional quizzes will be spread randomly throughout the quarter.

    • They will be held via Google Forms. The links will be shared only with the students who attend the respective lectures live.

    • Each quiz will have 3 multiple-choice questions (MCQ). Quantitative/longer problems may appear, but only the final answer is needed. No partial credit.

    • Each quiz is graded with a step function: if you get at least 2 answers correct, you get full credit for that quiz; otherwise, you get zero credit.

    • The quizzes are open books/notes/Web. The only requirement is that you must neither give help to nor receive help from anyone by any means.

  • I will also release some ungraded exercises on the docs page throughout the quarter. These questions will act as practice for the quizzes and exams.

Prerequisites

  • CSE 132A (DB Systems Principles) or DSC 102 (Systems for Scalable Analytics) is necessary. It will also be helpful if you have taken CSE 120 (Operating Systems) or CSE 132B (DB Systems Applications) but these are not necessary.

  • You should know, or be willing to learn quickly by yourself, the programming language C++ for the projects. Here is a good C++ tutorial.

Textbooks

  • Recommended: Database Management Systems (3rd edition), by Raghu Ramakrishnan and Johannes Gehrke (aka the "cow book").

  • Additional (optional): Database Systems: The Complete Book (2nd edition), by Hector Garcia-Molina, Jennifer Widom, and Jeffrey Ullman.

Exam Dates

  • Midterm Exam: Wed, May 4; in class

  • Final Exam: Thu, Jun 9; 3:00-6:00pm PT; room TBD

Grading

  • Project 1: 10%

  • Project 2: 35%

  • Midterm Exam: 15%

  • Final Exam: 35%

  • Peer Evaluation Activities: 5%

  • Surprise Quizzes: Up to 6% extra credit
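
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes that every component is scored on a 0-100 scale, that each of the 6 quizzes carries an equal share (1 point) of the 6% extra credit, and that the extra credit is simply added to the weighted total. The syllabus does not spell out these details, and the names `WEIGHTS`, `quiz_credit`, and `total_score` are hypothetical.

```python
# Weights from the Grading list (percent of the total score).
WEIGHTS = {
    "project1": 10,
    "project2": 35,
    "midterm": 15,
    "final": 35,
    "peer_eval": 5,
}

NUM_QUIZZES = 6
MAX_EXTRA_CREDIT = 6.0  # percent; assumed to be split evenly across quizzes


def quiz_credit(num_correct):
    """Step function: full credit for at least 2 of 3 correct, else zero."""
    return 1.0 if num_correct >= 2 else 0.0


def total_score(component_scores, quiz_correct_counts):
    """Weighted total plus quiz extra credit.

    component_scores: dict of component name -> score on a 0-100 scale.
    quiz_correct_counts: number of correct answers for each quiz taken.
    """
    base = sum(WEIGHTS[name] * component_scores[name] / 100.0 for name in WEIGHTS)
    per_quiz = MAX_EXTRA_CREDIT / NUM_QUIZZES  # assumption: 1 point per quiz
    extra = sum(per_quiz * quiz_credit(c) for c in quiz_correct_counts)
    return base + min(extra, MAX_EXTRA_CREDIT)
```

For instance, a student scoring 80 on every component who passes two of four attempted quizzes would end up with a total of about 82 under these assumptions.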

Cutoffs

The grading scheme is a hybrid of absolute and relative grading. The absolute cutoffs are based on your absolute total score. The relative bins are based on your position in the total score distribution of the class. The better of the two grades (absolute and relative) will be your final grade.

Grade   Absolute Cutoff (>=)   Relative Bin (Use strictest)
A+      95                     Highest 5%
A       90                     Next 10% (5-15)
A-      85                     Next 15% (15-30)
B+      80                     Next 15% (30-45)
B       75                     Next 15% (45-60)
B-      70                     Next 15% (60-75)
C+      65                     Next 5% (75-80)
C       60                     Next 5% (80-85)
C-      55                     Next 5% (85-90)
D       50                     Next 5% (90-95)
F       < 50                   Lowest 5%


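Example: Suppose the total score is 82 and the percentile is 33, i.e., the score sits 67% of the way down from the top of the class distribution. The relative grade is then B- (the 60-75 bin), while the absolute grade is B+ (82 >= 80). The final grade is the better of the two: B+.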
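
The hybrid rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `percentile` is taken to mean the percentage of the class scoring below you (so percentile 33 is position 67 from the top, which reproduces the B- in the example), a position exactly on a bin edge is placed in the lower bin, and all function names are hypothetical.

```python
# Grades ordered from best to worst.
GRADES = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D", "F"]

# Absolute cutoffs: total score >= cutoff earns the grade; below 50 is F.
ABSOLUTE_CUTOFFS = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]

# Relative bins: upper edge of each bin as a position from the top (percent).
RELATIVE_UPPER = [5, 15, 30, 45, 60, 75, 80, 85, 90, 95]


def absolute_grade(score):
    """Grade from the absolute cutoff column."""
    for grade, cutoff in zip(GRADES, ABSOLUTE_CUTOFFS):
        if score >= cutoff:
            return grade
    return "F"


def relative_grade(percentile):
    """Grade from the relative bin column."""
    # Assumption: percentile = percent of the class scoring below you.
    position_from_top = 100 - percentile
    for grade, upper in zip(GRADES, RELATIVE_UPPER):
        # Assumption: a position exactly on a bin edge falls in the lower bin.
        if position_from_top < upper:
            return grade
    return "F"


def final_grade(score, percentile):
    """The better of the absolute and relative grades."""
    candidates = [absolute_grade(score), relative_grade(percentile)]
    return min(candidates, key=GRADES.index)  # smaller index = better grade
```

With these definitions, `final_grade(82, 33)` returns `"B+"`, matching the example above.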

Non-Letter Grade Options: You have the option of taking this course for a non-letter grade. A P in the P/F option requires a letter grade of C- or better; an S in the S/U option requires a letter grade of B- or better.

Classroom Rules

  • No late days for submitting the programming projects. Plan your work well in advance.

  • Students are encouraged to ask questions and participate in the discussion during the live lecture and also on Piazza. Please raise your hand before speaking and the instructor will call on you to speak.

  • Please review all UCSD policies on pandemic-related public health and safety on this website. In particular, all are required to wear a proper mask indoors, including during lectures and OHs.

  • Please review UCSD's honor code and policies and procedures on academic integrity on this website. If plagiarism is detected in your code, or if we detect collusion on the quizzes or exams, or if any other form of academic integrity violation is identified, you will get zero for that component of your score and get downgraded substantially. I will also notify the University authorities for appropriate disciplinary action to be taken, up to and including expulsion from the University.

  • Please review UCSD's principles of community and our commitment to creating an inclusive learning environment on this website.

  • Harassment or intimidation of any form against any student will not be tolerated in class or on Piazza. Please review UCSD's policies on dealing with harassment and discrimination on this website.

  • In the rare event of Zoombombing during a live lecture, I will end that call and immediately announce a new link on Piazza for that lecture.