CSE 232A: Graduate Database Systems

Administrivia

Lectures: MonWedFri 1:00-1:50pm; Mandeville Center B210

Instructor: Arun Kumar; Office: CSE 3218; Office Hours: Wed 2:00-3:00pm

TAs:

  • Chand Anand (canand [at] eng.ucsd.edu); Office Hours: Mon 2:00-3:00pm at CSE B270A

  • Gurkanwal Singh Batra (gbatra [at] eng.ucsd.edu); Office Hours: Thu 12:30-1:30pm at CSE B250A

Piazza: CSE 232A page

Announcements

  • The final exam will be held in class (Mandeville Center B210) on Monday, 12/10 from 11:30am to 2:29pm.

  • Here are some questions on transaction management from a previous course: quiz, quiz with solutions, and an exam with solutions (see Question 4).

  • A review session will be held in class on Wednesday, 12/5 to work out questions from the topics covered after midterm 2; the slides are posted on the schedule page.

  • Arun will hold extra office hours on Wednesday, 12/5 from 2pm to 5pm in CSE 4262 and on Thursday, 12/6 from 2pm to 5pm in CSE 3217.

  • Answers for midterm 2 are available here: questions only and with answers.

Course Overview and Content

This is a graduate course on the systems principles of database management systems (DBMSs), especially relational DBMSs (RDBMSs). RDBMSs are the cornerstone of large-scale data management in numerous application domains that define our modern world, including finance, insurance, retail, logistics, telecommunications, healthcare, governance, and education. Furthermore, concepts developed in the context of RDBMSs underpin the so-called Big Data and NoSQL systems that were developed for new applications such as Web search, e-commerce, social media, and ML analytics. This course will cover key principles and systems design issues in RDBMSs, including storage management, query processing and optimization, transaction management, concurrency control, recovery, parallel DBMSs, and dataflow systems. More recent topics such as column stores, data integration, and data cleaning will likely be covered too. This course overlaps with CSE 190A from Spring 2018, but it omits some implementation details of RDBMSs in favor of other topics and has no programming projects.

Course Format

  • The class meets 3 times a week for 50-minute lectures. All lectures are mandatory. While lecture slides will be made available on this webpage, additional content might be discussed in class.

  • This course will have two in-class midterm exams and one cumulative final exam. If you miss an exam, you will get no credit for it unless you duly notify the instructor with a certifiable medical or emergency reason; in such cases, your grade will be based on a proportional reweighting of the other exams.

  • There will be a few short in-class surprise quizzes to aid in revising the material. The quizzes will not be posted on the webpage nor will they be graded.

  • To encourage you to learn how to read and evaluate research papers, as well as to give you a flavor of state-of-the-art database systems research, there is an optional paper reading list aligned with the lecture schedule. The list has 8 papers drawn from recent editions of SIGMOD and VLDB, the top conferences where database/data management research is published. To earn the extra credit, read each paper and submit your individual review on the corresponding Google Form before the specified deadline. The reviews will be evaluated for pertinence, thoroughness, and quality. Extra credit will be given in proportion to the number of reviews submitted on time. There are no late days. Since the course is not graded on a curve, skipping the reviews will not hurt your grade, but submitting them could improve it.

Pre-requisites

CSE 132A (DB Systems Principles); or an equivalent undergrad DB systems course; or substantial practical experience with RDBMSs, subject to the consent of the instructor. It will also be helpful if you have taken a course on Operating Systems, say, CSE 120 or its equivalent.

Textbook(s)

  • Recommended: Database Management Systems (3rd edition), by Raghu Ramakrishnan and Johannes Gehrke (aka the "cow book").

  • Additional (optional): Database Systems: The Complete Book (2nd edition), by Hector Garcia-Molina, Jennifer Widom, and Jeffrey Ullman.

  • Additional (optional): Big Data Integration, by Xin Luna Dong and Divesh Srivastava.

Grading

  • Midterm Exam 1: 20%

  • Midterm Exam 2: 20%

  • Cumulative Final Exam: 60%

  • (Extra Credit) Paper Reviews: 7%
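
As a rough illustration of how these weights combine, the sketch below computes a total score in Python. It rests on several assumptions that this page does not spell out: each exam is scored as a percentage out of 100, extra credit scales linearly with the number of on-time reviews (ignoring the quality evaluation), and "proportional reweighting" for an excused exam means rescaling the remaining exam weights to sum to 100. The names `total_score`, `exam_pcts`, `reviews_on_time`, and `excused` are hypothetical and not part of any official grading tool.

```python
# Weights from the Grading list (percent of the total score).
WEIGHTS = {"midterm1": 20.0, "midterm2": 20.0, "final": 60.0}
MAX_EXTRA_CREDIT = 7.0  # cap on extra credit from paper reviews
NUM_PAPERS = 8          # papers on the optional reading list


def total_score(exam_pcts, reviews_on_time=0, excused=()):
    """Return the total score under the assumptions stated above.

    exam_pcts maps exam name -> score in [0, 100]. An exam that is
    missing from exam_pcts and not excused counts as 0. Excused exams
    are dropped and the remaining weights are rescaled to sum to 100
    (an assumed reading of "proportional reweighting").
    """
    counted = {name: w for name, w in WEIGHTS.items() if name not in excused}
    if not counted:
        raise ValueError("at least one exam must count")
    scale = 100.0 / sum(counted.values())
    base = sum(
        exam_pcts.get(name, 0.0) * weight * scale / 100.0
        for name, weight in counted.items()
    )
    # Assumption: linear in the number of on-time reviews, capped at 8.
    extra = MAX_EXTRA_CREDIT * min(reviews_on_time, NUM_PAPERS) / NUM_PAPERS
    return base + extra
```

For example, `total_score({"midterm2": 70, "final": 90}, excused=("midterm1",))` treats midterm 2 as 25% and the final as 75% of the total, since those are the original 20:60 weights rescaled.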

Cutoffs

These cutoffs on the total score are a minimum guarantee on the grade. The thresholds might be decreased later by the instructor but not increased.

Cutoff (>= x) Grade
95 A+
90 A
85 A-
80 B+
75 B
70 B-
65 C+
60 C
55 C-
50 D
< 50 F
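
The table above can be read as a simple lookup from total score to guaranteed grade. The following Python sketch shows one way to express it; the function name `letter_grade` is hypothetical, and because the instructor may lower the thresholds later, the result is only the minimum guaranteed grade, not necessarily the final one.

```python
# (cutoff, grade) pairs from the table, highest cutoff first.
CUTOFFS = [
    (95, "A+"), (90, "A"), (85, "A-"),
    (80, "B+"), (75, "B"), (70, "B-"),
    (65, "C+"), (60, "C"), (55, "C-"),
    (50, "D"),
]


def letter_grade(total):
    """Return the minimum guaranteed grade for a total score."""
    for cutoff, grade in CUTOFFS:
        if total >= cutoff:  # cutoffs are inclusive (>= x)
            return grade
    return "F"  # anything below 50
```

Because the list is ordered from the highest cutoff down, the first cutoff that the score meets determines the grade.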

Exam Dates

  • Midterm Exam 1: Friday, 10/26, in class

  • Midterm Exam 2: Wednesday, 11/21, in class

  • Cumulative Final Exam: Monday, 12/10, 11:30am to 2:29pm, in class

Classroom Rules

  • You are encouraged to ask questions and participate in in-class discussions. Please raise your hand before asking questions or speaking during the lectures.

  • Harassment or intimidation of any form against any student will not be tolerated in class.

  • If cheating is detected during an exam, the University authorities will be notified immediately for appropriate disciplinary action to be taken.

  • If plagiarism is detected in your paper reviews, you will get zero on the entire extra credit option.