CSE 291A: Advanced Data Analytics and ML Systems
Course Overview and Goals
This is a research-oriented course on the emerging area of advanced data analytics and ML systems, at the intersection of data management, ML/AI, and systems.
This area is a driving force behind several modern data-driven applications that use large-scale machine learning to analyze large and complex datasets, including enterprise business intelligence, healthcare, recommendation systems, social media analytics, Web search, Web security, and the Internet of Things.
Students will learn about the latest research in this area and get hands-on experience doing either a research project or an in-depth survey of one of the course topics.
Administrivia
Lectures: Tue/Thu 12:30-1:50pm; CSE 2154
Instructor: Arun Kumar; Office: CSE 3218; Office Hours: Thu 2:00-3:00pm
TA: Digvijay Karamchandani (dkaramch [at] eng.ucsd.edu)
Piazza: CSE 291A
Announcements
Pre-requisites
Courses on machine learning, database systems, and operating systems (at UCSD or elsewhere), with good grades in these courses, or prior research experience in a relevant topic, subject to the consent of the instructor. Here are some recommended textbooks on the foundational background needed for this course:
ML: "Machine Learning" by Tom Mitchell and "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
DB: "Database Management Systems" by Raghu Ramakrishnan and Johannes Gehrke
OS: "Operating Systems: Three Easy Pieces" by Remzi and Andrea Arpaci-Dusseau
Enrollment
If you want to enroll in this course, fill out the enrollment questionnaire before 11:59 PM Sunday 01/07/18 (now extended to 5:59 PM Wednesday 01/10/18). Each student has to fill it out individually. Enrollment decisions will depend on the answers to these questions. The questionnaire asks about your academic background and research experience, asks you to review a research paper, and poses some open-ended research questions.
Course Project
Course Content and Format
The course will be based primarily on a reading list of about 30 recent papers from top conferences such as SIGMOD, VLDB, NIPS, ICLR, NSDI, and OSDI, organized into topics.
Each student must read about 18 specified papers from the reading list and submit an individual review of each one by that paper's deadline. The reviews will have a prescribed format and must be submitted via a given Google Form (not via email). See the Schedule for more details. For some advice on how to read a research paper with an evaluative but also appreciative mindset, read this excellent article.
There will be two 75-minute lectures per week (Tue and Thu) by the instructor on the topics, techniques, and papers (mostly from the reading list). Each topic will span about 2 lectures. The lectures will also involve discussions about the reading list papers. All students are expected to read all the assigned papers and participate in the discussions.
Grading
50%: Performance on the research or survey project, including the project report
25%: Quality and thoroughness of paper reviews
15%: Project presentation
10%: Participation in the lectures/discussions
Project Performance Metrics
The key metrics for the survey projects are diligence, precision, and technical depth, while creativity and independence are additional metrics for the research projects.
Students who perform exceptionally well on the research projects will be encouraged to continue working on the project under the instructor's guidance, with the aim of publishing at a top research venue.
Classroom Rules
No late days are allowed for submitting the paper reviews or the project report.
If plagiarism is detected in the paper reviews, the University authorities will be notified immediately for appropriate disciplinary action to be taken.