CSE 291A: Advanced Data Analytics and ML Systems

Course Overview and Goals

This is a research-oriented course on the emerging area of advanced data analytics and ML systems, at the intersection of data management, ML/AI, and systems. This area is a driving force behind several modern data-driven applications that use large-scale machine learning to analyze large and complex datasets, including enterprise business intelligence, healthcare, recommendation systems, social media analytics, Web search, Web security, and Internet of Things. Students will learn about the latest research in this area and get hands-on experience doing either a research project or an in-depth survey of one of the course topics.

Administrivia

Lectures: Tue/Thu 12:30-1:50pm; CSE 2154

Instructor: Arun Kumar; Office: CSE 3218; Office Hours: Thu 2:00-3:00pm

TA: TBD (TBD [at] eng.ucsd.edu)

Piazza: TBD

Announcements

  • The first class is on Tuesday 01/09/18.

Pre-requisites

Courses on machine learning, database systems, and operating systems (at UCSD or elsewhere), with good grades in these courses, or prior research experience in a relevant topic, subject to the consent of the instructor.

Enrollment

  • Enrollment in this course is capped at 20 students (PhD and MS), with the enrollment decisions made by the instructor in the first week of classes.

  • If you want to enroll in this course, fill out this questionnaire before 11:59 PM Sunday 01/07/18. Each student has to fill it out individually. The questionnaire asks about your academic background and research experience, asks you to review a research paper, and poses some open-ended research questions. The enrollment decisions will depend on your answers.

  • This 4-credit course will count towards the credit requirements for the MS database and AI/ML concentrations (and possibly, the Systems concentration; this is TBD).

Course Project

  • Each student has to do either a research project or a survey project. The choice has to be indicated in the questionnaire and will have to be finalized before 11:59 PM Thursday 01/18 with an email sent to the instructor.

  • Research Project: Students choosing to do a research project are encouraged to propose a relevant problem (subject to the consent of the instructor) or choose a problem suggested by the instructor and email the final choice to the instructor before 11:59 PM Thursday 01/18. The research projects will ideally lay the groundwork for a publication at a top research conference or workshop. Students are encouraged to do individual research projects, but groups of two students are permitted for projects with a larger scope. Each student (or group) has to meet one-on-one with the instructor in a mutually scheduled half-hour slot every week to discuss the progress on the project.

  • Survey Project: Students choosing to do a survey project have to pick one of the course topics and survey the major relevant research papers on that topic. The final choice must be emailed to the instructor before 11:59 PM Thursday 01/18. Survey projects are restricted to individual projects and are expected to provide a comprehensive analysis of a topic, not just a laundry list of paper summaries. Each student has to meet with the instructor in a mutually scheduled slot once within the first month of the class to finalize the list of papers that will be surveyed.

  • Project Report: Each student (or group) has to submit a paper-style report of length 6-12 pages on their research project or survey project by the end of the course. The ACM SIG proceedings LaTeX template should be used for the report. The deadline for emailing the report is 11:59 PM Tuesday 03/20.

Course Content and Format

  • The course will be based primarily on a reading list of 30 recent papers from top conferences such as SIGMOD, VLDB, NIPS, ICLR, NSDI, and OSDI, organized into topics.

  • Each student has to read and submit their individual reviews of about 18 specified papers in the reading list by the deadline corresponding to each paper. The reviews will have a prescribed format and will have to be submitted via a given Google Form (not via email). See the Schedule for more details.

  • There will be two 75-minute lectures per week (Tue and Thu) by the instructor on the topics, techniques, and papers (mostly from the reading list). Each topic will span about 2 lectures. The lectures will also involve discussions about the reading list papers. All students are expected to read all the assigned papers and participate in the discussions.

  • Each student is expected to present a 15-min talk about their project to the class. The presentation dates are Tuesday 03/13 and Thursday 03/15.

Grading

  • 50%: Performance on the research or survey project, including the project report

  • 25%: Quality and thoroughness of paper reviews

  • 15%: Project presentation

  • 10%: Participation in the lectures/discussions

Project Performance Metrics

The key metrics for the survey projects are diligence, precision, and technical depth, while creativity and independence are additional metrics for the research projects. Students who perform outstandingly in the research projects will be encouraged to continue working on the project under the instructor's guidance to publish at a top research venue.

Classroom Rules

  • No late days are allowed for submitting the paper reviews or the project report.

  • If plagiarism is detected in the paper reviews, the University authorities will be notified immediately for appropriate disciplinary action to be taken.