CSE/DSC 234: Data-Centric AI and AI Engineering
(Previously "Data Systems for ML")
Lectures: TuTh 5-6:20pm PT @ WLH 2001
Instructor: Arun Kumar
Piazza: CSE/DSC 234
(Access code posted on Canvas and emailed to enrolled students)
Teaching Assistants:
| Name | Email |
| Ruobing Han | r8han [at] ucsd.edu |
| Manas Jain | maj039 [at] ucsd.edu |
| Raghav Jain | r6jain [at] ucsd.edu |
| Har Simrat Singh | h6singh [at] ucsd.edu |
Announcements
Course Goals and Content
This is a research-based course on data-centric aspects of the AI lifecycle,
spanning development, deployment, and maintenance of AI applications.
It is at the intersection of the areas of ML/AI, data management, and software systems.
AI has long been ubiquitous in domains such as enterprise analytics, recommendation
systems, social media analytics, and domain sciences. The rise of LLMs has made AI
chatbots, RAG, and agentic applications pervasive, including in consumer-facing applications.
Students will learn about the landscape and evolution of data-centric AI systems,
the latest research, and some major open questions.
This course is aimed primarily at MS students interested in building real-world AI
applications, as well as PhD students interested in research in this space.
Course Format and Instructions
Midterm Second Chance:
The Final Exam will have a subset designated as "Midterm Second Chance" that gives you a second
chance at raising your Midterm Exam score.
If you score higher on that subset (say, y%) vs. your original Midterm Exam score (say, x% with x < y),
then your midterm score will be automatically upgraded to (x + 2/3 * (y - x))%.
That is, you will automatically earn two-thirds of the positive delta.
But if x >= y, then your original midterm score (x%) will remain unchanged.
This policy is applied by default to everyone.
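As an illustration of the arithmetic above, here is a minimal Python sketch. The function name `adjusted_midterm` is hypothetical and not part of any course tooling; both scores are assumed to be percentages.

```python
def adjusted_midterm(x: float, y: float) -> float:
    """Return the midterm percentage after the Second Chance policy.

    x: original Midterm Exam score (%).
    y: score on the "Midterm Second Chance" subset of the Final Exam (%).
    """
    if y > x:
        # Earn two-thirds of the positive delta: x + 2/3 * (y - x).
        return x + 2 * (y - x) / 3
    # Otherwise the original midterm score stands.
    return x


# A 60% midterm with 90% on the Second Chance subset is upgraded.
print(adjusted_midterm(60, 90))
# A lower Second Chance score leaves the midterm unchanged.
print(adjusted_midterm(80, 70))
```

For instance, with x = 60 and y = 90 the positive delta is 30 points, so two-thirds of it (20 points) is added to the original score.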
Prerequisites
A full course on ML algorithms (e.g., CSE 151 or 258) is absolutely necessary. It could have been taken at UCSD or elsewhere.
Python programming know-how is also necessary.
Introductory courses on NLP/LLMs and on databases/data management are also highly recommended but not strictly required.
Substantial project or industrial experience with relevant topics can substitute for prior
coursework and Python experience, subject to the instructor's consent.
Email the instructor if you would like to enroll but are unsure if you satisfy the prerequisites.
Suggested Textbooks
More optional textbooks:
AI Engineering: Building Applications with Foundation Models, by Chip Huyen (O'Reilly)
Generative AI with LangChain, by Ben Auffarth and Leonid Kuligin (Packt)
Principles of Building AI Agents, by Sam Bhagwat (Mastra); free e-book
Exam Dates
Midterm Exam: Thu, May 7, 5-6:20pm PT in class.
Cumulative Final Exam: Thu, Jun 11, 7-10pm PT.
Grading Components
Midterm Exam: 15%
Cumulative Final Exam: 35%
AI Engineering Project 1: 20%
AI Engineering Project 2: 20%
Peer Instruction Activities: 10% (8 x 1.25%)
Grading Cutoffs
The grading scheme is a hybrid of absolute and relative grading.
The absolute cutoffs are based on your absolute total score (including any extra credit).
The relative bins are based on your position in the total score distribution of the class.
The better grade among the two (absolute-based and relative-based) will be your final grade.
The absolute cutoffs are provisional and may be adjusted at the end of the quarter at the instructor's
discretion but only in a direction that benefits students.
| Grade | Absolute Cutoff (>=) | Relative Bin (Use strictest) |
| A+ | 95 | Highest 5% |
| A | 90 | Next 15% (5-20) |
| A- | 85 | Next 15% (20-35) |
| B+ | 80 | Next 15% (35-50) |
| B | 75 | Next 15% (50-65) |
| B- | 70 | Next 10% (65-75) |
| C+ | 65 | Next 5% (75-80) |
| C | 60 | Next 5% (80-85) |
| C- | 55 | Next 5% (85-90) |
| D | 50 | Next 5% (90-95) |
| F | < 50 | Lowest 5% |
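Example: Suppose the total score is 89 and the percentile is 60, i.e., the score is in the top 40% of the class.
The relative grade is B+ (the 35-50 bin), while the absolute grade is A- (since 89 >= 85).
The final grade then is A-, the better of the two.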
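The hybrid scheme can be sketched in Python as follows. This is only an illustration under stated assumptions: all function and variable names are hypothetical, a percentile p is taken to mean a position 100 - p percent from the top of the class, and a score exactly on a shared bin boundary is placed in the lower bin, which is one reading of "Use strictest".

```python
# Absolute cutoffs (total score >= cutoff), best grade first, from the table.
ABSOLUTE = [("A+", 95), ("A", 90), ("A-", 85), ("B+", 80), ("B", 75),
            ("B-", 70), ("C+", 65), ("C", 60), ("C-", 55), ("D", 50)]

# Upper edge of each relative bin, in percent of the class counted from the top.
RELATIVE = [("A+", 5), ("A", 20), ("A-", 35), ("B+", 50), ("B", 65),
            ("B-", 75), ("C+", 80), ("C", 85), ("C-", 90), ("D", 95),
            ("F", 100)]

# Grades ordered from best to worst, used to pick the better of two grades.
ORDER = [grade for grade, _ in RELATIVE]


def absolute_grade(total: float) -> str:
    for grade, cutoff in ABSOLUTE:
        if total >= cutoff:
            return grade
    return "F"


def relative_grade(percentile: float) -> str:
    # Assumption: percentile 60 means the top 40% of the class.
    top_pct = 100 - percentile
    for grade, upper in RELATIVE:
        # Assumption: strict "<" sends boundary cases to the lower bin.
        if top_pct < upper:
            return grade
    return "F"


def final_grade(total: float, percentile: float) -> str:
    # The better (earlier in ORDER) of the two grades is the final grade.
    candidates = (absolute_grade(total), relative_grade(percentile))
    return min(candidates, key=ORDER.index)


# The example from the text: total score 89, percentile 60.
print(absolute_grade(89), relative_grade(60), final_grade(89, 60))
```

Run on the example from the text, the sketch reproduces the absolute grade A-, the relative grade B+, and the final grade A-.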
Non-Letter Grade Options: You have the option of taking this course for a non-letter grade.
As per the CSE department's guidelines, a P in the P/F option requires a letter grade of C- or better;
an S in the S/U option requires a letter grade of B- or better.
CSE Comprehensive Exam: For this, your total score across all in-person proctored components
(both exams), when rescaled to percentage, must yield a pass-equivalent letter grade,
i.e., D or better, based on the grading scheme above.
Classroom Rules
Please review UCSD's honor code and policies and procedures on academic integrity
on this website.
If plagiarism is detected in your exams or project submissions, or if any other form of
academic integrity violation is identified, the University authorities will be notified
for appropriate disciplinary action to be taken.
You will also receive a 0 for that component of your score, and your final grade will be downgraded substantially.