Arun Kumar

Associate Professor
Computer Science and Engineering
and Halicioglu Data Science Institute
and HDSI Faculty Fellow
University of California, San Diego
Email: akk018 [at] ucsd [dot] edu
Office: 3218 CSE (EBU3B) and 351 HDSI

Bio

Arun Kumar is an Associate Professor in the Department of Computer Science and Engineering and the Halicioglu Data Science Institute and an HDSI Faculty Fellow at the University of California, San Diego. He is a member of the Database Lab and Center for Networked Systems and an affiliate member of the AI Group. His primary research interests are in data management and systems for machine learning/artificial intelligence-based data analytics. Systems and ideas based on his research have been shipped as part of products from, or used internally by, multiple cloud, Web, and database systems companies, including Google, Facebook, Oracle, and VMware. He is a recipient of three SIGMOD research paper awards, five distinguished reviewer/metareviewer awards from SIGMOD/VLDB, the IEEE TCDE Rising Star Award, an NSF CAREER Award, a UCSD oSTEM Faculty of the Year Award, and research award gifts from Amazon, Google, Oracle, and VMware. His first PhD graduate received the ACM SIGMOD Jim Gray Doctoral Dissertation Award.

Curriculum Vitae | Research Blog | YouTube Channel | On Twitter | On Tumblr

Note: I am not currently looking for new advisees or mentees. Feel free to check out the research of other faculty at CSE or HDSI.

Research

My current research focuses on the foundations of advanced data analytics systems that help make the process of building and deploying ML/AI-powered data analytics applications easier (improving the productivity of data scientists and ML/software engineers) and faster (improving runtime performance and introducing accuracy trade-offs). Thus, the key themes of my research are usability, developability, performance, and scalability. I enjoy working on problems that are motivated by real applications and are formally grounded. I also enjoy insightful conversations with practitioners on the frontlines of data analytics.

More details about my research are available on my research group webpage, including current projects and all of our publications.

For a summary of my current research, you can also read this one-pager, listen to this podcast, or watch this talk video.

Teaching

CSE 234: Data Systems for Machine Learning (previously CSE 291): Winter 2024, Winter 2023, Fall 2021, Fall 2020
CSE 132C (formerly CSE 190): Database System Implementation: Spring 2023, Spring 2022, Spring 2021, Spring 2020, Spring 2019, Spring 2018, Spring 2017
DSC 208R: Data Management for Analytics: Winter 2023
DSC 102: Systems for Scalable Analytics: Fall 2022, Winter 2022, Winter 2021, Winter 2020
CSE 232A: Graduate Database Systems: Fall 2019, Fall 2018
CSE 291: Advanced Data Analytics and ML Systems (now CSE 234): Winter 2019, Winter 2018, Winter 2017
CSE 239: Database Seminar: Fall 2021, Fall 2020, Fall 2019
CSE 290: Seminar on Integrative AI Engineering: Fall 2018
CSE 290: Seminar on Advanced Data Science: Fall 2017, Spring 2017
CS 564: Database Management Systems: Design and Implementation (Fall 2015 at UW-Madison)

Advising

Current:

Kyle Luoma (PhD, CSE, UCSD); Co-advisor: Jingbo Shang
Xiuwen Zheng (PhD, CSE, UCSD); Co-advisor: Amarnath Gupta

Alumni:

Kabir Nagrecha (PhD, CSE, UCSD, 2024); Co-advisor: Hao Zhang; First employment: Netflix
Yuhao Zhang (PhD, CSE, UCSD, 2023); First employment: Databricks
Pradyumna Sridhara (MS, CSE, UCSD, 2023); First employment: UCSD HDSI
Tanay Karve (MS, CSE, UCSD, 2022); First employment: Apple
Vignesh Nanda Kumar (MS, CSE, UCSD, 2022); First employment: ServiceNow
Supun Nakandala (PhD, CSE, UCSD, 2022); First employment: Databricks
Vraj Shah (PhD, CSE, UCSD, 2022); First employment: IBM Research Almaden
Liangde Li (MS, CSE, UCSD, 2022); First employment: TigerGraph
Tara Mirmira (MS, CSE, UCSD, 2022); First employment: PhD at UCSD
Advitya Gemawat (BS, HDSI, UCSD, 2021); First employment: Microsoft NERD AI
Kabir Nagrecha (BS, CSE, UCSD, 2021); First employment: PhD at UCSD
Shaoqing Yi (BS, HDSI and Math, UCSD, 2021); First employment: PhD at UC Berkeley
Side Li (MS, CSE, UCSD, 2021); First employment: Google
Kevin Yang (BS, CSE, UCSD, 2020); First employment: MS at UPenn
David Justo (MS, CSE, UCSD, 2019); Co-advisor: Nadia Polikarpova; First employment: Microsoft
Anthony Thomas (MS, CSE, UCSD, 2018); First employment: PhD at UCSD
Lingjiao Chen (MS, CS, UW-Madison, 2018); First employment: PhD at Stanford
Side Li (BS, CSE, UCSD, 2018); First employment: Amazon
Mingyang Wang (MS, CSE, UCSD, 2017); First employment: Amazon

Service

Organization:

Program Co-Chair (Research Track), ACM CODS-COMAD 2024
Associate Editor, ACM SIGMOD 2024
Associate Editor, Scalable Data Science Category, VLDB 2022, 2021 (Inaugural)
Co-Chair, Diversity and Inclusion, ACM SIGMOD 2021 (Inaugural)
Core Committee member, Diversity & Inclusion in DB Initiative, 2021 (Inaugural)
Lead Organizer, SoCal DB Day 2018 (Inaugural)
Co-Chair, ACM SIGMOD Workshop on Data Management for End-to-End Machine Learning (DEEM) 2018
Organizing Committee, ACM SIGKDD Workshop on Common Model Infrastructure (CMI) 2018 (Inaugural)
Organizing Committee, Extremely Large Databases (XLDB) 2018

Program Committee:

ACM SIGMOD: 2024, 2020, 2019, 2018, 2017
ACM CODS-COMAD: 2024
CIDR: 2023, 2022, 2021
IEEE ICDE 2023 Special Track Senior PC
ACM SIGMOD DEEM Workshop: 2023, 2022, 2021, 2020, 2019, 2017
VLDB: 2022, 2021, 2020, 2019, 2018
ACM SIGMOD HILDA Workshop: 2022
MLSys / SysML: 2020, 2019
ACM SIGMOD 2017 Demonstrations; Student Research Competition
IEEE ICDE 2017
USENIX HotCloud 2016
ACM SIGMOD 2016 Undergraduate Research Poster Competition

Reviewer / External:

ACM SIGMOD 2022
ACM Transactions on Database Systems (TODS) 2017, 2015
IEEE Transactions on Knowledge and Data Engineering (TKDE) 2014

Outreach Materials

Blog Posts and Talks:

2022 Oct. Talk on how to grow from grant rejections at a panel and webinar for early career researchers at SC 2022 conference.
2022 Jan. Two-part blog post on my experiences and lessons as a millennial Assistant Professor in CS: Part 1 and Part 2.
2021 Oct. Talk on being out in CS at the annual General Body Meeting of oSTEM UCSD Chapter.
2021 Jun. Talk on Mentoring and CS Careers at the UCSD ABLE End of Year Celebration for high school girls interested in computing.
2021 May. Talk on MAP and STEM Careers at the UCSD MAP Symposium for high school students and their parents.
2021 May. Blog post on my experiences with rejections in academia (papers, grants, etc.) to help reduce survivorship bias and impostor syndrome that are pervasive in academia. Added addendum on rejections to my CV.
2021 Apr. Award acceptance talk at ICDE 2021 for my TCDE Rising Star Award: Video and Slides PDF.
2021 Apr. Short talk on free speech and inclusivity at UCSD CSE faculty meeting inclusion minutes (PDF of slides).
2020 Feb. ACM SIGMOD Blog post on the new PVLDB research track category Scalable Data Science.
2020 Feb. Blog post on inclusive CS examples to raise awareness of the importance of inclusion and avoiding exclusionary examples in CS.
2019 Jul. Talk on data science careers to high school students at SDSC's REHS summer workshop and QI's Big Data summer camp at UCSD.
2019 Jun. Article on ACM SIGARCH Blog about a SIGMOD 2019 research paper of mine.
2018 Aug. ACM SIGMOD Blog post on the panel discussion at ACM SIGMOD DEEM Workshop 2018 to raise awareness of the evolving role of the database research community in the ML systems/applications arena.
2018 Apr. Blog post on the culture wars of the database/data management community (c. early 2018) to raise awareness of the inherent intellectual multiculturalism of the research area.
2017 Oct. Blog post on my coming out experience from my sociocultural and individual vantage point.

Interviews and Panels:

2021 Dec. Interviewed by TWIML for a podcast on my research agenda and worldview, as well as projects Cerebro and SortingHat.
2021 Dec. Interviewed by Stoodnt on advice to international students interested in grad school in CS and Data Science.
2021 Aug. Panelist for VLDB 2021 panel discussion on "The Future of Data(base) Education."
2021 May. Interviewed by Software Engineering Daily for a podcast on my research agenda and worldview, as well as projects Cerebro and SortingHat.
2021 Feb. Interview by UCSD oSTEM Chapter on my research and thoughts on the LGBTQ+ community in CS.
2020 Nov. Panel discussion on Impostor Syndrome organized by UCSD oSTEM Chapter.
2020 Nov. Panel discussion on AI ethics organized by UCSD Office of Innovation and Commercialization.
2020 Feb. Interviews by UCSD's Data Science Student Society on scalable analytics and career options and on ethical risks/benefits of AI and other emerging technologies.
2019 Apr. Interviewed by Software Engineering Daily at Strata Data Conference 2019 for an article on systems for ML.
2018 Feb. Interview by SIGMOD 2018 WebDB Workshop for an ACM SIGMOD Blog article on the intersection of ML and data systems.

News and Other Resources:

2023 Jul. Featured by UCSD Today to celebrate out STEM faculty for San Diego Pride Week.
2022 Jun. Article in SIGMOD Record on the inaugural year of VLDB Scalable Data Science Research track category.
2021 Jul. Article in SIGMOD Record on the inaugural year of the D&I in DB Initiative.
2021 Apr. Heartbreak of a CS Fool; an essay/poem on the CS field.
2021 Feb. Newsletter article by UCSD CNS on my group's research on scalable and optimized systems for deep learning.
2021 Feb. Website of the D&I in DB Initiative for unifying Diversity & Inclusion efforts across database conference venues.
2020 Apr. Article for SIGMOD 2021 on inclusion and diversity in research writing.