Arun Kumar

Assistant Professor
Computer Science and Engineering
and Halicioglu Data Science Institute
University of California, San Diego
Email: arunkk [at] eng [dot] ucsd [dot] edu
Office: 3218 EBU3B (CSE building)


Arun Kumar is an Assistant Professor in the Department of Computer Science and Engineering and the Halicioglu Data Science Institute at the University of California, San Diego. He is a member of the Database Lab and Center for Networked Systems and an affiliate member of the AI Group. His primary research interests are in data management and systems for machine learning/artificial intelligence-based data analytics. Systems and ideas based on his research have been released as part of the Apache MADlib open-source library, shipped as part of products from Cloudera, IBM, Oracle, and Pivotal, and used internally by Facebook, Google, LogicBlox, Microsoft, and other companies. He is a recipient of two SIGMOD research paper awards, a SIGMOD Research Highlight Award, three distinguished reviewer awards from SIGMOD/VLDB, the PhD dissertation award from UW-Madison CS, the IEEE TCDE Rising Star Award, an NSF CAREER Award, a Hellman Fellowship, a UCSD oSTEM Faculty of the Year Award, and research award gifts from Amazon, Google, Oracle, and VMware.


Recent News

  • New! 3/21: Honored to be accorded the IEEE TCDE Rising Star Award! A special recognition I will cherish. Award acceptance talk: Video and Slides PDF.

  • New! 3/21: A big thank you to Amazon AI for supporting our work on SortingHat with an Amazon Research Award!

  • New! 3/21: The first full research paper on SortingHat is accepted to SIGMOD 2021! This is likely the first research paper in SIGMOD's history focused primarily on a new benchmark dataset. War is coming to ML platforms / AutoML land!

  • 2/21: A big thank you to VMware for supporting our work on Cerebro with a second faculty research award gift, generously tripling their support!

  • 2/21: I have launched a new Tumblr page for fun posts on research, practice, and education in the world of systems for data management, analytics, ML, and data science.

  • 1/21: Congrats to my undergraduate advisees, Advitya and Kabir, and PhD student, Side, for being selected for the second round of the SIGMOD 2021 Student Research Competition! It is pretty rare for three submissions from a single group to all get selected.

  • 12/20: Congrats to my undergraduate advisee, Kabir, on receiving an Honorable Mention for the CRA Outstanding Undergraduate Researcher Award! Kabir is in the PhD applicant pool for the 2020-21 cycle.


My current research focuses on the foundations of advanced data analytics systems that help make the process of building and deploying ML/AI-powered data analytics applications easier (improving the productivity of data scientists and ML/software engineers) and faster (improving runtime performance and introducing accuracy trade-offs). Thus, the key themes of my research are usability, developability, performance, and scalability. I enjoy working on problems that are motivated by real applications and are formally grounded. I also enjoy insightful conversations with practitioners on the frontlines of data analytics.

More details about my research, including current projects and all of our publications, are available on my research group webpage.




  • Tara Mirmira (PhD, CSE, UCSD)

  • Supun Nakandala (PhD, CSE, UCSD)

  • Vraj Shah (PhD, CSE, UCSD)

  • Yutong Shao (PhD, CSE, UCSD); Primary advisor: Ndapa Nakashole

  • Yuhao Zhang (PhD, CSE, UCSD)

  • Xiuwen Zheng (PhD, CSE, UCSD); Co-advisor: Amarnath Gupta

  • Side Li (MS, CSE, UCSD)

  • Advitya Gemawat (BS, HDSI, UCSD)

  • Kabir Nagrecha (BS, CSE, UCSD)

  • Shaoqing Yi (BS, HDSI and Math, UCSD)


  • Kevin Yang (BS, CSE, UCSD, 2020); First employment: MS at UPenn

  • David Justo (MS, CSE, UCSD, 2019); Co-advisor: Nadia Polikarpova; First employment: Microsoft

  • Side Li (BS, CSE, UCSD, 2018); First employment: Amazon

  • Anthony Thomas (MS, CSE, UCSD, 2018); First employment: PhD at UCSD

  • Lingjiao Chen (MS, CS, UW-Madison, 2018); First employment: PhD at Stanford

  • Mingyang Wang (MS, CSE, UCSD, 2017); First employment: Amazon



  • Associate Editor, Scalable Data Science Category, VLDB 2022, 2021 (Inaugural)

  • Co-Chair, Diversity and Inclusion, ACM SIGMOD 2022, 2021 (Inaugural)

  • Core Committee member, Diversity & Inclusion in DB Initiative, 2021 (Inaugural)

  • (Inaugural) Lead Organizer, SoCal DB Day 2018

  • Co-Chair, ACM SIGMOD Workshop on Data Management for End-to-End Machine Learning (DEEM) 2018

  • (Inaugural) Organizing Committee, ACM SIGKDD Workshop on Common Model Infrastructure (CMI) 2018

  • Organizing Committee, Extremely Large Databases (XLDB) 2018

Program Committee:

  • VLDB: 2022, 2021, 2020, 2019, 2018

  • ACM SIGMOD: 2020, 2019, 2018, 2017

  • CIDR: 2021

  • ACM SIGMOD DEEM Workshop: 2021, 2020, 2019, 2017

  • MLSys / SysML: 2020, 2019

  • ACM SIGMOD 2017 Demonstrations; Student Research Competition

  • IEEE ICDE 2017

  • USENIX HotCloud 2016

  • ACM SIGMOD 2016 Undergraduate Research Poster Competition


  • ACM Transactions on Database Systems (TODS) 2017, 2015

  • IEEE Transactions on Knowledge and Data Engineering (TKDE) 2014

Outreach Materials

Blog Posts and Talks:

Interviews and Panels:

News and Other Resources: