Arun Kumar

Assistant Professor
Computer Science and Engineering and
Halicioglu Data Science Institute
University of California, San Diego
Email: arunkk [at] eng [dot] ucsd [dot] edu
Office: 3218 EBU3B (CSE building)


Arun Kumar is an Assistant Professor in the Department of Computer Science and Engineering at the University of California, San Diego. He is a member of the Database Lab and CNS and an affiliate member of the AI Group. His primary research interests are in data management and systems for machine learning/artificial intelligence-based data analytics. Systems and ideas based on his research have been released as part of the MADlib open-source library, shipped as part of products from EMC, Oracle, Cloudera, and IBM, and used internally by Facebook, LogicBlox, Microsoft, and other companies. He is a recipient of two SIGMOD research paper awards in 2019 and 2014, three distinguished reviewer awards from SIGMOD/VLDB in 2019 and 2017, the 2016 PhD dissertation award from UW-Madison CS, a 2018 Hellman Fellowship, a 2016 Google Faculty Research Award, and a 2019 Oracle Labs Research Award.

Curriculum Vitae | Research Blog | On Twitter

Recent News

  • New! 12/19: Congrats to Advitya on being named a recipient of the inaugural HDSI Undergraduate Scholarship!

  • New! 11/19: The Panorama paper has been accepted to VLDB 2020. Looking to expand our horizons to Tokyo!

  • 09/19: Congrats to Side and Tara on receiving an HDSI PhD Fellowship! Congrats again to Side on also receiving a Jacobs School of Engineering PhD Fellowship!

  • 07/19: Gave a talk on data science careers to two groups of high school students: the REHS summer workshop by SDSC and the Big Data Summer Camp by QI at UCSD. PDF of slides

  • Morgan & Claypool publishes a book I co-authored with Matthias Boehm and Jun Yang, Data Management in ML Systems, the first book on the emerging area of ML systems (PDF on M&C webpage; order hard copy).


My current research focuses on the foundations of advanced data analytics systems that help make the process of building and deploying ML/AI-powered data analytics applications easier (improving the productivity of data scientists and ML/software engineers) and faster (improving runtime performance and introducing accuracy trade-offs). Thus, the key themes of my research are usability, developability, performance, and scalability. I enjoy working on problems that are motivated by real applications and are formally grounded. I also enjoy insightful conversations with practitioners on the frontlines of data analytics.

More details about my research, including current projects and all of our publications, are available on my research group webpage.




  • Side Li (PhD, CSE, UCSD)

  • Tara Mirmira (PhD, CSE, UCSD)

  • Supun Nakandala (PhD, CSE, UCSD)

  • Vraj Shah (PhD, CSE, UCSD)

  • Yuhao Zhang (MS, CSE, UCSD)

  • Advitya Gemawat (BS, HDSI, UCSD)

  • Kevin Yang (BS, CSE, UCSD)


  • David Justo (BS, CSE, UCSD, 2019); Co-advisor: Nadia Polikarpova

  • Side Li (BS, CSE, UCSD, 2018)

  • Anthony Thomas (MS, CSE, UCSD, 2018)

  • Lingjiao Chen (MS, CS, UW-Madison, 2018)

  • Mingyang Wang (MS, CSE, UCSD, 2017)

Research Service


  • Associate Editor, VLDB 2021

  • Lead Organizer, SoCal DB Day 2018

  • Co-Chair, ACM SIGMOD Workshop on Data Management for End-to-End Machine Learning (DEEM) 2018

  • Organizing Committee, ACM SIGKDD Workshop on Common Model Infrastructure (CMI) 2018

  • Organizing Committee, Extremely Large Databases (XLDB) 2018

Program Committee:

  • ACM SIGMOD: 2020, 2019, 2018, 2017

  • VLDB: 2021, 2020, 2019, 2018

  • SysML: 2020, 2019

  • ACM SIGMOD DEEM Workshop: 2020, 2019, 2017

  • ACM SIGMOD 2017 Demonstrations; Student Research Competition

  • IEEE ICDE 2017

  • USENIX HotCloud 2016

  • ACM SIGMOD 2016 Undergraduate Research Poster Competition


  • ACM Transactions on Database Systems (TODS) 2017, 2015

  • IEEE Transactions on Knowledge and Data Engineering (TKDE) 2014