Arun Kumar

Assistant Professor
Computer Science and Engineering
University of California, San Diego
Email: arunkk [at] eng [dot] ucsd [dot] edu
Office: 3218 EBU3B (CSE building)


Arun Kumar is an Assistant Professor in the Department of Computer Science and Engineering at the University of California, San Diego. He is a member of the Database Lab and CNS and an affiliate member of the AI Group. His primary research interests are in data management and data systems and their intersection with machine learning/artificial intelligence. Systems and ideas based on his research have been released as part of the MADlib open-source library, shipped as part of products from EMC, Oracle, Cloudera, and IBM, and used internally by Facebook, LogicBlox, and Microsoft. He is a recipient of the Best Paper Award at ACM SIGMOD 2014 and the 2016 Graduate Student Research Award for the best dissertation research in UW-Madison CS.

Curriculum Vitae | Research Blog | On Twitter

Recent News

  • New! Preprints of the Vista and SLAB papers are out! Also out is a preprint of the DAnA paper, a collaboration with Hadi Esmaeilzadeh.

  • New! We have added a multi-community panel to discuss and chart the future course of research on data management and systems for ML/AI-based analytics at the DEEM workshop at SIGMOD 2018! Submit your work on democratizing ML/AI by Mar 12!

  • I am on the organizing committee of XLDB this year. The theme is Data meets ML/AI.

  • The Hamlet++ paper has been accepted to VLDB 2018 (project webpage with paper/code/data)! Shakespeare goes to Brazil!


My current research focuses on the foundations of advanced data analytics systems that help make the process of building and deploying ML/AI-powered data analytics applications easier (improving the productivity of data scientists and ML/software engineers) and faster (improving runtime performance and introducing accuracy trade-offs). Thus, the key themes of my research are usability, developability, performance, and scalability. I enjoy working on problems that are motivated by real applications and are formally grounded. I also enjoy insightful conversations with practitioners on the frontlines of data analytics.

More details about my research, including current projects and all of our publications, are available on my research group webpage.




  • Lingjiao Chen (PhD, UW-Madison; co-advised by Paris Koutris)

  • Supun Nakandala (PhD, UCSD)

  • Vraj Shah (MS, UCSD)

  • Anthony Thomas (MS, UCSD)

  • Yaobang Deng (BS, UCSD)

  • Side Li (BS, UCSD)


  • Mingyang Wang (MS, UCSD, 2017)

  • Fengan Li (MS, UW-Madison, 2016; First employment: Google)

  • Zhiwei Fan (BS, UW-Madison, 2016; Onward to MS, UW-Madison)

  • Fujie Zhan (BS, UW-Madison, 2016; First employment: Epic Systems)

  • Mona Jalal (MS, UW-Madison, 2015)

  • Boqun Yan (BS, UW-Madison, 2015; First employment: Google)

Technical Service


  • Co-Chair, ACM SIGMOD 2018 Workshop on Data Management for End-to-End Machine Learning (DEEM)

  • Organizing Committee, XLDB 2018

Program Committee:

  • ACM SIGMOD 2019, 2018

  • VLDB 2019, 2018

  • ACM SIGMOD 2017 (Research Track, Demonstrations, and Student Research Competition)

  • ACM SIGMOD 2017 Workshop on Data Management for End-to-End Machine Learning (DEEM)

  • IEEE ICDE 2017

  • USENIX HotCloud 2016

  • ACM SIGMOD 2016 Undergraduate Research Poster Competition


  • ACM Transactions on Database Systems (TODS) 2017, 2015

  • IEEE Transactions on Knowledge and Data Engineering (TKDE) 2014