Arun Kumar

Assistant Professor
Computer Science and Engineering
University of California, San Diego
Email: arunkk [at] eng [dot] ucsd [dot] edu
Office: 3218 EBU3B (CSE building)


Arun Kumar is an Assistant Professor in the Department of Computer Science and Engineering at the University of California, San Diego. He is a member of the Database Lab and an affiliate member of the AI Group and CNS. He obtained his PhD from the University of Wisconsin-Madison in 2016. His primary research interests are in data management and its intersection with machine learning/artificial intelligence, an area that is increasingly called advanced analytics or data science. Systems and ideas based on his research have been released as part of the MADlib open-source library, shipped as part of products from EMC, Oracle, Cloudera, and IBM, and used internally by Facebook, LogicBlox, and Microsoft. He is a recipient of the Best Paper Award at ACM SIGMOD 2014 and the 2016 Graduate Student Research Award for the best dissertation research in UW-Madison CS.

Curriculum Vitae | Research Blog | On Twitter

Recent News

  • New! The second edition of the DEEM workshop is coming to SIGMOD 2018! ML/AI is front and center in the future of data-driven applications. Submit your latest ideas on new data management techniques and systems to democratize ML/AI!

  • New! Visited and gave a talk at Teradata San Diego. Great to see the excitement around AI for data management!

  • New! Hamlet++ paper is accepted to VLDB 2018 (project webpage with paper/code/data)! Shakespeare goes to Brazil!

  • Visited and gave a talk at UMichigan (thanks, Barzan!); excited to see all the work on data systems + ML + HCI!


My current research focuses on the foundations of advanced data analytics systems with the aim of making the end-to-end process of building and deploying ML/AI-powered data analytics applications easier (improving the productivity of data scientists and ML/software engineers) and faster (improving runtime performance and introducing accuracy trade-offs). Thus, the key themes of my research are usability, developability, performance, and scalability. I enjoy working on problems that are motivated by real applications and are formally grounded. I also enjoy insightful conversations with practitioners on the frontlines of data analytics.

More details about my research are available on my research group webpage: Advanced Data Analytics (ADA) Lab.

List of projects in my research group: ADALab Projects.

List of my publications: ADALab Publications.




Current Students

  • Lingjiao Chen (PhD, UW-Madison; co-advised by Paris Koutris)

  • Supun Nakandala (PhD, UCSD)

  • Vraj Shah (MS, UCSD)

  • Anthony Thomas (MS, UCSD)

  • Yaobang Deng (BS, UCSD)

  • Side Li (BS, UCSD)


Past Students

  • Mingyang Wang (MS, UCSD, 2017)

  • Fengan Li (MS, UW-Madison, 2016; First employment: Google)

  • Zhiwei Fan (BS, UW-Madison, 2016; Onward to MS, UW-Madison)

  • Fujie Zhan (BS, UW-Madison, 2016; First employment: Epic Systems)

  • Mona Jalal (MS, UW-Madison, 2015)

  • Boqun Yan (BS, UW-Madison, 2015; First employment: Google)



Professional Service

  • Co-Chair, ACM SIGMOD 2018 Workshop on Data Management for End-to-End Machine Learning (DEEM)

Program Committee:

  • ACM SIGMOD 2018

  • VLDB 2018

  • ACM SIGMOD 2017 (Research Track, Demonstrations, and Student Research Competition)

  • ACM SIGMOD 2017 Workshop on Data Management for End-to-End Machine Learning (DEEM)

  • IEEE ICDE 2017

  • USENIX HotCloud 2016

  • ACM SIGMOD 2016 Undergraduate Research Poster Competition


Journal Reviewer:

  • ACM Transactions on Database Systems (TODS) 2017, 2015

  • IEEE Transactions on Knowledge and Data Engineering (TKDE) 2014