Arun Kumar

Assistant Professor
Computer Science and Engineering
University of California, San Diego
Email: arunkk [at] eng [dot] ucsd [dot] edu
Office: 3218 EBU3B (CSE building)

Note: I am looking for excellent PhD and MS students to join my research group in 2019. We have several exciting and cutting-edge projects on data management and systems for machine learning-based analytics. Please see my research page for details.

PhD applicants: Make sure to apply to UCSD CSE by mid-December. PhD admission decisions are handled by a CSE committee; individual faculty cannot make offers. In your application, describe your research experience and explain why you want to join my group.

MS students at UCSD: Sign up for my CSE 291 in Winter 2019 (2018 webpage) if you satisfy the prerequisites. Students who perform outstandingly in this course and the associated research project will be considered for funding to continue their research.


Arun Kumar is an Assistant Professor in the Department of Computer Science and Engineering at the University of California, San Diego. He is a member of the Database Lab and CNS and an affiliate member of the AI Group. His primary research interests are in data management and systems for machine learning/artificial intelligence-based data analytics. Systems and ideas based on his research have been released as part of the MADlib open-source library, shipped as part of products from EMC, Oracle, Cloudera, and IBM, and used internally by Facebook, LogicBlox, Microsoft, and other companies. He is a recipient of the ACM SIGMOD 2014 Best Paper Award, the 2016 Graduate Student Research Award for the best dissertation research in UW-Madison CS, a 2016 Google Faculty Research Award, and a 2018 Hellman Fellowship.

Curriculum Vitae | Research Blog | On Twitter

Recent News

  • New! The Nimbus and Tuple-Oriented Compression papers are both accepted to SIGMOD 2019! I amsterdam and all that.

  • New! Represented CSE/UCSD at the oSTEM National Conference. Excited to see the high interest in computer science and data science! Also happy to spread the word on UCSD's spectacular resources and efforts on inclusivity of LGBTQ+ people.

  • The inaugural edition of SoCal DB Day was a big success! Thank you to all the participating schools and companies.

  • A blog post on the panel discussion I moderated at SIGMOD DEEM Workshop 2018 is now live on the ACM SIGMOD Blog!

  • The SLAB paper is accepted to VLDB 2018 (or 2019?). Hit your ML system with SLAB to prove it is worthy!

  • A big thank you to NSF for funding Project SpeakQL!


My current research focuses on the foundations of advanced data analytics systems that help make the process of building and deploying ML/AI-powered data analytics applications easier (improving the productivity of data scientists and ML/software engineers) and faster (improving runtime performance and introducing accuracy trade-offs). Thus, the key themes of my research are usability, developability, performance, and scalability. I enjoy working on problems that are motivated by real applications and are formally grounded. I also enjoy insightful conversations with practitioners on the frontlines of data analytics.

More details about my research are available on my research group webpage, including current projects and all of our publications.




  • Supun Nakandala (PhD, UCSD)

  • Vraj Shah (PhD, UCSD)

  • Yuhao Zhang (MS, UCSD)

  • Kevin Yang (BS, UCSD)


  • Lingjiao Chen (MS, UW-Madison, 2018)

  • Side Li (BS, UCSD, 2018)

  • Anthony Thomas (MS, UCSD, 2018)

  • Mingyang Wang (MS, UCSD, 2017)

Technical Service


  • Lead Organizer, SoCal DB Day 2018

  • Co-Chair, ACM SIGMOD 2018 Workshop on Data Management for End-to-End Machine Learning (DEEM)

  • Organizing Committee, ACM SIGKDD 2018 Workshop on Common Model Infrastructure (CMI)

  • Organizing Committee, Extremely Large Databases (XLDB) 2018

Program Committee:

  • SysML 2019

  • ACM SIGMOD 2019, 2018

  • VLDB 2019, 2018

  • ACM SIGMOD 2017 (Research Track, Demonstrations, and Student Research Competition)

  • ACM SIGMOD 2017 Workshop on Data Management for End-to-End Machine Learning (DEEM)

  • IEEE ICDE 2017

  • USENIX HotCloud 2016

  • ACM SIGMOD 2016 Undergraduate Research Poster Competition


  • ACM Transactions on Database Systems (TODS) 2017, 2015

  • IEEE Transactions on Knowledge and Data Engineering (TKDE) 2014