Full List of Publications

  • Stop That Join! Discarding Dimension Tables when Learning High Capacity Classifiers
    Vraj Shah, Arun Kumar, and Xiaojin Zhu
    Under submission [PDF on arXiv]
  • SpeakQL: Towards Speech-driven Multi-modal Querying
    Dharmil Chandarana, Vraj Shah, Arun Kumar, and Lawrence Saul
    ACM SIGMOD 2017 HILDA Workshop (To Appear) [PDF]
  • Model-based Pricing: Do Not Pay for More than What You Learn!
    Lingjiao Chen, Paraschos Koutris, and Arun Kumar
    ACM SIGMOD 2017 DEEM Workshop (To Appear) [PDF]
  • Bolt-on Differential Privacy for Scalable Stochastic Gradient Descent-based Analytics
    Xi Wu, Fengan Li, Arun Kumar, Kamalika Chaudhuri, Somesh Jha, and Jeffrey Naughton
    ACM SIGMOD 2017 (To Appear) [PDF on arXiv]
  • Towards Linear Algebra over Normalized Data
    Lingjiao Chen, Arun Kumar, Jeffrey Naughton, and Jignesh M. Patel
    [PDF on arXiv]
  • Data Management in Machine Learning: Challenges, Techniques, and Systems
    Arun Kumar, Matthias Boehm, and Jun Yang
    ACM SIGMOD 2017 Tutorial (To Appear)
  • Cerebro: A System to Manage Deep Learning for Relational Data Analytics
    Arun Kumar
    CIDR 2017 (Abstract) [Paper]
  • Learning Over Joins
    PhD Dissertation. UW-Madison 2016 [PDF] [Talk at UCSD]
    Wisconsin CS 2016 Graduate Student Research Award for best dissertation research
  • To Join or Not to Join? Thinking Twice about Joins before Feature Selection
    Arun Kumar, Jeffrey Naughton, Jignesh M. Patel, and Xiaojin Zhu
    ACM SIGMOD 2016 [Paper] [Tech Report] [Code and Data]
  • Model Selection Management Systems: The Next Frontier of Advanced Analytics
    Arun Kumar, Robert McCann, Jeffrey Naughton, and Jignesh M. Patel
    ACM SIGMOD Record Dec 2015 (Vision Track) [Paper]
  • A Survey of the Existing Landscape of ML Systems
    Arun Kumar, Robert McCann, Jeffrey Naughton, and Jignesh M. Patel
    UW-Madison Technical Report TR1827 [Paper]
  • Demonstration of Santoku: Optimizing Machine Learning over Normalized Data
    Arun Kumar, Mona Jalal, Boqun Yan, Jeffrey Naughton, and Jignesh M. Patel
    VLDB 2015 (Demo) [Paper] [Code and Data]
  • Learning Generalized Linear Models Over Normalized Data
    Arun Kumar, Jeffrey Naughton, and Jignesh M. Patel
    ACM SIGMOD 2015 [Paper] [Code]
  • Materialization Optimizations for Feature Selection Workloads
    Ce Zhang, Arun Kumar, and Christopher Ré
    ACM SIGMOD 2014 [Paper]
    Best Paper Award; Invited to ACM TODS 2016
  • Distributed and Scalable PCA in the Cloud
    Arun Kumar, Nikos Karampatziakis, Paul Mineiro, Markus Weimer, and Vijay Narayanan
    NIPS BigLearn 2013 [Paper]
  • Feature Selection in Enterprise Analytics: A Demonstration using an R-based Data Analytics System
    Pradap Konda, Arun Kumar, Christopher Ré, and Vaishnavi Sashikanth
    VLDB 2013 (Demo) [Paper]
  • Hazy: Making it Easier to Build and Maintain Big-data Analytics
    Arun Kumar, Feng Niu, and Christopher Ré
    ACM Queue, 2013 [Paper]
    Invited to CACM, March 2013
  • Brainwash: A Data System for Feature Engineering
    Michael Anderson, Dolan Antenucci, Victor Bittorf, Matthew Burgess, Michael Cafarella, Arun Kumar, Feng Niu, Yongjoo Park, Christopher Ré, and Ce Zhang
    CIDR 2013 (Vision Track) [Paper]
  • Towards a Unified Architecture for in-RDBMS Analytics
    Xixuan Feng*, Arun Kumar*, Benjamin Recht, and Christopher Ré
    ACM SIGMOD 2012 [Paper] [Tech Report] [Code and Data]
  • The MADlib Analytics Library or MAD Skills, the SQL
    Joseph M. Hellerstein, Christopher Ré, Florian Schoppmann, Daisy Zhe Wang, Eugene Fratkin, Aleksander Gorajek, Kee Siong Ng, Caleb Welton, Xixuan Feng, Kun Li, and Arun Kumar
    VLDB 2012 (Industrial Track) [Paper]
  • Probabilistic Management of OCR Data using an RDBMS
    Arun Kumar and Christopher Ré
    VLDB 2012 [Paper] [Tech Report] [Code and Data]
  • On Reducing Delay in Mobile Data Collection-based WSNs
    Arun K. Kumar, Krishna M. Sivalingam, and Adithya Kumar
    Springer Wireless Networks 2012 [Paper]
  • Flexible Multimedia Content Retrieval Using InfoNames
    Arun Kumar, Ashok Anand, Athula Balachandran, Vyas Sekar, Aditya Akella, and Srinivasan Seshan
    ACM SIGCOMM 2010 (Demo) [Paper]
  • InfoNames: An Information-Based Naming Scheme for Multimedia Content
    Arun Kumar, Athula Balachandran, Vyas Sekar, Aditya Akella, and Srinivasan Seshan
    UW-Madison Technical Report TR1677 [Paper]
  • Energy-Efficient Mobile Data Collection in WSNs with Delay Reduction using Wireless Communication
    Arun K. Kumar and Krishna M. Sivalingam
    IEEE/ACM COMSNETS 2010 [Paper]