PhD Computer Science
LinkedIn | Google Scholar
Computer Science and Engineering
University of California, San Diego
Advisors: Rajesh Gupta & Dean Tullsen
I am currently working with Google on MLIR-based code generation for NVIDIA GPUs. Before Google, I worked at NVIDIA, AMD Research, and Qualcomm. I have 15+ years of professional experience. Below is a list of selected projects and public source code I have contributed to:
OpenXLA (The OpenXLA Project brings together a community of developers and leading AI/ML teams to accelerate ML.)
Improved half-precision (F16) and single-precision (F32) OpenXLA codegen performance targeting NVIDIA A100 Tensor Cores by 1.65x and 1.53x, respectively [Slides | Talk]
NVIDIA/CUTLASS (CUDA Templates for Linear Algebra Subroutines) [BibTeX]
CUDA kernel development in CUTLASS to enable deep learning primitives
Convolution (forward and backward) kernel development for NVIDIA Ampere, Turing, and Volta architectures targeting tensor cores
Gaussian complex GEMMs using the 3M complex multiply algorithm targeting NVIDIA Ampere DMMA.884 tensor operations for the F64 data type [Source code]
Complex GEMMs for F32 data targeting NVIDIA Ampere HMMA.1688 tensor operations for the Tensor Float 32 (TF32) data type [Source code]
Warp Matrix Multiply Accumulate for F16, S8, and S4 data types targeting the NVIDIA Turing architecture [Source code]
Single-stage Matrix-Multiply Accumulate (MMA) pipeline [Source code]
LLVM-inspired data structures to store operations' compile-time configuration and run-time arguments [Source code]
CUTLASS profiler to ensure functional correctness and measure performance of GEMM operations [Source code]
Please see my CV for more details.
I completed my Ph.D. in Computer Science at the University of California, San Diego. My expertise is in GPUs, computer architecture, and compilers. The following is a list of my publications and patents:
Manish Gupta (UC San Diego 2017)
Manish Gupta, Vilas Sridharan, David Roberts, Andreas Prodromou, Ashish Venkat, Dean Tullsen and Rajesh Gupta. In High-Performance Computer Architecture (HPCA 2018) [ PPT | Talk | BibTeX ]
Manish Gupta, Daniel Lowell, John Kalamatianos, Steven Raasch, Vilas Sridharan, Dean Tullsen, Rajesh Gupta. In Design Automation Conference (DAC 2017) [ PPT | Talk | BibTeX]
Manish Gupta, Abbas Rahimi, Daniel Lowell, John Kalamatianos, Dean Tullsen, Rajesh Gupta. In Silicon Errors in Logic – System Effects (SELSE 2017) [ PPT | Talk | BibTeX ]
Manish Gupta, David Roberts, Mitesh Meswani, Vilas Sridharan, Dean Tullsen, Rajesh Gupta. In International Symposium on Memory Systems (MEMSYS 2016) [ PPT | Talk | BibTeX ]
Alan Leung, Manish Gupta, Yuvraj Agarwal, Rajesh Gupta, Ranjit Jhala, Sorin Lerner. In Programming Language Design and Implementation (PLDI 2012) [BibTeX]
I have multiple patent applications pending approval at the United States Patent and Trademark Office (USPTO). The following is a selected list of my patents:
Manish Gupta, David Roberts, Mitesh Meswani, Vilas Sridharan, Steven Raasch, Daniel Lowell
Manish Gupta, David Roberts, Vilas Sridharan
Manish Gupta, Daniel Lowell
Manish Gupta, Daniel Lowell
Basic Data Structures & OO Design, Teaching Assistant, Fall 2015. [ Eval Section A00, Eval Section B00]
Software for Embedded Systems, Teaching Assistant, Spring 2014.
Email: mgupta dot iitr at gmail dot com
Office: NVIDIA Endeavor
2788 San Tomas Expy, CA 95051