Hao Su

Assistant Professor at UC San Diego

Bldg EBU3B #4114
Dept. of Computer Science and Engineering
UC San Diego, La Jolla, USA

haosu AT eng.ucsd.edu / bio / google scholar / publication

Artificial Intelligence Group, Center for Visual Computing,
Halicioğlu Data Science Institute, Contextual Robotics Institute


Research Statement

My lab (SU Lab) works on Structure Understanding, Shape Understanding, and Scene Understanding problems that are relevant to artificial intelligence. I am interested in both the theories and the algorithms that address these problems. My publications appear in machine learning, computer vision, computer graphics, and robotics journals and conferences.

I am leading the construction of ShapeNet, a large-scale 3D-centric knowledge base of objects (SGP Dataset Award), and used to work on ImageNet, a large-scale 2D image database (PAMI Mark Everingham Prize).

Applications of my research include robotics, autonomous driving, virtual/augmented reality, smart manufacturing, etc.

Research Overview

Computer Vision and Computer Graphics
  • Joint Analysis of 2D Images and 3D Shapes
  • Crowd-sourcing for Large-scale Dataset Construction
  • Scene Understanding
  • Deep Understanding
Statistics and Optimization
  • Large-scale Optimization
  • Large-scale Graph Analysis
  • Multivariate Density Estimation


An introductory course on computer vision for junior undergraduate students.
A revamped version of the previous 3D ML course with more references to classical materials on geometry.
Broad topics in deep learning, shape analysis, and 3D machine learning are covered.
A co-organized tutorial on 3D deep learning techniques, given at CVPR2017 held in Hawaii. Some representative works of the CV/CG community on this topic from recent years are selected and discussed. This is extended from a graduate course I taught in the spring quarter of 2017 at Stanford. Many thanks to my colleagues.

Talk slides

SU Lab Research Report of 2018-2019 (Understanding 3D Environments for Interactions)
Edited from invited talks for CVPR2019/RSS2019 (updated on July 3, 2019)
The mission and big picture of research happening in SU Lab: learning to interact with the environment. It describes the extension of SU Lab's research focus from deep 3D representation learning to broader topics of artificial intelligence for interacting with the environment. Not all papers published in the year are included in the report. Missing topics are binary neural networks and adversarial defense.
Towards Attack-Agnostic Defense for 2D and 3D Recognition
Invited talk at the Workshop of AdvML in CVPR2019 (updated on July 3, 2019)
A summary of the work on 2D/3D adversarial defense in 2018-2019. The main messages are: (1) lower-dimensional data seems to be easier to defend; and (2) defending in lower resolution seems to be more attack-agnostic.
Introduction to 3D Deep Learning
Invited talk at GRASP Lab of UPenn (updated on March 1, 2018)
A very quick introduction to 3D deep learning, primarily based upon my own work (pardon me if I skipped some important historical work (e.g., Shape from Shading), or missed your important and interesting recent papers). A much more comprehensive version can be found in the course section.
3D Deep Learning on Geometric Forms
Invited talk at NIPS workshop on 3D Deep Learning (updated in Dec, 2016)
How to consume or generate irregular representations for networks, such as sets of points and geometric primitives. Based upon 3 latest papers of mine.
Synthesize for Learning
Invited talk at 3DV workshop on Understanding 3D and Visuo-Motor Learning (updated in Sep, 2016)
Use synthetic data to train learning algorithms for applications such as viewpoint estimation, human pose estimation, and robot perception. Based upon 5 recent papers of mine.


New [Robo] S4G: Amodal Single-view Single-Shot SE(3) Grasp Detection in Cluttered Scenes
Yuzhe Qin*, Rui Chen*, Hao Zhu, Meng Song, Jing Xu, Hao Su
CoRL 2019
We studied the problem of 6-DoF grasping by a parallel gripper in a cluttered scene captured using a commodity depth sensor from a single viewpoint. Our learning-based approach, trained in a synthetic scene, can work well in real-world scenarios, with improved speed and success rate compared with state-of-the-art methods.
New [ML] Mapping State Space using Landmarks for Universal Goal Reaching
Zhiao Huang*, Fangchen Liu*, Hao Su
NeurIPS 2019
Learning a structured model and combining it with RL algorithms are important for reasoning and planning over long horizons. We propose a sample-based method to dynamically map the visited state space and demonstrate its empirical advantage in routing and exploration in several challenging RL tasks, including the control of locomotion and robot gripper, navigation, and agent-object interaction.
New [CG] StructureNet: Hierarchical Graph Networks for 3D Shape Generation
Kaichun Mo*, Paul Guerrero*, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, Leonidas J. Guibas
We introduce a hierarchical graph network for learning structure-aware shape generation which (i) can directly encode shape parts represented as n-ary graphs; (ii) can be robustly trained on large and complex shape families such as PartNet; and (iii) can be used to generate a great diversity of realistic structured shape geometries with both continuous geometric and discrete structural variations.
New [CV] Point-based Multi-view Stereo Network
Rui Chen*, Songfang Han*, Jing Xu, Hao Su
ICCV 2019 (oral)
An iterative framework to predict the depth of a scene using point cloud representation from multiple views. Use deep learning over the kNN graph to predict the residual for geometry estimation refinement.
New [ML] Extending Adversarial Attacks and Defenses to Deep 3D Point Cloud Classifiers
Daniel Liu (Torrey Pines High School), Ronald Yu, Hao Su
ICIP 2019
An experiment report on adversarial attack and defense over PointNet. I am proud that Daniel finished the whole paper almost all by himself, with marginal supervision from Ronald and me.
New [SYS] Towards Fast and Energy-Efficient Binarized Neural Network Inference
Cheng Fu, Shilin Zhu, Hao Su, Ching-En Lee, Jishen Zhao
FPGA 2019
An FPGA design to improve the inference speed and energy efficiency of binarized neural networks.
New [CG] Deep View Synthesis from Sparse Photometric Images
Zexiang Xu, Sai Bi, Kalyan Sunkavalli, Sunil Hadap, Hao Su, Ravi Ramamoorthi
In this paper, we synthesize novel viewpoints across a wide range of viewing directions (covering a 60 deg cone) from a sparse set of just six viewing directions.
New [ML] Transfer value or policy? A Value-Centric Framework Towards Transferrable Continuous Reinforcement Learning
Xingchao Liu*, Tongzhou Mu*, Hao Su
How to build sample-efficient transfer learning algorithms in the continuous control setting? This paper shows that the commonly used policy-based methods are prone to getting stuck in local minima, while value-based methods can converge much faster when transferred to a new environment.
New [CV&ML] A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks
Yinghao Xu, Xin Dong, Yudian Li, Hao Su
CVPR 2019
A simple learning-based binary neural network pruning scheme.
New [CV] PartNet: A Large-scale Benchmark for Fine-grained and Hierarchical Part-level 3D Object Understanding
Kaichun Mo, Shilin Zhu, Angel X. Chang, Li Yi, Subarna Tripathi, Leonidas J. Guibas, Hao Su
CVPR 2019
A 3D object database with fine-grained and hierarchical part annotations, built to assist segmentation and affordance research.
New [CV] ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving
Xibin Song, Peng Wang, Dingfu Zhou, Rui Zhu, Chenye Guan, Yuchao Dai, Hao Su, Hongdong Li, Ruigang Yang
CVPR 2019
The first large-scale database suitable for 3D car instance understanding, ApolloCar3D, collected by Baidu. The dataset contains 5,277 driving images and over 60K car instances, where each car is fitted with an industry-grade 3D CAD model with absolute model size and semantically labelled keypoints.
New [ML&CV] Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?
Shilin Zhu, Xin Dong, Hao Su
CVPR 2019
An ensemble of binary neural networks has better stability and robustness, and may perform as well as floating-point networks.
New [ML&CV] Adversarial Defense by Stratified Convolutional Sparse Coding
Bo Sun, Nian-hsuan Tsai, Fangchen Liu, Ronald Yu, Hao Su
CVPR 2019
An attack-agnostic defense mechanism for neural networks.
New [CV&CG&ML] Deep 3D Representation Learning
Hao Su
Ph.D. thesis (Best Ph.D. Thesis Award Honorable Mention by ACM SIGGRAPH)
A summary of my work on 3D deep learning between 2014 and 2018. I also include a projection of the future directions in this field and some open problems in my mind.
[CV&CG] DeepSpline: Data-Driven Reconstruction of Parametric Curves and Surfaces
Jun Gao, Chengcheng Tang, Vignesh Ganapathi-Subramanian, Jiahui Huang, Hao Su, Leonidas Guibas
Traditional approaches to spline fitting require a strong initialization. In this work, we explore a deep-learning-based approach to address the challenge. This work and the DeepPrimitive work (below) are internship projects of Jiahui Huang (currently Ph.D. at Tsinghua) and Jun Gao (currently at UToronto) to explore how to convert an image to vector graphics. However, the two papers have not fully addressed the challenges yet.
[CV&CG] DeepPrimitive: Image decomposition by layered primitive detection
Jiahui Huang, Jun Gao, Vignesh Ganapathi-Subramanian, Hao Su, Yin Liu, Chengcheng Tang, Leonidas Guibas
Computational Visual Media 2018, Vol 4
Build a geometric interpretation of an image by discovering simple primitives and generating a layer-wise representation.
[CG&ML] Deep Part Induction from Articulated Object Pairs
Li Yi, Haibin Huang, Difan Liu, Evangelos Kalogerakis, Hao Su, Leonidas Guibas
SIGGRAPH Asia 2018
A weakly-supervised method to discover 3D object parts driven by functionality. Understand the structures of objects from a pair of similar instances under different object articulation states. An ICP-like deep learning based framework.
[ML] Deep Functional Dictionaries: Learning Consistent Semantic Structures on 3D Models from Functions
Minhyuk Sung, Hao Su, Ronald Yu, Leonidas Guibas
NIPS 2018
A weakly-supervised method to discover 3D object parts by neural networks, leveraging the implicit consistency induced by deep neural networks.
[CV] Geometry-Guided CNN for Self-supervised Video Representation Learning
Chuang Gan, Boqing Gong, Kun Liu, Hao Su, Leonidas Guibas
CVPR 2018
To address the training data scarcity problem in video representation learning, we explore geometry cues, which can be acquired without cost, as auxiliary signals for semantic understanding.
[CV] View Extrapolation of Human Body from a Single Image
Hao Zhu, Hao Su, Peng Wang, Xun Cao, Ruigang Yang
CVPR 2018
Human bodies are deformable shapes and applying existing approaches to synthesize novel views of human bodies from a single view is difficult. This paper disentangles novel-view synthesis as a depth estimation problem and a geometry-based flow prediction problem.
[CV] Beyond Holistic Object Recognition: Enriching Image Understanding with Part States
Cewu Lu, Hao Su, Yongyi Lu, Li Yi, Chikeung Tang, Leonidas Guibas
CVPR 2018
Introduce the concept and computational model of "part state", an intermediate representation for object interaction modeling and image captioning. A dataset is provided, as well.
[CV] Frustum PointNets for 3D Object Detection from RGB-D Data
Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, Leonidas Guibas
CVPR 2018
Part of the PointNet series of work, focusing on amodal 3D object detection and instance segmentation. Leading performance on the KITTI 3D object detection benchmark (RGB-D data for autonomous driving) as of Nov 17, 2017.
[CV] Cross-modal Attribute Transfer for Rescaling 3D Models
Lin Shao, Angel X. Chang, Hao Su, Manolis Savva, Leonidas Guibas
3DV 2017
Transfer geometrical and physical attributes from product catalogs to 3D models in ShapeNet.
[CV&ML] PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas
NIPS 2017
Deep network architecture for processing point clouds of cluttered scenes that often have non-uniform sampling density. Builds upon PointNet, our CVPR2017 paper.
[CG] ComplementMe: Weakly-Supervised Component Suggestions for 3D Modeling
Minhyuk Sung, Hao Su, Vladimir G. Kim, Siddhartha Chaudhuri, and Leonidas Guibas
SIGGRAPH Asia 2017
Deep-learning-based approach for part-based 3D model synthesis. Given a partial construction (e.g., a chair in design), this method proposes a new component (e.g., an arm) that is compatible in style with the existing shape.
[CG] Learning Hierarchical Shape Segmentation and Labeling from Online Repositories
Li Yi, Leonidas J. Guibas, Aaron Hertzmann, Vladimir G. Kim, Hao Su, Ersin Yumer
Learn a consistent part hierarchy from a large collection of 3D models with scene-graph structure.
[CV] A Point Set Generation Network for 3D Object Reconstruction from a Single Image
Hao Su*, Haoqiang Fan*, Leonidas Guibas
CVPR 2017 (oral)
Build a generative neural network to directly output a set of unordered points. As applications, it can be used for single-image based 3D reconstruction and shape completion.
[CV] PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
Hao Su*, Charles Qi*, Kaichun Mo, Leonidas Guibas
CVPR 2017 (oral)
Build a neural network to directly consume an unordered point cloud as input, without converting to other 3D representations such as voxel grids first. Rich theoretical and empirical analyses are provided.
[CV] SyncSpecCNN: Synchronized Spectral CNN for 3D Shape Segmentation
Li Yi, Hao Su, Xingwen Guo, Leonidas Guibas
CVPR 2017 (spotlight)
A convolutional neural network on generic graphs of non-isometric structures. Spectral analysis (spectral domain synchronization) is conducted to enable effective kernel weight sharing. Part segmentation as an application.
[CV] Learning Shape Abstractions by Assembling Volumetric Primitives
Shubham Tulsiani, Hao Su, Leonidas Guibas, Alexei A. Efros, Jitendra Malik
CVPR 2017
Learn to abstract polygonal meshes by a flexible number of simple primitives such as cuboids. The abstraction is category consistent.
[CV] Learning Non-Lambertian Object Intrinsics across ShapeNet Categories
Jian Shi, Yue Dong, Hao Su, Stella X. Yu
CVPR 2017
Show that the material attributes of ShapeNet models can be useful to train algorithms for understanding the material and optical properties in Internet photos.
[CV] Volumetric and Multi-View CNNs for Object Classification on 3D Data
Hao Su*, Charles Qi*, Matthias Niessner, Angela Dai, Mengyuan Yan, Leonidas Guibas
CVPR 2016 (spotlight oral)
Novel architectures for 3DCNNs that take volumetric or multi-view representations as input.
[CV] Multilinear Hyperplane Hashing
Xianglong Liu, Xinjie Fan, Cheng Deng, Hao Su, Dacheng Tao
CVPR 2016
Efficient approximate point-to-plane search.
[CV] Synthesizing Training Images for Boosting Human 3D Pose Estimation
Wenzheng Chen, Huan Wang, Yangyan Li, Hao Su, Zhenhua Wang, Chenghe Tu, Dani Lischinski, Daniel Cohen-Or, Baoquan Chen
3DV 2016 (oral)
Extend RenderForCNN (my ICCV'15 paper) for 3D human pose estimation focusing on domain adaptation and data augmentation by automatic texture transfer.
[CV] ObjectNet3D: A Large Scale Database for 3D Object Recognition
Yu Xiang, Wonhui Kim, Wei Chen, Jingwei Ji, Christopher Choy, Hao Su, Roozbeh Mottaghi, Leonidas Guibas, Silvio Savarese
ECCV 2016 (spotlight oral)
A large-scale image-shape database by linking ImageNet and ShapeNet at instance level.
[CG] 3D Attention-Driven Depth Acquisition for Object Identification
Kai Xu, Yifei Shi, Lintao Zheng, Junyu Zhang, Min Liu, Hui Huang, Hao Su, Daniel Cohen-Or, Baoquan Chen
Transactions on Graphics (SIGGRAPH ASIA 2016)
Teach robots to identify objects with the fewest movements and scans. We trained a 3D attention model by reinforcement learning.
[CG] Unsupervised Texture Transfer from Images to Model Collections
Tuanfeng Y. Wang, Hao Su, Qixing Huang, Jingwei Huang, Leonidas J. Guibas, Niloy J. Mitra
Transactions on Graphics (SIGGRAPH ASIA 2016)
Transfer textures from product images to 3D shapes. The increased texture variation in ShapeNet is validated to be effective for RenderForCNN (my ICCV'15 paper).
[CG] A Scalable Active Framework for Region Annotation in 3D Shape Collections
Li Yi, Vladimir G. Kim, Duygu Ceylan, I-Chao Shen, Mengyuan Yan, Hao Su, Cewu Lu, Qixing Huang, Alla Sheffer, Leonidas Guibas
Transactions on Graphics (SIGGRAPH ASIA 2016)
Annotate the parts for ShapeNet by crowd-sourcing and label propagation with high efficiency and accuracy.
SHREC'16 Track: Large-Scale 3D Shape Retrieval from ShapeNet Core55
M. Savva, F. Yu, Hao Su, M. Aono, B. Chen, D. Cohen-Or, W. Deng, H. Su, S. Bai, X. Bai, N. Fish, J. Han, E. Kalogerakis, E. G. Learned-Miller, Y. Li, M. Liao, S. Maji, A. Tatsuma, Y. Wang, N. Zhang, Z. Zhou
Eurographics SHREC2016 Workshop Report
Technical report for SHREC'16, the most renowned challenge for 3D shape retrieval.
[CV&CG] ShapeNet: An Information-Rich 3D Model Repository
Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva*, Shuran Song, Hao Su*, Jianxiong Xiao, Li Yi, Fisher Yu
Corresponding author, student co-lead, arxiv, 2016
The official report of ShapeNet, an object-centric database of semantics, geometry and physics.
[CV] 3D-Assisted Image Feature Synthesis for Novel Views of an Object
Hao Su*, Fan Wang*, Li Yi, Leonidas Guibas
ICCV 2015 (oral, acceptance rate: 2%)
Synthesize features at novel views of a 3D object from the observed viewpoint, leveraging the geometric priors from ShapeNet.
[CV] Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views
Hao Su*, Charles Qi*, Yangyan Li, Leonidas Guibas
ICCV 2015 (oral, acceptance rate: 2%)
Show that large-scale synthetic data rendered from a virtual world may greatly help deep learning work in the real world. Deliver a state-of-the-art viewpoint estimator.
[CG] Joint Embeddings of Shapes and Images via CNN Image Purification
Hao Su*, Yangyan Li*, Charles Qi, Noa Fish, Daniel Cohen-Or, Leonidas Guibas
Transactions on Graphics (SIGGRAPH Asia 2015)
Cross-modality learning of 3D shapes and 2D images by neural networks. A joint embedding space that is sensitive to 3D geometry difference but agnostic to other nuisances is constructed.
[CV] ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky*, Jia Deng*, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei
IJCV 2015
The technical report for ImageNet Challenge.
[CG] Estimating Image Depth using Shape Collections
Hao Su, Qixing Huang, Niloy Mitra, Yangyan Li, Leonidas Guibas
Transactions on Graphics (SIGGRAPH 2014)
Learn to estimate the depth from a single input image assisted by geometric priors from a 3D shape collection (later merged into ShapeNet).
[CG] Fine-Grained Semi-Supervised Labeling of Large Shape Collections
Qixing Huang, Hao Su, Leonidas Guibas
Transactions on Graphics (SIGGRAPH Asia 2013)
Fine-grained 3D shape classification.
[CV] Multi-level structured image coding on high-dimensional image representation
Li-Jia Li*, Jun Zhu*, Hao Su, Eric P. Xing, Li Fei-Fei
ACCV 2013
Multi-layer sparse coding for compressing ObjectBank representation.
[CV] Crowd-sourcing Annotations for Visual Object Detection
Hao Su, Jia Deng, Li Fei-Fei
AAAI 2012 Human Computation Workshop
A system to annotate object bounding boxes for ImageNet by crowd-sourcing. This system is used to collect bounding boxes for ImageNet Large Scale Visual Recognition Challenges.
[CV] Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification
Hao Su*, Li-Jia Li*, Eric P. Xing, Li Fei-Fei
NIPS 2010 (one of the top 10 most cited NIPS papers since 2010)
Learn to describe scenes by responses from object detectors. Can be viewed as a layer-wise trained CNN (Gradient-HoG-Part-Object-Scene hierarchy).
[CV] Learning a dense multi-view representation for detection, viewpoint classification and synthesis of object categories
Hao Su*, Min Sun*, Li Fei-Fei, Silvio Savarese
ICCV 2009 (oral, acceptance rate: 4%)
Continuous viewpoint estimation by a graphical model.
[CV] A Multi-View Probabilistic Model for 3D Object Classes
Hao Su*, Min Sun*, Li Fei-Fei, Silvio Savarese
CVPR 2009
Discrete viewpoint estimation by a graphical model.
[CV] Construction and Analysis of a Large Scale Image Ontology
Jia Deng, Hao Su, Minh Do, Kai Li, Li Fei-Fei
VSS 2009
ImageNet analysis paper.

Statistics and Optimization

[CV&ML] FPNN: Field Probing Neural Networks for 3D Data
Yangyan Li, Soeren Pirk, Hao Su, Charles R. Qi, Leonidas J. Guibas
NIPS 2016
A very efficient 3D deep learning method for volumetric data processing that takes advantage of data sparsity in 3D fields.
[ML] Density Estimation via Discrepancy
Kun Yang, Hao Su, Wing Wong
arXiv:1509.06831, 2015
Estimating the density of a population by adaptively partitioning the space according to discrepancy criteria.
[ML] co-BPM: a Bayesian Model for Estimating Divergence and Distance of Distributions
Kun Yang, Hao Su, Wing Wong
Journal of Computational and Graphical Statistics
Measuring discrepancy of two samples by a Bayesian approach.
[DB&ML] Reverse Top-k Search using Random Walk with Restart
Adams Wei Yu, Nikos Mamoulis, Hao Su
VLDB 2014
Given a node in a large-scale transition graph, how to efficiently find those nodes that have this given node as a top-k nearest neighbour (the reverse top-k search problem). The inverse of the PageRank problem.
[ML] Efficient Euclidean Projections onto the Intersection of Norm Balls
Hao Su*, Adams W. Yu*, Li Fei-Fei
ICML 2012
The sparse-group LASSO model is a linear regression model that encourages simultaneous element-wise sparsity and group-wise sparsity. This work studies the key component in optimizing such a model by projection-based algorithms.


[GEO] Pathlet Learning for Compressing and Planning Trajectories
Chen Chen, Hao Su, Qixing Huang, Lin Zhang, Leonidas Guibas
Discover common sub-structures from a large set of taxi trajectories. Formulated as a linear programming problem, in a spirit similar to probabilistic topic models (pLSA).
