Talk at Qualcomm by Hao Su, Jiayuan Gu, and
Minghua Liu. (March 31st, 2020)
Tutorial on datasets, classification, segmentation, detection, and
reconstruction in 3D deep learning.
Learning-based 3D Capturing
Talk at Qualcomm by Rui Chen, Songfang Han, and
Shuo Cheng. (March 31st, 2020)
Multi-view Stereo (MVS) is playing an increasingly important role in
various fields, e.g., AR/VR and autonomous driving. In this session, we
will introduce the theory and applications of MVS, analyze classical
and recent learning-based MVS methods, and present three papers from our
team. Finally, we discuss possible future directions for our research on
learning-based MVS.
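The rectified two-view case illustrates the core geometric relation behind MVS depth estimation: depth is inversely proportional to disparity. Below is a minimal sketch of that relation; the focal length, baseline, and disparity values are illustrative numbers, not taken from any paper presented in the session.

```python
import numpy as np

def depth_from_disparity(disparity, focal_length, baseline):
    """Rectified-stereo relation: depth = f * B / disparity.

    disparity    -- per-pixel disparity in pixels
    focal_length -- focal length in pixels
    baseline     -- camera baseline in meters
    """
    disparity = np.asarray(disparity, dtype=float)
    return focal_length * baseline / disparity

# Illustrative example: three pixels with disparities 10, 20, 40 px,
# f = 800 px, B = 0.1 m. Larger disparity -> smaller depth.
depths = depth_from_disparity([10.0, 20.0, 40.0], focal_length=800.0, baseline=0.1)
print(depths)
```

Learning-based MVS methods effectively search over many such depth hypotheses per pixel, scoring them with learned photometric features instead of raw intensities.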
Learning for Interaction
Talk at Qualcomm by Fanbo Xiang, Yuzhe Qin, Zhiao Huang, and
Fangchen Liu. (March 31st, 2020)
Artificial intelligence not only needs to perceive the world but also
needs to interact with the environment to accomplish specific goals. For
example, the tight coupling of perception and interaction will help robots
and autonomous vehicles make decisions by modeling the complex world. We
emphasize the importance of understanding the environment structure for
interaction tasks. We first talk about how we help agents interact with the
environment by understanding the structure of the environment state. By properly
abstracting the state space, we show that combining search algorithms and
reinforcement learning can greatly improve generalization ability and data
efficiency compared to previous methods. Next, we will talk about how learning
methods are applied to real-world problems. We have developed SAPIEN, a robotics
research platform that provides rich physical simulations and scenarios.
Finally, we will show that we can analyze 3D scenes directly through supervised
learning for the robot grasping problem.
Concepts and Graph-based Reasoning
Talk at Qualcomm by Jiayuan Gu, Tongzhou Mu, and
Hao Tang. (March 31st, 2020)
A good object-centric abstraction can enable an agent, such as an
autonomous vehicle or a domestic robot, to adapt to a new
environment quickly, without intensive offline re-training. To
this end, (I) we propose a novel framework, Task-driven Entity
Abstraction (TEA), to learn task-relevant entities from raw visual
observations in an unsupervised fashion. TEA can provide
high-quality object discovery results, which in turn also benefits
solving new tasks in terms of compositional and spatial
generalizability. (II) Current graph neural networks (GNNs)
lack generalizability with respect to scales (graph sizes, graph
diameters, edge weights, etc.). Therefore, we propose an
architecture, IterGNN, to approximate iterative graph algorithms
without supervised information about iteration numbers during
training. On the shortest path length problem, the final model
generalizes to graphs of diameter as large as 1000 while being
trained only on graphs of diameter less than 30, far surpassing
existing GNN methods.
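The idea of reusing one GNN block for an adaptive number of iterations parallels classical iterative graph algorithms. As a rough analogue with no learned components (the small graph below is made up for illustration), shortest path lengths are the fixed point of Bellman-Ford-style min-aggregation message passing, with iteration count determined by convergence rather than fixed in advance:

```python
import math

def shortest_path_lengths(num_nodes, edges, source):
    """Bellman-Ford as iterated min-aggregation message passing.

    edges is a list of undirected weighted edges (u, v, w).
    Iterates until a fixed point, mirroring an adaptive (rather than
    hard-coded) number of message-passing rounds.
    """
    dist = [math.inf] * num_nodes
    dist[source] = 0.0
    for _ in range(num_nodes):
        updated = False
        for u, v, w in edges:
            for a, b in ((u, v), (v, u)):  # both directions of each edge
                if dist[a] + w < dist[b]:
                    dist[b] = dist[a] + w
                    updated = True
        if not updated:  # converged: stop iterating
            break
    return dist

edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0)]
print(shortest_path_lengths(3, edges, source=0))  # [0.0, 1.0, 3.0]
```

A GNN layer that aggregates neighbor messages with a min-like operator can learn to imitate exactly this update, which is why iterating one such layer more times at test time extends its reach to much larger graphs.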
SU Lab Research Report of 2018-2019 (Understanding 3D Environments for Interactions)
Edited from Invited talks for CVPR2019/RSS2019 (updated on July 3, 2019)
The mission and big picture of research in SU Lab: learning to interact with the environment.
It describes the extension of SU Lab's research focus from deep 3D representation learning to broader topics in artificial intelligence for interacting with the environment. Not all papers published during the year are included in the report; missing topics are binary neural networks and adversarial defense.
Towards Attack-Agnostic Defense for 2D and 3D Recognition
Invited talk at the Workshop of AdvML in CVPR2019 (updated on July 3, 2019)
A summary of the work on 2D/3D adversarial defense in 2018-2019. The main messages are: (1) lower-dimensional data seems to be easier to defend; and (2) defending at lower resolution seems to be more attack-agnostic.
Synthesize for Learning
Invited talk at 3DV workshop on Understanding 3D and Visuo-Motor Learning (updated in September 2016)