Generative AI. Responsible AI

LLM agents for visual reasoning and embodied AI. Diffusion models for 3D synthesis. Privacy-aware, fair and robust computer vision.

LLM agents and embodied AI: Solving long-horizon tasks with LLM planners for program synthesis and tool use, or by learning hierarchical policies.

Foundational vision-language models: Data-efficient multimodal pretraining, data augmentation and question decomposition.

Generative models: Controllable diffusion models for augmented reality and autonomous driving simulation. High-resolution 3D-aware GANs.

Robustness and fairness: Algorithms and datasets for robust and equitable AI despite data biases, geographic variations and social boundaries.

Privacy: Differential privacy, federated learning and computational sensors to safeguard personally identifiable information in visual data.

Vision-language applications: Open-vocabulary detection and automated open-world AI DevOps using VLMs and LLMs.