LLM agents and embodied AI: Solving long-horizon tasks with LLM planners for program synthesis and tool use, or by learning hierarchical policies.

Foundational vision-language models: Data-efficient multimodal pretraining, data augmentation and question decomposition.
Generative models: Controllable diffusion models for augmented reality and autonomous driving simulation. High-resolution 3D-aware GANs.

Robustness and fairness: Algorithms and datasets for robust and equitable AI despite data biases, geographic variations and social boundaries.

Privacy: Differential privacy, federated learning and computational sensors to safeguard personally identifiable information in visual data.
Vision-language applications: Open-vocabulary detection and automated open-world AI DevOps using VLMs and LLMs.