Embodied AI / VLA / WAM / RL

Jiangang Li

I build robot learning systems that connect perception, language, world dynamics, and action.

Focus

My work centers on embodied intelligence: turning multimodal observations and natural-language goals into reliable robot behavior. I am especially interested in scalable action representations, world-action models (WAM), vision-language-action (VLA) policy learning, data-centric training pipelines, and RL-based post-training for real-robot deployment.

Research Directions

Models that reason through action.

01

Embodied Intelligence

Generalist robot agents that ground task intent in physical state, object interaction, and safety-aware execution.

02

VLA Policies

Vision-language-action models for closed-loop control, instruction following, and cross-robot behavior transfer.

03

World-Action Models

Latent dynamics models that connect future visual states, command context, and action generation.

04

RL Post-Training

Reinforcement learning and correction loops that sharpen policies after supervised multimodal pretraining.

Current Systems

Robot learning infrastructure from data to deployment.

I work on clean-room embodied AI stacks that make action supervision explicit, reusable, and auditable across datasets, robots, and training stages.

OneroWAM

A world-action model training framework with open backbones, multimodal sequence construction, latent reasoning, flow matching, and real-robot deployment gates.

Profiled Action Contracts

A structured action interface for heterogeneous robot datasets, including masks, profiles, task-space semantics, adapter checks, and evidence artifacts.

Robot Data Systems

Dataset ingestion, canonical schema conversion, materialized training chunks, audit tools, and replay or shadow validation before hardware execution.

Writing and Artifacts

Notes, papers, and engineering traces.

2026

Profiled Action Contracts for Robot Learning

Paper draft and review artifacts covering action-space contracts, dataset adapters, robot output adapters, and reproducible evidence tables.

Ongoing

World-Action Model Training

Experiments on multimodal backbones, prior-posterior latent structure, future-state supervision, and data mixture design.

Ongoing

RL and Real-Robot Validation

Workflows for post-training, correction data, safety gates, replay, shadow execution, and staged hardware rollout.

Contact

Open to focused conversations about embodied AI systems.