Mondo-Robotics/DiT4DiT
This is the official code repo for DiT4DiT, a Vision-Action-Model (VAM) framework that combines a video generation model with flow-matching-based action prediction for generalizable robotic manipulation.
GitHub repository with 316 stars and 14 forks.
Language: Python