[CVPR 2026] ForeAct: Steering Your VLA with Efficient Visual Foresight Planning

ForeAct: Steering Your VLA with Efficient Visual Foresight Planning

About

  • ForeAct is a visual foresight planner that empowers VLAs with the ability to anticipate future observations, enabling more informed decision-making.
  • ForeAct is general and plug-and-play: state-of-the-art VLAs can seamlessly incorporate ForeAct without any architectural modification.
  • ForeAct is highly efficient, generating a high-fidelity 640 $\times$ 480 future observation in just 0.33s on a single H100 GPU.

Demo

Watch the video

Usage

Environment Setup

git clone https://github.com/mit-han-lab/foreact
cd foreact
bash environment_setup.sh foreact

Finetune

Download the pretrained weights and prepare your own real-world data (or use our processed real-world data). Update the relevant paths in configs/finetune.yaml, then launch:

bash scripts/run_finetune.sh
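
Before launching, it can help to confirm that the paths you edited in configs/finetune.yaml actually exist on disk. The sketch below is a hypothetical pre-flight helper, not part of this repository. It assumes nothing about the config schema: it scans simple `key: value` lines, treats any value containing a `/` as a filesystem path, and reports those that are missing. Nested YAML and list entries are deliberately ignored.

```python
import os
import re

# Matches simple "key: value" lines; nested YAML structure is ignored on purpose.
_KEY_VALUE = re.compile(r"^\s*([A-Za-z_][\w.-]*)\s*:\s*(.+?)\s*$")


def find_missing_paths(config_text):
    """Return (key, value) pairs whose value looks like a path that does not exist."""
    missing = []
    for line in config_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = _KEY_VALUE.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip("'\"")
        # Heuristic (an assumption): a slash means "path", a scheme means "URL".
        looks_like_path = "/" in value and "://" not in value
        if looks_like_path and not os.path.exists(os.path.expanduser(value)):
            missing.append((key, value))
    return missing


if __name__ == "__main__":
    config_file = "configs/finetune.yaml"
    if os.path.exists(config_file):
        with open(config_file) as handle:
            for key, value in find_missing_paths(handle.read()):
                print(f"{key}: {value} does not exist")
```

Run it from the repository root. An empty output means every path-like value resolved; it does not check that the files have the right contents.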

Inference

### CLI
python app_cli.py --checkpoint_path path/to/model --prompt "" --input_image path/to/image --output_dir ./results

### Gradio
python app.py --checkpoint_path path/to/model
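
To run the CLI over a folder of images, one option is a small wrapper that calls app_cli.py once per image. This is a minimal sketch, not a script shipped with the repository. It uses only the four flags shown above; the helper names, the list of image extensions, and the choice of one output subdirectory per image are assumptions (the last one is a precaution, since how app_cli.py names its output files is not stated here).

```python
import subprocess
import sys
from pathlib import Path

# Assumed set of input image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def build_cli_command(checkpoint_path, prompt, input_image, output_dir):
    """Build the argument list for one app_cli.py call (flags as in the README)."""
    return [
        sys.executable, "app_cli.py",
        "--checkpoint_path", str(checkpoint_path),
        "--prompt", prompt,
        "--input_image", str(input_image),
        "--output_dir", str(output_dir),
    ]


def run_batch(checkpoint_path, prompt, image_dir, output_root):
    """Call app_cli.py once per image found directly inside image_dir."""
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # One output directory per input image, so results cannot collide.
        out_dir = Path(output_root) / image.stem
        out_dir.mkdir(parents=True, exist_ok=True)
        command = build_cli_command(checkpoint_path, prompt, image, out_dir)
        subprocess.run(command, check=True)  # raises if the CLI exits non-zero
```

For example, `run_batch("path/to/model", "your task instruction", "path/to/images", "./results")`, called from the repository root, would process every image in the folder with the same prompt. Each call reloads the checkpoint, so this favors simplicity over speed.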

VLA Training

Examples of policy training are provided in ./third-party/lerobot.

Acknowledgements

Thanks to metaquery, diffusers, and lerobot for their wonderful open-source codebases.
