PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation

Chen Wang¹*, Chuhao Chen¹*, Yiming Huang¹, Zhiyang Dou¹,², Yuan Liu³, Jiatao Gu¹, Lingjie Liu¹
¹University of Pennsylvania, ²HKU, ³HKUST
(*: equal contribution)

NeurIPS, 2025

PhysCtrl achieves controllable and physics-grounded video generation from an initial force.

Abstract

Existing video generation models excel at producing photo-realistic videos from text or images, but often lack physical plausibility and 3D controllability. To overcome these limitations, we introduce PhysCtrl, a novel framework for physics-grounded image-to-video generation with physical parameters and force control. At its core is a generative physics network that learns the distribution of physical dynamics across four materials (elastic, sand, plasticine, and rigid) via a diffusion model conditioned on physics parameters and applied forces. We represent physical dynamics as 3D point trajectories and train on a large-scale synthetic dataset of 550K animations generated by physics simulators. We enhance the diffusion model with a novel spatiotemporal attention block that emulates particle interactions and incorporates physics-based constraints during training to enforce physical plausibility. Experiments show that PhysCtrl generates realistic, physics-grounded motion trajectories which, when used to drive image-to-video models, yield high-fidelity, controllable videos that outperform existing methods in both visual quality and physical plausibility.
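As a rough illustration of how attention over 3D point trajectories might be factorized, the sketch below applies self-attention across points within each frame (a stand-in for particle interactions) and then across frames for each point. This is not the PhysCtrl architecture: the function names, the identity query/key/value projections, the residual layout, and the feature shape (T, N, D) are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(x):
    """Single-head self-attention over the second-to-last axis.

    Uses identity projections (an assumption; a real block would learn
    separate query/key/value weights).
    """
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ x

def spatiotemporal_block(feats):
    """feats: (T, N, D) features for T frames and N points (hypothetical layout)."""
    # Spatial step: within each frame, every point attends to all other points.
    feats = feats + attention(feats)            # batched over T, attends over N
    # Temporal step: every point attends over all frames of its own trajectory.
    per_point = np.swapaxes(feats, 0, 1)        # (N, T, D)
    per_point = per_point + attention(per_point)
    return np.swapaxes(per_point, 0, 1)         # back to (T, N, D)

rng = np.random.default_rng(0)
T, N, D = 8, 16, 4
feats = rng.normal(size=(T, N, D))
out = spatiotemporal_block(feats)
```

One property of this factorization is that reordering the points reorders the output in the same way, which suits an unordered particle set.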

Pipeline


Given a single image, we lift the object in that image into 3D points. We train a diffusion-based trajectory generation model, conditioned on physics parameters and an external force, to generate the motion of these points. The generated trajectories are then used as strong physics-grounded guidance for image-to-video generation.
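The geometry around this pipeline can be sketched as follows: pixels are lifted to 3D points, moved over time, and projected back to 2D tracks that a video model could take as guidance. The pinhole camera model, the intrinsics `K`, the depth values, and the helper names are assumptions, and the constant-acceleration motion is only a placeholder for the learned trajectory model, not a description of it.

```python
import numpy as np

def backproject(pixels, depth, K):
    """Lift (N, 2) pixel coordinates with (N,) depths to (N, 3) camera-space points."""
    uv1 = np.concatenate([pixels, np.ones((len(pixels), 1))], axis=1)
    rays = uv1 @ np.linalg.inv(K).T
    return rays * depth[:, None]

def project(points, K):
    """Project (..., 3) camera-space points to (..., 2) pixel coordinates."""
    uvw = points @ K.T
    return uvw[..., :2] / uvw[..., 2:3]

def placeholder_trajectory(points, force, mass, num_frames):
    """Stand-in for the trajectory generator: uniform translation under a
    constant force (x = x0 + 0.5 * (F / m) * t^2). Purely illustrative."""
    t = np.linspace(0.0, 1.0, num_frames)
    disp = 0.5 * (force / mass) * t[:, None] ** 2    # (T, 3)
    return points[None] + disp[:, None, :]           # (T, N, 3)

# Hypothetical camera intrinsics and object pixels.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
pixels = np.array([[300.0, 200.0], [320.0, 240.0], [350.0, 260.0]])
depth = np.array([2.0, 2.1, 1.9])

T = 5
points = backproject(pixels, depth, K)
force = np.array([0.0, -1.0, 0.0])   # image y points down, so this pushes upward
traj = placeholder_trajectory(points, force, mass=1.0, num_frames=T)
tracks = project(traj, K)            # (T, N, 2) pixel tracks
```

Here `tracks[0]` reproduces the input pixels, and later frames drift toward the top of the image as the upward force acts.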

Force Control

Material Control

Comparison

A pair of wireless headphones rests on a white table before lifting into the air, as if there is an invisible force applied to its handle.
ObjCtrl
DragAnything
CogVideo
Wan2.2
Ours
A yellow plasticine dinosaur toy free falls to the ground due to gravity. It has no deformation before it touches the ground. After it touches the ground, it deforms.
ObjCtrl
DragAnything
CogVideo
Wan2.2
Ours
The penguin is fully lifted upwards and floats into the air with a natural motion, as if there is a force applied onto its left wing. No webbed feet, realistic claws and flippers.
ObjCtrl
DragAnything
CogVideo
Wan2.2
Ours
A black cylindrical pipe lies on a wooden surface before rising and bending at a sharp angle. The transformation is smooth and fluid, as if an invisible upward force is applied in the middle of the pipe.
ObjCtrl
DragAnything
CogVideo
Wan2.2
Ours

BibTeX


@inproceedings{physctrl2025,
  author    = {Chen Wang* and Chuhao Chen* and Yiming Huang and Zhiyang Dou and Yuan Liu and Jiatao Gu and Lingjie Liu},
  title     = {PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation},
  booktitle = {NeurIPS},
  year      = {2025},
}