Monocular Real-to-Sim Scene Programming
Programming Interactive Scenes from Monocular Images for Embodied Simulation
NeoWorld-Pro transforms a single RGB image into executable, simulation-ready interactive scenes with programmable geometry, articulation, physical properties, and scene layout.
Shanghai Jiao Tong University
Interactive Demo
Inspect generated assets and actuate their articulated parts in the browser.
We are actively adding more demos.
Dataset and Benchmark
A multi-object simulation benchmark for physically executable scenes.
NeoWorld-Pro is evaluated on PartNet-Mobility and a newly constructed synthetic scene benchmark designed to stress-test monocular reconstruction in physically realistic multi-object environments. The benchmark tasks cover placement, assembly, articulation, and interaction with task-relevant affordances.
Across 84 downstream manipulation tasks, NeoWorld-Pro achieves a 92.85% task success rate.
Results
Closed-loop programming improves reconstruction, articulation, and downstream interaction.
Object-Level Appearance and Geometry
Column groups: Appearance Evaluation (Image) covers SSIM, LPIPS, FID, and KID×100; Similarity covers CLIP and Uni3D; Geometry Evaluation covers CD×10, F@0.01, F@0.05, and F@0.1.

| Method | SSIM↑ | LPIPS↓ | FID↓ | KID×100↓ | CLIP↑ | Uni3D↑ | CD×10↓ | F@0.01↑ | F@0.05↑ | F@0.1↑ |
|---|---|---|---|---|---|---|---|---|---|---|
| Articulate-Anything | 0.7646 | 0.2930 | 142.46 | 2.3116 | 0.7056 | 0.2218 | 0.2588 | 12.94 | 49.14 | 70.69 |
| PhysX-Anything | 0.7657 | 0.2991 | 93.42 | 0.4032 | 0.7937 | 0.1921 | 0.3744 | 10.14 | 41.16 | 62.65 |
| NeoWorld-Pro | 0.8398 | 0.1864 | 72.59 | 0.2878 | 0.8125 | 0.3522 | 0.2065 | 25.77 | 58.31 | 75.47 |
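
For readers unfamiliar with the geometry columns, the sketch below shows one common way to compute a Chamfer distance (CD) and an F-score at a distance threshold (F@τ) between two point clouds. This page does not state the exact convention used for the table (squared versus unsquared distances, sum versus mean, number of sampled points, or how shapes are normalized), so the function names and every choice here are assumptions for illustration only, not the evaluation code behind the reported numbers.

```python
import math

def nearest_distances(src, dst):
    """For each point in src, the Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]

def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance (assumed convention: unsquared distances,
    averaged per direction, then summed over both directions)."""
    d_pred = nearest_distances(pred, gt)
    d_gt = nearest_distances(gt, pred)
    return sum(d_pred) / len(d_pred) + sum(d_gt) / len(d_gt)

def f_score(pred, gt, tau):
    """F-score at threshold tau: harmonic mean of precision and recall, where a
    point counts as matched if its nearest neighbor is closer than tau."""
    d_pred = nearest_distances(pred, gt)
    d_gt = nearest_distances(gt, pred)
    precision = sum(d < tau for d in d_pred) / len(d_pred)
    recall = sum(d < tau for d in d_gt) / len(d_gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    # Toy point clouds (hypothetical data): one predicted point is off by 0.2.
    pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.2)]
    print("CD:", chamfer_distance(pred, gt))
    for tau in (0.01, 0.05, 0.1):
        print(f"F@{tau}:", f_score(pred, gt, tau))
```

The brute-force nearest-neighbor search is quadratic in the number of points, which is fine for a toy example; a spatial index would be the usual choice for dense samples. The function returns F-scores as fractions in [0, 1], whereas the table values appear to be on a 0 to 100 scale.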
Articulation and Kinematics
| Method | Total | #Pred | #Hit | Miss↓ | Axis↓ | Pivot↓ | Type↓ |
|---|---|---|---|---|---|---|---|
| Articulate-Anything | 256 | 175 | 153 | 40% | 47.70 | 1.54 | 23.62% |
| PhysX-Anything | 256 | 188 | 136 | 47% | 26.91 | 1.49 | 26.17% |
| NeoWorld-Pro | 256 | 305 | 238 | 7% | 16.50 | 1.09 | 5.63% |
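
The Miss column is not defined on this page, but the reported percentages are consistent with the fraction of the 256 ground-truth joints that no prediction hit, that is, (Total − #Hit) / Total. The short check below recomputes that quantity from the counts in the table; the formula and the `miss_rate` helper are an inferred reading, not a definition confirmed by the authors.

```python
# Counts copied from the Articulation and Kinematics table.
ROWS = {
    "Articulate-Anything": {"total": 256, "hit": 153},
    "PhysX-Anything": {"total": 256, "hit": 136},
    "NeoWorld-Pro": {"total": 256, "hit": 238},
}

def miss_rate(total, hit):
    """Assumed definition: share of ground-truth joints with no matching prediction."""
    return (total - hit) / total

if __name__ == "__main__":
    for method, counts in ROWS.items():
        pct = 100 * miss_rate(counts["total"], counts["hit"])
        print(f"{method}: {pct:.1f}%")
```

Rounded to whole percentages, the three values come out to 40%, 47%, and 7%, matching the table.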
Citation
BibTeX
@misc{he2026NeoWorld-Pro,
title = {NeoWorld-Pro: Programming Interactive Scenes from Monocular Images for Embodied Simulation},
author = {He, Yumeng and Song, Yichen and Yang, Xiaotian and Zhang, Weijia and Zhou, Zanwei and Gong, Junru and Yang, Xiaokang and Wang, Yunbo},
year = {2026},
note = {Project page}
}