
This research proposes an end-to-end pipeline that converts multi-view images into photorealistic 3D Gaussian Splatting (3DGS) reconstructions and derives object-centric, physics-enabled virtual reality (VR) scenes that support real-time rigid-body interaction. The pipeline uses semantically prompted 2D instance masks to segment and extract per-object Gaussian assets, completes occluded regions, and generates mesh-based proxies for rigid-body physics simulation. At runtime, rigid-body pose synchronization couples each physics proxy with its corresponding 3DGS object, ensuring consistent visual rendering and physically plausible interaction. The work aims to provide a practical framework and evaluation methodology for bridging neural-rendering-quality scene representations with interactive physics simulation in immersive environments.
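The per-frame pose synchronization can be illustrated with a minimal sketch. The function and variable names below are hypothetical, not from the proposal; it assumes 3DGS orientations are stored as unit quaternions and that the physics engine reports each proxy's pose as a position plus an (x, y, z, w) quaternion, as PyBullet's `getBasePositionAndOrientation` does. Gaussian scales are unchanged because the transform is rigid.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R


def sync_gaussians_to_proxy(means_local, quats_local, proxy_pos, proxy_quat):
    """Apply a physics proxy's rigid-body pose to its 3DGS object's Gaussians.

    means_local : (N, 3) Gaussian centers in the object's local frame
    quats_local : (N, 4) per-Gaussian orientations (x, y, z, w), local frame
    proxy_pos   : (3,)   proxy position reported by the physics engine
    proxy_quat  : (4,)   proxy orientation (x, y, z, w) from the physics engine
    """
    rot = R.from_quat(proxy_quat)
    # Rigidly transform Gaussian centers into world space.
    means_world = rot.apply(means_local) + proxy_pos
    # Compose the proxy rotation with each Gaussian's local orientation so
    # anisotropic covariances stay aligned with the moving object.
    quats_world = (rot * R.from_quat(quats_local)).as_quat()
    return means_world, quats_world


if __name__ == "__main__":
    # Toy check: one Gaussian at the local origin, proxy rotated 90 deg about z
    # and translated along x; the Gaussian should land at the proxy position.
    means = np.zeros((1, 3))
    quats = np.array([[0.0, 0.0, 0.0, 1.0]])  # identity orientation
    pos = np.array([1.0, 0.0, 0.0])
    orn = R.from_euler("z", 90, degrees=True).as_quat()
    print(sync_gaussians_to_proxy(means, quats, pos, orn))
```

Calling this once per simulation step for each proxy-object pair keeps the rendered splats in lockstep with the physics state, which is the coupling the abstract describes.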
