Top: To remove a target from the optimized 3D Gaussians, InFusion first inpaints the RGB image of a selected single view and then applies the proposed depth-inpainting diffusion model to the depth projected from the targeted 3D Gaussians. A progressive scheme handles view-dependent occlusions by drawing on other, unobstructed viewpoints.
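The top row can be read as a simple control flow. Below is a minimal sketch of that single-view loop, assuming hypothetical callables `render_rgb`, `render_depth`, `rgb_inpainter`, `depth_inpainter`, and `unproject` that stand in for the corresponding components; these names are illustrative and do not come from the released code.

```python
def inpaint_gaussians_one_view(gaussians, camera, mask,
                               render_rgb, render_depth,
                               rgb_inpainter, depth_inpainter,
                               unproject):
    """Single-view removal/inpainting loop (hypothetical interfaces).

    mask: boolean (H, W) array, True where the removed target used to be.
    """
    # 1. Project the optimized 3D Gaussians into the chosen view.
    rgb = render_rgb(gaussians, camera)      # (3, H, W) rendering
    depth = render_depth(gaussians, camera)  # (1, H, W) depth with a hole under `mask`

    # 2. Inpaint the RGB image of this single view (e.g. with any 2D inpainter).
    rgb_filled = rgb_inpainter(rgb, mask)

    # 3. Complete the projected depth with the depth-inpainting diffusion model,
    #    conditioned on the inpainted RGB image and the mask.
    depth_filled = depth_inpainter(rgb_filled, depth, mask)

    # 4. Un-project the filled pixels into 3D points that seed new Gaussians,
    #    which are then merged back into the scene.
    new_gaussians = unproject(rgb_filled, depth_filled, mask, camera)
    return new_gaussians
```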
Bottom: A detailed view of the training pipeline for the depth-inpainting U-Net. The U-Net is trained with a mask-driven denoising diffusion objective and relies on a frozen latent tokenizer that takes the RGB image and the depth map as inputs.
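For the bottom row, one mask-driven denoising training step can be sketched as follows. The interfaces are assumptions for illustration only: a frozen latent tokenizer exposed as `vae.encode`, a noise scheduler with `add_noise`, and an epsilon-predicting `unet(x, t)`; the exact conditioning layout here is not the released implementation.

```python
import torch
import torch.nn.functional as F

def depth_inpainting_train_step(unet, vae, scheduler, optimizer,
                                rgb, depth, mask, num_train_timesteps=1000):
    """rgb: (B,3,H,W), depth: (B,1,H,W), mask: (B,1,H,W) float, 1 = region to fill."""
    with torch.no_grad():
        # Frozen latent tokenizer: encode the RGB image, the clean depth map, and
        # the masked (incomplete) depth map into the latent space.
        rgb_lat = vae.encode(rgb)
        depth_lat = vae.encode(depth.repeat(1, 3, 1, 1))
        masked_depth_lat = vae.encode((depth * (1 - mask)).repeat(1, 3, 1, 1))

    # Sample a diffusion timestep and add noise to the clean depth latents.
    b = depth_lat.shape[0]
    t = torch.randint(0, num_train_timesteps, (b,), device=depth_lat.device)
    noise = torch.randn_like(depth_lat)
    noisy_depth_lat = scheduler.add_noise(depth_lat, noise, t)

    # Downsample the mask to latent resolution and concatenate all conditions.
    mask_lat = F.interpolate(mask, size=depth_lat.shape[-2:])
    unet_in = torch.cat([noisy_depth_lat, masked_depth_lat, mask_lat, rgb_lat], dim=1)

    # Standard epsilon-prediction objective; only the U-Net receives gradients.
    pred = unet(unet_in, t)
    loss = F.mse_loss(pred, noise)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```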
@article{liu2024infusion,
title={InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior},
author={Liu, Zhiheng and Ouyang, Hao and Wang, Qiuyu and Cheng, Ka Leong and Xiao, Jie and Zhu, Kai and Xue, Nan and Liu, Yu and Shen, Yujun and Cao, Yang},
journal={arXiv preprint arXiv:2404.11613},
year={2024}
}