InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior

Zhiheng Liu¹,⁴*, Hao Ouyang²,³*, Qiuyu Wang³, Ka Leong Cheng²,³, Jie Xiao¹,⁴, Kai Zhu⁴, Nan Xue³, Yu Liu⁴, Yujun Shen³, Yang Cao¹†

¹USTC ²HKUST ³Ant Group ⁴Alibaba Group

Gaussian Inpainting

Point Cloud & Mesh

Pipeline

Top: To remove a target from the optimized 3D Gaussians, InFusion first inpaints the RGB image of a single selected view, then applies the proposed diffusion model to inpaint the depth projection of the targeted 3D Gaussians. A progressive scheme handles view-dependent occlusions by drawing on other, unobstructed viewpoints.
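One way to picture how an inpainted RGB image and its completed depth map from a single view can be brought back into the 3D scene is to lift the masked pixels to 3D points. The sketch below is only an illustration under stated assumptions: a pinhole camera with intrinsics `K`, a 4x4 `cam_to_world` pose, and a hypothetical helper name `unproject_masked_depth`. None of these names come from the InFusion code.

```python
import numpy as np

def unproject_masked_depth(depth, rgb, mask, K, cam_to_world):
    """Lift pixels inside `mask` to world-space points with colors.

    depth: (H, W) completed depth, rgb: (H, W, 3) inpainted image,
    mask: (H, W) boolean hole region, K: (3, 3) pinhole intrinsics (assumed),
    cam_to_world: (4, 4) camera pose (assumed convention).
    """
    v, u = np.nonzero(mask)                 # pixel rows (v) and columns (u) in the hole
    z = depth[v, u]
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (u - cx) / fx * z                   # back-project along the camera x axis
    y = (v - cy) / fy * z                   # back-project along the camera y axis
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # homogeneous, (N, 4)
    pts_world = (cam_to_world @ pts_cam.T).T[:, :3]
    colors = rgb[v, u]
    return pts_world, colors
```

The returned points and colors could then serve as initial positions and appearance for new Gaussians in the hole region; how InFusion actually initializes and fine-tunes them is not specified here.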

Bottom: A detailed view of the training pipeline for the depth inpainting U-Net. The U-Net is trained with mask-driven denoising diffusion, and a frozen latent tokenizer encodes the RGB image and depth map that serve as its inputs.
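To make the mask-driven training setup more concrete, the sketch below assembles one training sample for a denoising U-Net. It is a toy illustration, not the released implementation: `toy_encode` is a hypothetical stand-in for the frozen latent tokenizer (plain average pooling), and the channel layout (noisy depth latent, masked depth latent, RGB latent, downsampled mask) is an assumption.

```python
import numpy as np

def toy_encode(x, factor=8):
    """Hypothetical stand-in for the frozen tokenizer: average-pool by `factor`."""
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def build_unet_input(rgb, depth, mask, alpha_bar_t, rng):
    """Return (unet_input, noise_target) for one training sample.

    rgb: (3, H, W), depth: (1, H, W), mask: (1, H, W) with 1 = region to inpaint.
    alpha_bar_t: cumulative noise-schedule value in [0, 1] for the sampled timestep.
    """
    z_depth = toy_encode(depth)                    # clean depth latent
    z_masked = toy_encode(depth * (1.0 - mask))    # depth with the hole zeroed out
    z_rgb = toy_encode(rgb)                        # RGB conditioning latent
    m = toy_encode(mask)                           # mask at latent resolution
    eps = rng.standard_normal(z_depth.shape)       # noise the U-Net should predict
    z_t = np.sqrt(alpha_bar_t) * z_depth + np.sqrt(1.0 - alpha_bar_t) * eps
    unet_input = np.concatenate([z_t, z_masked, z_rgb, m], axis=0)
    return unet_input, eps
```

In a standard noise-prediction setup, the loss would be the mean squared error between the U-Net output on `unet_input` and `eps`, with only the U-Net weights updated while the tokenizer stays frozen.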

InFusion allows users to modify the appearance and texture of targeted areas with ease.

InFusion allows users to project objects into a real three-dimensional scene by editing a single image.
