Zhiheng Liu
I am a first-year Ph.D. student at the Department of Computer Science, The University of Hong Kong (HKU), advised by Prof. Ping Luo.
Before that, I obtained my master's degree from the University of Science and Technology of China (USTC),
advised by Prof. Yang Cao.
My research interests lie in AIGC, currently focusing on video and 3D content generation and editing. I aim to create explorable, interactive, immersive, and detailed worlds in a generative manner.
I am always open to research discussions and collaborations; feel free to get in touch!
Email / Google Scholar / Github
News
- [12. 2024] We release DepthLab, a robust depth inpainting foundation model that can be applied to various downstream tasks to enhance performance.
- [12. 2024] We release The Matrix, a foundation world model for generating infinite-length, hyper-realistic videos with real-time, frame-level control.
- [7. 2024] LivePhoto accepted to ECCV 2024.
- [5. 2024] CCM accepted to ICML 2024.
- [4. 2024] We release InFusion for 3D inpainting via diffusion prior.
- [3. 2024] DreamVideo accepted to CVPR 2024.
- [1. 2024] DreamClean accepted to ICLR 2024.
- [12. 2023] This page is online. Discussions and collaborations are welcome.
Selected Publications
(*: Equal contribution)
MangaNinja: Line Art Colorization with Precise Reference Following
Zhiheng Liu*, Ka Leong Cheng*, Xi Chen, Jie Xiao, Hao Ouyang, Kai Zhu, Yu Liu, Yujun Shen, Qifeng Chen, Ping Luo
arXiv, 2024
pdf / page / code
MangaNinja is a reference-based line art colorization method that enables precise matching and fine-grained interactive control.
DepthLab: From Partial to Complete
Zhiheng Liu*, Ka Leong Cheng*, Qiuyu Wang, Shuzhe Wang, Hao Ouyang, Bin Tan, Kai Zhu, Yujun Shen, Qifeng Chen, Ping Luo
arXiv, 2024
pdf / page / code
We propose a robust depth inpainting foundation model that can be applied to various downstream tasks to enhance performance.
InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior
Zhiheng Liu*, Hao Ouyang*, Qiuyu Wang, Ka Leong Cheng, Jie Xiao, Kai Zhu, Nan Xue, Yu Liu, Yujun Shen, Yang Cao
arXiv, 2024
pdf / page / code
We present an image-conditioned depth inpainting model that leverages diffusion priors to inpaint 3D Gaussians with strong geometric and texture consistency.
LivePhoto: Real Image Animation with Text-guided Motion Control
Xi Chen, Zhiheng Liu, Mengting Chen, Yutong Feng, Yu Liu, Yujun Shen, Hengshuang Zhao
ECCV, 2024
pdf / page
We present LivePhoto, a real-image animation method with text control. Unlike previous works, LivePhoto truly follows the text instructions and faithfully preserves the object identity.
Cones 2: Customizable Image Synthesis with Multiple Subjects
Zhiheng Liu*, Yifei Zhang*, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
NeurIPS, 2023
pdf / page
Cones 2 uses a simple yet effective representation to register a subject. The storage space required for each subject is approximately 5 KB. Moreover, Cones 2 allows for the flexible composition of various subjects without any model tuning.
Cones: Concept Neurons in Diffusion Models for Customized Generation
Zhiheng Liu*, Ruili Feng*, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
ICML, 2023 (Oral)
pdf / code
We explore subject-specific concept neurons in a pre-trained text-to-image diffusion model. Concatenating multiple clusters of concept neurons representing different persons, objects, and backgrounds flexibly generates all related concepts in a single image.