Zhiheng Liu

I am a master's student at the University of Science and Technology of China (USTC), supervised by Yang Cao.

My research interests lie in AIGC, currently focusing on the generation and editing of video and 3D content.

I am always open to research discussions and collaborations; feel free to get in touch!

profile photo

I weighed about 70 kilograms when this photo was taken, but I'm now at about 85 kilograms and currently on a diet. :)

News
  • [3. 2024] DreamVideo accepted to CVPR 2024.
  • [1. 2024] DreamClean accepted to ICLR 2024.
  • [12. 2023] This page is online. Discussions and collaborations are welcome.
Publications

(*: Equal contribution)

Adding Conditional Controls to Text-to-Image Consistency Models
Jie Xiao, Kai Zhu, Han Zhang, Zhiheng Liu, Yujun Shen, Yu Liu, Xueyang Fu, Zheng-Jun Zha
arXiv, 2023
pdf / page

We consider alternative strategies for adding ControlNet-like conditional control to consistency models (CMs) and present three significant findings. 1) A ControlNet trained for diffusion models (DMs) can be directly applied to CMs for high-level semantic control, but struggles with low-level detail and realism control. 2) CMs serve as an independent class of generative models, on which ControlNet can be trained from scratch using the Consistency Training proposed by Song et al. 3) A lightweight adapter can be jointly optimized under multiple conditions through Consistency Training, allowing for the swift transfer of a DM-based ControlNet to CMs.

DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Yujie Wei, Shiwei Zhang, Zhiwu Qing, Hangjie Yuan, Zhiheng Liu, Yu Liu, Yingya Zhang, Jingren Zhou, Hongming Shan
CVPR, 2024
pdf / page

This work presents DreamVideo, a method that customizes both subject identity and motion pattern to generate desired videos with various context descriptions.

LivePhoto: Real Image Animation with Text-guided Motion Control
Xi Chen, Zhiheng Liu, Mengting Chen, Yutong Feng, Yu Liu, Yujun Shen, Hengshuang Zhao
arXiv, 2023
pdf / page

We present LivePhoto. Besides adequately decoding motion descriptions such as actions and camera movements (row 1), LivePhoto can also conjure new content from thin air (row 2). Meanwhile, LivePhoto is highly controllable, allowing users to customize the animation by inputting various texts (row 3) and adjusting the degree of motion intensity (row 4).

Cones 2: Customizable Image Synthesis with Multiple Subjects
Zhiheng Liu*, Yifei Zhang*, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
NeurIPS, 2023
pdf / page

Cones 2 uses a simple yet effective representation to register a subject. The storage space required for each subject is approximately 5 KB. Moreover, Cones 2 allows for the flexible composition of various subjects without any model tuning.

Cones: Concept Neurons in Diffusion Models for Customized Generation
Zhiheng Liu*, Ruili Feng*, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
ICML, 2023 Oral
pdf / code

We explore the subject-specific concept neurons in a pre-trained text-to-image diffusion model. Concatenating multiple clusters of concept neurons representing different persons, objects, and backgrounds can flexibly generate all related concepts in a single image.


Design and source code from Jon Barron's website