Zhiheng Liu

I am a master's student at the University of Science and Technology of China (USTC), supervised by Yang Cao.

My research interests lie in AIGC, currently focusing on the generation and editing of video and 3D content.

I am always open to research discussions and collaborations; feel free to get in touch!

profile photo

I weighed about 70 kilograms when this photo was taken, but I'm now at about 85 kilograms and currently on a diet. :)

News
  • [3. 2024] DreamVideo accepted to CVPR 2024.
  • [1. 2024] DreamClean accepted to ICLR 2024.
  • [12. 2023] This page is online. Discussions and collaborations are welcome.
Publications

(*: Equal contribution)

Adding Conditional Controls to Text-to-Image Consistency Models
Jie Xiao, Kai Zhu, Han Zhang, Zhiheng Liu, Yujun Shen, Yu Liu, Xueyang Fu, Zheng-Jun Zha
arXiv, 2023
pdf / page

We consider alternative strategies for adding ControlNet-like conditional control to consistency models (CMs) and present three significant findings. 1) A ControlNet trained for diffusion models (DMs) can be directly applied to CMs for high-level semantic control, but struggles with low-level detail and realism control. 2) CMs serve as an independent class of generative models, on which ControlNet can be trained from scratch using the Consistency Training proposed by Song et al. 3) A lightweight adapter can be jointly optimized under multiple conditions through Consistency Training, allowing for the swift transfer of a DM-based ControlNet to CMs.

DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Yujie Wei, Shiwei Zhang, Zhiwu Qing, Hangjie Yuan, Zhiheng Liu, Yu Liu, Yingya Zhang, Jingren Zhou, Hongming Shan
CVPR, 2024
pdf / page

This work presents DreamVideo, a method that customizes both subject identity and motion pattern to generate desired videos with various context descriptions.

LivePhoto: Real Image Animation with Text-guided Motion Control
Xi Chen, Zhiheng Liu, Mengting Chen, Yutong Feng, Yu Liu, Yujun Shen, Hengshuang Zhao
arXiv, 2023
pdf / page

We present LivePhoto. Besides adequately decoding motion descriptions such as actions and camera movements (row 1), LivePhoto can also conjure new content from thin air (row 2). Meanwhile, LivePhoto is highly controllable, allowing users to customize the animation by inputting various texts (row 3) and adjusting the degree of motion intensity (row 4).

Cones 2: Customizable Image Synthesis with Multiple Subjects
Zhiheng Liu*, Yifei Zhang*, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
NeurIPS, 2023
pdf / page

Cones 2 uses a simple yet effective representation to register a subject. The storage space required for each subject is approximately 5 KB. Moreover, Cones 2 allows for the flexible composition of various subjects without any model tuning.

Cones: Concept Neurons in Diffusion Models for Customized Generation
Zhiheng Liu*, Ruili Feng*, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
ICML, 2023 Oral
pdf / code

We explore the subject-specific concept neurons in a pre-trained text-to-image diffusion model. Concatenating multiple clusters of concept neurons representing different persons, objects, and backgrounds can flexibly generate all related concepts in a single image.


Design and source code from Jon Barron's website