Publications

Exploiting Optical Flow Guidance for Transformer-Based Video Inpainting

Authors: Kaidong Zhang, Jialun Peng, Jingjing Fu, Dong Liu

Published in IEEE TPAMI, 2024

This paper is a journal extension of FGT. We reformulate the research motivation and propose additional ways to exploit guidance from completed optical flows in transformer-based video inpainting, including a flow-guided feature propagation module and a newly designed temporal deformable MHSA in the temporal transformer block. We also explore frequency-domain supervision for video inpainting. FGT++ achieves markedly better results than FGT and existing video inpainting baselines.
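
As a rough illustration of the frequency-domain supervision idea, the sketch below compares the 2D FFT spectra of predicted and ground-truth frames with an L1 penalty. It is a minimal PyTorch example under our own naming, not the exact loss formulation used in FGT++.

```python
import torch

def frequency_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Illustrative frequency-domain loss: penalize the difference between
    the 2D FFT spectra of predicted and ground-truth frames.
    `pred` and `target` are (B, C, H, W) tensors in [0, 1]."""
    pred_fft = torch.fft.fft2(pred, norm="ortho")
    target_fft = torch.fft.fft2(target, norm="ortho")
    # L1 distance on the complex spectra (magnitude of the difference)
    return (pred_fft - target_fft).abs().mean()
```

Such a loss would be added to the usual pixel-space reconstruction terms, encouraging the network to restore high-frequency texture that spatial losses tend to underweight.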

Recommended citation: Kaidong Zhang, Jialun Peng, Jingjing Fu, Dong Liu, "Exploiting Optical Flow Guidance for Transformer-Based Video Inpainting." IEEE TPAMI, 2024.

Customized Segment Anything Model for Medical Image Segmentation

Authors: Kaidong Zhang, Dong Liu

Published as an arXiv preprint, 2023

We propose SAMed, the first work to adapt the Segment Anything Model (SAM) to medical image semantic segmentation. Considering performance, deployment, and storage overhead comprehensively, we adopt a low-rank adaptation technique to customize a small fraction of the parameters in SAM's image encoder. Together with finetuning of the mask decoder and the prompt encoder, plus a series of training strategies, we achieve highly competitive performance on the Synapse multi-organ segmentation dataset.
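
For readers unfamiliar with low-rank adaptation, the sketch below shows the general pattern of wrapping a frozen linear layer with trainable rank-r factors. The class and hyperparameter names are illustrative and are not taken from the SAMed codebase.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update:
    y = W x + (alpha / r) * B A x, where A and B are rank-r factors."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the low-rank factors are trained
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # start as an identity update
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))
```

In practice such adapters are typically attached to the attention projections of the image encoder, so only a small fraction of parameters needs to be trained and stored per task.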

Recommended citation: Kaidong Zhang, Dong Liu, "Customized Segment Anything Model for Medical Image Segmentation." arXiv preprint, 2023.

Flow-Guided Transformer for Video Inpainting

Authors: Kaidong Zhang, Jingjing Fu, Dong Liu

Published in the Proceedings of the European Conference on Computer Vision (ECCV), 2022

We are the first to integrate the flow-guided philosophy into a transformer for video inpainting, recovering reasonable structure and detailed texture simultaneously. We exploit the local correlation of motion fields and use the motion discrepancy in the completed flows to guide transformer-based content synthesis. We also design an elaborate window-partition scheme and a spatial-temporal decoupled transformer to balance efficiency and performance.
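
To make the window-partition idea concrete, the sketch below shows the standard way of reshaping a feature map into non-overlapping windows so that self-attention can be computed locally within each window. It is a generic PyTorch illustration, assuming spatial sizes divisible by the window size, rather than FGT's exact partition scheme.

```python
import torch

def window_partition(x: torch.Tensor, win: int) -> torch.Tensor:
    """Split a (B, H, W, C) feature map into non-overlapping (win, win)
    windows, returning (B * num_windows, win * win, C) token sequences
    ready for window-local self-attention. Assumes H % win == 0 and
    W % win == 0."""
    B, H, W, C = x.shape
    x = x.view(B, H // win, win, W // win, win, C)
    x = x.permute(0, 1, 3, 2, 4, 5).contiguous()
    return x.view(-1, win * win, C)
```

Restricting attention to windows reduces the quadratic token cost, which is what makes a spatial-temporal decoupled design affordable on video.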

Recommended citation: Kaidong Zhang, Jingjing Fu, Dong Liu, "Flow-Guided Transformer for Video Inpainting." In the proceedings of European Conference on Computer Vision (ECCV), 2022.

Inertia-Guided Flow Completion and Style Fusion for Video Inpainting

Authors: Kaidong Zhang, Jingjing Fu, Dong Liu

Published in the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

We show the benefit of an explicit inertia prior for flow completion, which leads to more accurate flow-guided content propagation for video inpainting. We are also the first to discuss the style incoherence caused by flow warping across different frames, and we propose a style fusion mechanism that refines the style of the warped regions under the guidance of the styles of the valid regions.
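
As a loose illustration of what style fusion could look like, the sketch below re-normalizes features inside the warped region toward the per-channel statistics of the valid region, in the spirit of adaptive instance normalization. The function and its masking convention are our own assumptions, not the paper's exact mechanism.

```python
import torch

def style_fuse(feat: torch.Tensor, mask: torch.Tensor,
               eps: float = 1e-5) -> torch.Tensor:
    """Illustrative style fusion: re-normalize features in the warped
    region (mask == 1) to match the per-channel mean/std of the valid
    region (mask == 0). `feat` is (B, C, H, W); `mask` is (B, 1, H, W)."""
    valid = 1.0 - mask

    def stats(region: torch.Tensor):
        # Masked per-channel mean and std over the spatial dimensions.
        w = region / region.sum(dim=(2, 3), keepdim=True).clamp(min=eps)
        mean = (feat * w).sum(dim=(2, 3), keepdim=True)
        var = ((feat - mean) ** 2 * w).sum(dim=(2, 3), keepdim=True)
        return mean, (var + eps).sqrt()

    mu_v, std_v = stats(valid)
    mu_w, std_w = stats(mask)
    fused = (feat - mu_w) / std_w * std_v + mu_v
    # Keep valid-region features untouched; restyle only the warped region.
    return feat * valid + fused * mask
```

The point of the illustration is that warped content keeps its structure while its first- and second-order statistics are pulled toward those of the surrounding valid content, reducing visible style seams.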

Recommended citation: Kaidong Zhang, Jingjing Fu, Dong Liu, "Inertia-Guided Flow Completion and Style Fusion for Video Inpainting." In the proceedings of Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.