Compressed Image Super-resolution (CSR) aims to super-resolve compressed images while simultaneously removing the challenging hybrid distortions caused by compression. However, existing CSR works usually focus on a single compression codec, i.e., JPEG, ignoring the diverse traditional and learning-based codecs used in practical applications, e.g., HEVC, VVC, HIFIC, etc. In this work, we propose the first universal CSR framework, dubbed UCIP, with dynamic prompt learning, intended to jointly handle the CSR distortions of any compression codec/mode. In particular, an efficient dynamic prompt strategy is proposed to mine content/spatial-aware, task-adaptive contextual information for the universal CSR task, using only a small number of prompts with spatial size 1x1. To simplify contextual information mining, we introduce a novel MLP-like backbone for UCIP by adapting the Active Token Mixer (ATM) to CSR tasks for the first time, where global information modeling is performed only in the horizontal and vertical directions with offset prediction. We also build an all-in-one benchmark dataset for the CSR task by collecting datasets compressed with 6 popular and diverse traditional and learning-based codecs, including JPEG, HEVC, VVC, HIFIC, etc., resulting in 23 common degradations. Extensive experiments demonstrate the consistent and excellent performance of UCIP on universal CSR tasks.
In this work, we propose the first universal framework with a dynamic prompt strategy and an MLP-like module to tackle challenging CSR tasks. Benefiting from our dynamic prompt-based offset learning, UCIP is capable of encoding optimal content-aware contextual information while maintaining task-aware adaptability via prompt components. For more information, please refer to our paper.
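To give a rough intuition for how a handful of 1x1 prompts can yield spatially varying, content-aware guidance, the following is a minimal sketch and not the UCIP implementation. It assumes that each pixel's feature vector is scored against every prompt by a linear projection, and that the softmax of those scores mixes the prompts at that pixel. The names `dynamic_prompt`, `proj`, and the plain nested-list layout are hypothetical and chosen only for illustration. How the mixed prompt then drives offset prediction in the backbone is described in the paper and is not modeled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_prompt(features, prompts, proj):
    """Mix N prompts of spatial size 1x1 into an H x W x C prompt map.

    features: H x W x C nested lists (per-pixel feature vectors).
    prompts:  N x C nested lists (each prompt is a single C-dim vector).
    proj:     N x C nested lists, hypothetical linear scoring weights
              (an assumption of this sketch, not taken from the paper).
    """
    num_prompts = len(prompts)
    output = []
    for row in features:
        out_row = []
        for pixel in row:
            # Content-aware score of this pixel against each prompt.
            scores = [sum(w * x for w, x in zip(weights, pixel))
                      for weights in proj]
            mix = softmax(scores)
            # Weighted sum of the 1x1 prompts at this spatial location.
            mixed = [sum(mix[n] * prompts[n][c] for n in range(num_prompts))
                     for c in range(len(pixel))]
            out_row.append(mixed)
        output.append(out_row)
    return output
```

Because the mixing weights depend on the local features, two pixels with different content receive different prompts even though only `N` prompt vectors are stored, which is why the prompts themselves can stay at spatial size 1x1.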
We build an all-in-one benchmark dataset for the compressed image super-resolution (CSR) task by collecting datasets compressed with 6 popular and diverse codecs: the traditional codecs JPEG, HM, and VTM, and the learning-based codecs \( C_{\text{PSNR}} \), \( C_{\text{SSIM}} \), and HIFIC, resulting in 23 common degradations. The detailed quality factors (QF), quantization parameters (QP), and compression modes (Mode) are listed below (from left to right: poorer quality -> better quality):
@inproceedings{li2024ucip,
title={UCIP: A Universal Framework for Compressed Image Super-Resolution using Dynamic Prompt},
author={Li, Xin and Li, Bingchen and Jin, Yeying and Lan, Cuiling and Zhu, Hanxin and Ren, Yulin and Chen, Zhibo},
booktitle={European Conference on Computer Vision},
year={2024},
organization={Springer}
}