Multi-Swin Transformer Based Spatio-Temporal Information Exploration for Compressed Video Quality Enhancement
Yu, Li; Wu, Shiyu; Gabbouj, Moncef (2024)
IEEE Signal Processing Letters
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tuni-202409278966
Description
Peer reviewed
Abstract
Spatio-temporal information plays an important role in compressed video quality enhancement. Most advanced studies use deformable convolution or the Swin transformer to explore this information. However, deformable convolution based methods may produce inaccurate motion compensation because of compression artifacts and limited receptive fields, while Swin transformer based approaches cannot fully explore the spatio-temporal information because of their rigid window-based mechanism. To solve these problems, we propose a novel multi-Swin transformer-based network for compressed video quality enhancement that better explores spatio-temporal information. The workflow consists of the Local Alignment (LA) module, the Global Refinement Fusion (GRF) module, and the Quality Enhancement (QE) module. The LA module roughly perceives local motion through deformable fusion. The GRF module then employs the proposed multi-Swin transformer to enhance spatio-temporal perception. Finally, the QE module restores texture details across various scales. Extensive experimental results demonstrate the effectiveness of the proposed method.
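The three-stage data flow named in the abstract (LA, then GRF, then QE) can be sketched as a simple composition of functions. This is only an illustration of the stage ordering, not the authors' implementation: every function name and every stage body is a hypothetical stand-in. A per-pixel average replaces the deformable fusion, an identity copy replaces the multi-Swin transformer, and a clamp to the 8-bit range replaces the multi-scale texture restoration.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame as rows of pixel values


def local_alignment(frames: List[Frame]) -> Frame:
    """Hypothetical stand-in for the LA module.

    Assumption: a per-pixel average over the input frames replaces the
    deformable fusion described in the abstract.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(f[y][x] for f in frames) / len(frames) for x in range(width)]
        for y in range(height)
    ]


def global_refinement_fusion(fused: Frame) -> Frame:
    """Hypothetical stand-in for the GRF module.

    Assumption: an identity copy replaces the multi-Swin transformer.
    """
    return [row[:] for row in fused]


def quality_enhancement(refined: Frame) -> Frame:
    """Hypothetical stand-in for the QE module.

    Assumption: clamping to [0, 255] replaces multi-scale restoration.
    """
    return [[min(255.0, max(0.0, v)) for v in row] for row in refined]


def enhance(frames: List[Frame]) -> Frame:
    """Run the stages in the order the abstract gives: LA, GRF, QE."""
    return quality_enhancement(global_refinement_fusion(local_alignment(frames)))


if __name__ == "__main__":
    # Three 1x2 frames standing in for a short window of compressed frames.
    clip = [[[0.0, 10.0]], [[2.0, 20.0]], [[4.0, 900.0]]]
    print(enhance(clip))
```

Keeping each stage behind its own function makes the ordering explicit and lets any stand-in be swapped for a real module without touching the others.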
Collections
- TUNICRIS publications [22385]