Hyppää sisältöön
    • Suomeksi
    • In English
Trepo
  • Suomeksi
  • In English
  • Kirjaudu
Näytä viite 
  •   Etusivu
  • Trepo
  • TUNICRIS-julkaisut
  • Näytä viite
  •   Etusivu
  • Trepo
  • TUNICRIS-julkaisut
  • Näytä viite
JavaScript is disabled for your browser. Some features of this site may not work without it.

Lightweight Multitask Learning for Robust JND Prediction using Latent Space and Reconstructed Frames

Nami, Sanaz; Pakdaman, Farhad; Hashemi, Mahmoud Reza; Shirmohammadi, Shervin; Gabbouj, Moncef (2024)

 
Avaa tiedosto
Lightweight_Multitask_Learning.pdf (17.90Mt)
Lataukset: 



Nami, Sanaz
Pakdaman, Farhad
Hashemi, Mahmoud Reza
Shirmohammadi, Shervin
Gabbouj, Moncef
2024

IEEE Transactions on Circuits and Systems for Video Technology
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
doi:10.1109/TCSVT.2024.3389988
Näytä kaikki kuvailutiedot
Julkaisun pysyvä osoite on
https://urn.fi/URN:NBN:fi:tuni-202405075489

Kuvaus

Peer reviewed
Tiivistelmä
<p>The Just Noticeable Difference (JND) refers to the smallest distortion in an image or video that can be perceived by Human Visual System (HVS), and is widely used in optimizing image/video compression. However, accurate JND modeling is very challenging due to its content dependence, and the complex nature of the HVS. Recent solutions train deep learning based JND prediction models, mainly based on a Quantization Parameter (QP) value, representing a single JND level, and train separate models to predict each JND level. We point out that a single QP-distance is insufficient to properly train a network with millions of parameters, for a complex content-dependent task. Inspired by recent advances in learned compression and multitask learning, we propose to address this problem by (1) learning to reconstruct the JND-quality frames, jointly with the QP prediction, and (2) jointly learning several JND levels to augment the learning performance. We propose a novel solution where first, an effective feature backbone is trained by learning to reconstruct JND-quality frames from the raw frames. Second, JND prediction models are trained based on features extracted from latent space (i.e., compressed domain), or reconstructed JND-quality frames. Third, a multi-JND model is designed, which jointly learns three JND levels, further reducing the prediction error. Extensive experimental results demonstrate that our multi-JND method outperforms the state-of-the-art and achieves an average JND<sub>1</sub> prediction error of only 1.57 in QP, and 0.72 dB in PSNR. Moreover, the multitask learning approach, and compressed domain prediction facilitate light-weight inference by significantly reducing the complexity and the number of parameters.</p>
Kokoelmat
  • TUNICRIS-julkaisut [20143]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Tietosuoja | Saavutettavuusseloste
 

 

Selaa kokoelmaa

TekijätNimekkeetTiedekunta (2019 -)Tiedekunta (- 2018)Tutkinto-ohjelmat ja opintosuunnatAvainsanatJulkaisuajatKokoelmat

Omat tiedot

Kirjaudu sisäänRekisteröidy
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Tietosuoja | Saavutettavuusseloste