Trepo

Region of Interest Enabled Learned Image Coding for Machines

Ahonen, Jukka I.; Le, Nam; Zhang, Honglei; Cricri, Francesco; Rahtu, Esa (2023)

 

This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
doi:10.1109/MMSP59012.2023.10337731
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tuni-202504023193

Description

Peer reviewed
Abstract
Image and video coding for machines has recently attracted growing interest from both industry and the research community. One successful approach is based on end-to-end (E2E) learned compression, which has shown significant gains over state-of-the-art conventional image coding methods. However, a remaining challenge for E2E-learned image codecs for machines is to allocate bits adaptively across different regions of the image while retaining machine vision performance. In this paper, we propose a method that leverages Regions-Of-Interest (ROIs) for bitrate allocation within a Learned Image Codec (LIC) for machines. In particular, the proposed method reduces the bits allocated to the background regions of the image by reducing the variance of the latent-representation elements that correspond to those regions. This results in more heavily quantized background areas, while keeping the quality of the ROI areas suitable for machine tasks. The proposed method achieves significant gains over the baseline LIC, with Pareto BD-rate reductions of -15.80% on object detection and -22.43% on instance segmentation. To the best of our knowledge, this is the first research paper proposing an ROI-based inference-time technique for Learned Image Coding for machines.
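The variance-reduction idea in the abstract can be illustrated with a toy sketch. The abstract does not say how the variance is reduced, so everything here is an assumption made for illustration, not the authors' implementation: the function names `shrink_background` and `quantize`, the `scale` parameter, the single-channel 2D latent, and the choice to pull background elements toward their mean before rounding.

```python
from statistics import pvariance


def shrink_background(latent, roi_mask, scale=0.25):
    """Pull background latent elements toward the background mean.

    latent   : 2D list of floats, one latent channel (hypothetical layout)
    roi_mask : 2D list of bools, True where the element belongs to an ROI
    scale    : assumed shrink factor in [0, 1]; 1.0 leaves the latent unchanged
    """
    background = [v for row, mrow in zip(latent, roi_mask)
                  for v, m in zip(row, mrow) if not m]
    if not background:
        # Nothing to shrink: return a copy of the input.
        return [row[:] for row in latent]
    mean = sum(background) / len(background)
    # ROI elements pass through; background deviations are scaled down,
    # which multiplies the background variance by scale ** 2.
    return [[v if m else mean + scale * (v - mean)
             for v, m in zip(row, mrow)]
            for row, mrow in zip(latent, roi_mask)]


def quantize(latent):
    """Round every element to the nearest integer (stand-in for the codec's quantizer)."""
    return [[round(v) for v in row] for row in latent]


def background_values(latent, roi_mask):
    """Collect the elements that lie outside the ROI mask."""
    return [v for row, mrow in zip(latent, roi_mask)
            for v, m in zip(row, mrow) if not m]


if __name__ == "__main__":
    latent = [[4.0, -3.0, 2.0], [-2.0, 5.0, -4.0]]
    roi_mask = [[True, False, False], [False, True, False]]
    shrunk = shrink_background(latent, roi_mask)
    print("background variance before:", pvariance(background_values(latent, roi_mask)))
    print("background variance after: ", pvariance(background_values(shrunk, roi_mask)))
    print("quantized:", quantize(shrunk))
```

In this toy setup the background elements collapse onto fewer distinct quantized symbols, which an entropy coder could represent with fewer bits, while the ROI elements are left untouched.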
Collections
  • TUNICRIS publications [22191]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Privacy | Accessibility statement
 

 
