Hyppää sisältöön
    • Suomeksi
    • In English
Trepo
  • Suomeksi
  • In English
  • Kirjaudu
Näytä viite 
  •   Etusivu
  • Trepo
  • TUNICRIS-julkaisut
  • Näytä viite
  •   Etusivu
  • Trepo
  • TUNICRIS-julkaisut
  • Näytä viite
JavaScript is disabled for your browser. Some features of this site may not work without it.

NN-VVC: A Hybrid Learned-Conventional Video Codec Targeting Humans and Machines

Ahonen, Jukka I.; Le, Nam; Zhang, Honglei; Hallapuro, Antti; Cricri, Francesco; Tavakoli, Hamed Rezazadegan; Hannuksela, Miska M.; Rahtu, Esa (2024)

 
Avaa tiedosto
IJSC_NN-VVC_hybridVideoCodec_clean.pdf (20.45Mt)
Lataukset: 



Ahonen, Jukka I.
Le, Nam
Zhang, Honglei
Hallapuro, Antti
Cricri, Francesco
Tavakoli, Hamed Rezazadegan
Hannuksela, Miska M.
Rahtu, Esa
2024

INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
doi:10.1142/S1793351X24450041
Näytä kaikki kuvailutiedot
Julkaisun pysyvä osoite on
https://urn.fi/URN:NBN:fi:tuni-202504023196

Kuvaus

Peer reviewed
Tiivistelmä
Advancements in artificial intelligence have significantly increased the use of images and videos in machine analysis algorithms, predominantly neural networks. However, the traditional methods of compressing, storing and transmitting media have been optimized for human viewers rather than machines. Current research in coding images and videos for machine analysis has evolved in two distinct paths. The first is characterized by End-to-End (E2E) learned codes, which show promising results in image coding but have yet to match the performance of leading Conventional Video Codecs (CVC) and suffer from a lack of interoperability. The second path optimizes CVC, such as the Versatile Video Coding (VVC) standard, for machine-oriented reconstruction. Although CVC-based approaches enjoy widespread hardware and software compatibility and interoperability, they often fall short in machine task performance, especially at lower bitrates. This paper proposes a novel hybrid codec for machines named NN-VVC, which combines the advantages of an E2E-learned image codec and a CVC to achieve high performance in both image and video coding for machines. Our experiments show that the proposed system achieved up to - 43.20% and - 26.8% Bjøntegaard Delta rate reduction over VVC for image and video data, respectively, when evaluated on multiple different datasets and machine vision tasks according to the common test conditions designed by the VCM study group in MPEG standardization activities. Furthermore, to improve reconstruction quality, we introduce a human-focused branch into our codec, enhancing the visual appeal of reconstructions intended for human supervision of the machine-oriented main branch.
Kokoelmat
  • TUNICRIS-julkaisut [22206]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Tietosuoja | Saavutettavuusseloste
 

 

Selaa kokoelmaa

TekijätNimekkeetTiedekunta (2019 -)Tiedekunta (- 2018)Tutkinto-ohjelmat ja opintosuunnatAvainsanatJulkaisuajatKokoelmat

Omat tiedot

Kirjaudu sisäänRekisteröidy
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Tietosuoja | Saavutettavuusseloste