Adaptive Luma Range Optimization in Visual Coding for Machines
Partanen, Tero; Mercat, Alexandre; Vanne, Jarno; Hannuksela, Miska M.; Zhang, Honglei; Aminlou, Alireza; Cricri, Francesco (2025)
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202603093040
Description
Peer reviewed
Abstract
The growing prevalence of machine-driven visual data consumption underscores the need to meet the unique requirements of Video Coding for Machines (VCM). In this paper, we propose to enhance the coding efficiency of Versatile Video Coding (VVC) for machine consumption by adaptively adjusting the dynamic range of the input luma channel prior to encoding. The visual input is characterized by an input analyzer, introduced in this work, that predicts the optimal dynamic range and provides the corresponding 1) luma down-scaling factors applied before encoding and 2) luma up-scaling factors applied after decoding to restore the dynamic range. The input analyzer is implemented as a lightweight neural network. To train the network, we introduce a framework incorporating a codec proxy module that enables end-to-end optimization by simulating a conventional, non-differentiable video codec. The proposed method has been evaluated as part of the conventional VVC pipeline, where the VVC test model (VTM) is used for encoding and decoding. Our experimental results show that integrating the proposed solution into the pipeline improves coding efficiency by up to 28.0% on image datasets and up to 45.4% on a video dataset for object detection tasks.
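The pre- and post-processing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scaling formula, so the code assumes a single scalar factor applied linearly to an 8-bit luma plane, and the function names `downscale_luma` and `upscale_luma` are hypothetical. In the paper, the factors are predicted by the input analyzer network; here the factor is simply passed in by hand.

```python
# Illustrative sketch only. Assumptions (not stated in the abstract):
#   - one scalar factor in (0, 1] per picture
#   - plain linear scaling with rounding and clipping
#   - luma plane given as a list of rows of integer samples

def _clip_round(value, max_val):
    """Round a non-negative value half up and clip it to [0, max_val]."""
    return min(max_val, max(0, int(value + 0.5)))

def downscale_luma(luma, factor, bit_depth=8):
    """Compress the luma dynamic range before encoding (hypothetical helper)."""
    if not 0.0 < factor <= 1.0:
        raise ValueError("factor must be in (0, 1]")
    max_val = (1 << bit_depth) - 1
    return [[_clip_round(v * factor, max_val) for v in row] for row in luma]

def upscale_luma(luma, factor, bit_depth=8):
    """Restore the luma dynamic range after decoding (hypothetical helper)."""
    if not 0.0 < factor <= 1.0:
        raise ValueError("factor must be in (0, 1]")
    max_val = (1 << bit_depth) - 1
    return [[_clip_round(v / factor, max_val) for v in row] for row in luma]

if __name__ == "__main__":
    frame = [[0, 64, 128], [192, 255, 16]]   # toy 2x3 luma plane
    factor = 0.5                             # would come from the input analyzer
    reduced = downscale_luma(frame, factor)  # this is what the encoder would see
    restored = upscale_luma(reduced, factor) # applied to the decoder output
    print(reduced)
    print(restored)
```

In a real pipeline, the codec (VTM in the paper) sits between the two calls, so the restored plane also carries compression distortion; the sketch shows only the range adjustment itself.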
Collections
- TUNICRIS publications [24611]
