Task Enhancement Tiles for Ultra Lightweight Post-processing in Visual Coding for Machines
Partanen, Tero; Marie, Alban; Kortelahti, Rudolf; Mercat, Alexandre; Vanne, Jarno; Hannuksela, Miska M.; Zhang, Honglei; Aminlou, Alireza; Cricri, Francesco (2025)
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202603103097
Description
Peer reviewed
Abstract
The proliferation of automated visual analysis calls for compression methods tailored to the unique requirements of Video Coding for Machines (VCM). In this paper, we propose a computationally lightweight post-processing method that is based on a learned component referred to as a task enhancement tile (TET). A TET is spatially tiled over the reconstructed visual data and added to it element-wise. It only requires one addition per pixel in each color channel before the machine task can be applied. Our results with the VVC test model (VTM) demonstrate coding gains of up to 39.0% for object detection and 29.2% for instance segmentation on image datasets, while evaluation on a video dataset shows gains of up to 35.2% for object detection, relative to the VTM anchor. The proposed solution also offers extremely low computational cost, preservation of human-viewable content, full compliance with video coding standards, no requirement for side information transmission from encoder to decoder, and generalization across tasks, models, and encoding parameters.
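The tiling-and-addition step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `apply_tet`, the height x width x channel array layout, the top-left tile alignment, the cropping of partial tiles at the right and bottom edges, and the float32 output are all assumptions made for illustration. The abstract does not specify the tile size or how the TET is learned.

```python
import numpy as np


def apply_tet(image: np.ndarray, tet: np.ndarray) -> np.ndarray:
    """Tile `tet` over `image` and add it element-wise (illustrative only).

    image: reconstructed picture, shape (H, W, C)   -- assumed layout
    tet:   task enhancement tile, shape (h, w, C)   -- assumed layout
    """
    height, width, channels = image.shape
    tile_h, tile_w, tile_c = tet.shape
    if tile_c != channels:
        raise ValueError("TET and image must have the same number of channels")

    # Repetitions needed to cover the picture (ceiling division).
    reps_y = -(-height // tile_h)
    reps_x = -(-width // tile_w)

    # Repeat the tile from the top-left corner, then crop to the picture size.
    # Alignment and cropping are assumptions, not taken from the paper.
    tiled = np.tile(tet, (reps_y, reps_x, 1))[:height, :width, :]

    # One addition per pixel in each color channel. The result is returned
    # as float32 so that no clipping or rounding policy has to be assumed.
    return image.astype(np.float32) + tiled.astype(np.float32)
```

In this sketch the tile would be a fixed array available at the decoder side, which is consistent with the abstract's statement that no side information is transmitted; the output would then be passed to the machine task network.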
Collections
- TUNICRIS publications [24742]
