Still image coding for machines : an end-to-end learned approach
Le, Nam (2020)
Master's Programme in Information Technology
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
Date of approval
2020-12-01
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202011248199
Abstract
Neural network (NN) based solutions for computer vision tasks are being adopted at an ever-increasing pace, which makes them one of the main consumers of digital images today. This raises the question of whether traditional human-oriented image codecs, or versions of these codecs adapted for machine-targeted use cases, are efficient enough for the massive amount of image data generated every day for both humans and machines. This thesis explores the capabilities of image codecs designed exclusively for machine consumption. To the best of the student’s knowledge, this is the first proposal of an end-to-end learned machine-oriented image codec. The thesis presents an end-to-end framework for designing NN-based image codecs for machines, as well as a set of training strategies that address the delicate problem of balancing competing losses in multi-task training, namely image distortion loss, rate loss, and computer vision task losses. The experimental results show that the proposed codecs achieve superior coding efficiency compared with the current state-of-the-art standard, VVC/H.266, on object detection and instance segmentation, with BD-rates of -37.87% and -32.90%, respectively (negative values denote bitrate savings), while being extremely fast thanks to their compact size. These results also serve as a proof of concept for a new approach to image coding for machines.
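The BD-rate figures quoted in the abstract are Bjøntegaard delta rates: the average bitrate difference between two codecs at equal quality, where a negative value means the tested codec needs fewer bits than the anchor. The following is a minimal sketch of the classic four-point cubic formulation. It assumes that the quality axis is a task metric such as mAP rather than PSNR; the function names and sample numbers are illustrative and are not taken from the thesis, which may use a different interpolation scheme.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate the polynomial passing through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _mean_log_rate(quality, rate, lo, hi):
    """Average of the cubic log-rate curve over the quality interval [lo, hi]."""
    log_rate = [math.log(r) for r in rate]

    def curve(q):
        return _lagrange(quality, log_rate, q)

    # Simpson's rule integrates a cubic polynomial exactly.
    integral = (hi - lo) / 6.0 * (curve(lo) + 4.0 * curve((lo + hi) / 2.0) + curve(hi))
    return integral / (hi - lo)


def bd_rate(anchor_rate, anchor_quality, test_rate, test_quality):
    """Return the BD-rate of the test codec against the anchor, in percent.

    Each codec is described by four (rate, quality) points. Quality values
    must be distinct within a codec, and rates must be positive.
    """
    sizes = {len(anchor_rate), len(anchor_quality), len(test_rate), len(test_quality)}
    if sizes != {4}:
        raise ValueError("this sketch expects exactly four points per curve")

    # Compare the two curves only where their quality ranges overlap.
    lo = max(min(anchor_quality), min(test_quality))
    hi = min(max(anchor_quality), max(test_quality))
    if hi <= lo:
        raise ValueError("the quality ranges of the two curves do not overlap")

    diff = (_mean_log_rate(test_quality, test_rate, lo, hi)
            - _mean_log_rate(anchor_quality, anchor_rate, lo, hi))
    return (math.exp(diff) - 1.0) * 100.0


# Illustrative numbers only: bits per pixel against a task metric in percent.
anchor_bpp = [0.1, 0.2, 0.4, 0.8]
anchor_map = [20.0, 28.0, 33.0, 36.0]
test_bpp = [r / 2.0 for r in anchor_bpp]  # same quality at half the bitrate
test_map = list(anchor_map)

print(f"BD-rate: {bd_rate(anchor_bpp, anchor_map, test_bpp, test_map):.2f}%")
```

In this toy example the test codec reaches every quality level at half the anchor's bitrate, so the BD-rate comes out at about -50%. Read the same way, a BD-rate of -37.87% corresponds to roughly 38% fewer bits on average at equal task performance.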