Hardware Deceleration of Kvazaar HEVC Encoder
Sainio, Joose; Mercat, Alexandre; Vanne, Jarno (2019-10-04)
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-201909263507
Description
Peer reviewed
Abstract
High Efficiency Video Coding (HEVC) doubles the coding efficiency of the prior Advanced Video Coding (AVC) standard, but tackling its huge complexity calls for efficient HEVC codec implementations. Recent advances in Graphics Processing Units (GPUs) have made programmable general-purpose GPUs (GPGPUs) a popular option for accelerating various video coding tools. Massively parallel GPU architectures are particularly well suited to the hardware-oriented full search (FS) algorithm in HEVC integer motion estimation (IME). This paper analyzes the feasibility of a GPU-accelerated FS implementation in the practical Kvazaar open-source HEVC encoder. According to our evaluations, implementing FS on an AMD Radeon RX 480 GPU makes Kvazaar 12.5 times as fast as the respective anchor implemented entirely on an Intel 8-core i7 processor. However, the obtained speed gain is lost when fast IME algorithms are put into use in the anchor. For example, executing the anchor with the hexagon-based search (HEXBS) algorithm is almost two times as fast as our GPU-accelerated proposal, and the benefit of GPU offloading is reduced to a slight coding gain of 1.2%. Our results show that accelerating IME on a GPU speeds up non-practical encoders due to their enormous inherent complexity, but the price paid with practical encoders tends to be too high. The conditional processing schemes of fast IME algorithms can be efficiently executed on processors without any substantial coding loss over that of FS. Nevertheless, we still believe there might be room for exploiting GPUs in IME acceleration, but GPU-parallelized fast algorithms are needed to get value for the additional implementation cost and power budget.
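The contrast the abstract draws between FS and HEXBS can be illustrated with a minimal sketch. It is not Kvazaar's code or the GPU kernel evaluated in the paper: the function names (`sad`, `full_search`, `hexagon_search`) are hypothetical, the matching cost is assumed to be a plain sum of absolute differences (practical encoders typically also weigh the motion vector bit cost), and the hexagon pattern is the textbook large-hexagon/small-cross variant. FS evaluates every offset independently, which is what makes it easy to parallelize, whereas HEXBS decides each step from the previous result and visits far fewer candidates.

```python
import numpy as np

# Textbook hexagon-search patterns (assumed here, not taken from Kvazaar).
LARGE_HEX = [(-2, 0), (2, 0), (-1, -2), (1, -2), (-1, 2), (1, 2)]
SMALL_CROSS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def sad(cur_block, ref, x, y):
    """Sum of absolute differences against the reference block at (x, y)."""
    h, w = cur_block.shape
    cand = ref[y:y + h, x:x + w]
    return int(np.abs(cur_block.astype(np.int64) - cand.astype(np.int64)).sum())


def in_frame(ref, x, y, w, h):
    """True if a w-by-h block at (x, y) lies fully inside the reference frame."""
    return x >= 0 and y >= 0 and x + w <= ref.shape[1] and y + h <= ref.shape[0]


def full_search(cur_block, ref, x0, y0, search_range):
    """Evaluate every integer offset in the window; each cost is independent."""
    h, w = cur_block.shape
    best_mv, best_cost, checked = (0, 0), None, 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            if not in_frame(ref, x0 + dx, y0 + dy, w, h):
                continue
            cost = sad(cur_block, ref, x0 + dx, y0 + dy)
            checked += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, checked


def hexagon_search(cur_block, ref, x0, y0, search_range):
    """Data-dependent search; assumes the zero-offset block lies inside the frame."""
    h, w = cur_block.shape
    cache = {}  # offsets already evaluated, so no candidate is counted twice

    def cost_at(mv):
        dx, dy = mv
        if max(abs(dx), abs(dy)) > search_range:
            return None
        if not in_frame(ref, x0 + dx, y0 + dy, w, h):
            return None
        if mv not in cache:
            cache[mv] = sad(cur_block, ref, x0 + dx, y0 + dy)
        return cache[mv]

    center = (0, 0)
    best_cost = cost_at(center)
    while True:
        best = center
        for ox, oy in LARGE_HEX:
            mv = (center[0] + ox, center[1] + oy)
            cost = cost_at(mv)
            if cost is not None and cost < best_cost:
                best, best_cost = mv, cost
        if best == center:  # no hexagon point improved: stop moving
            break
        center = best  # the next step depends on this result
    for ox, oy in SMALL_CROSS:  # final refinement around the last center
        mv = (center[0] + ox, center[1] + oy)
        cost = cost_at(mv)
        if cost is not None and cost < best_cost:
            best, best_cost = mv, cost
    return best, best_cost, len(cache)
```

Both functions return the best offset, its cost, and the number of candidates evaluated. The hexagon variant can never evaluate more candidates than FS over the same window, but it may settle on a local minimum, which corresponds to the small coding loss the abstract mentions.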
Collections
- TUNICRIS publications [19385]