Low-Latency and High-Bandwidth Video Stream Delivery
Iho, Jussi (2021)
Master's Programme in Information Technology
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2021-05-05
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202104263673
Abstract
Most conventional video formats have been designed to produce high compression ratios through the use of sophisticated video coding methods, resulting in dramatic reductions of required bandwidth. The complexity of these standards is, however, also their weakness, as the inherent compute intensity adds to the video latency. Low-latency video delivery therefore continues to be a challenging problem, especially for low-power mobile and IoT devices.
An alternative approach has been proposed, where certain texture compression formats, commonly used in computer graphics, would be used for real-time video compression at a very low latency. These formats would allow the use of fast, hardware-accelerated render-time decoding, while also lending themselves well to low-latency GPU encoding and parallel computation in general.
The main drawback is the low compression ratio, but this has become less of an issue thanks to the recent advent of new high-bandwidth wireless networking technologies, such as the 802.11ax (WiFi 6) and 5G standards. Even so, these technologies play only an enabling role, as general-purpose platforms still depend on largely software-based network I/O solutions.
The objective of this work is to study the technical aspects of texture-compressed stream delivery and to find the best strategies for optimizing the performance of the Linux kernel network stack on general-purpose hardware. A video streaming pipeline optimized for texture formats is proposed, utilizing multi-threading, GPU acceleration, and network stack performance tuning.
As a result, the pipeline was found to be capable of reaching very low latencies in the case of high-bandwidth networks, when extrapolating from performance measurements in localhost TCP and UDP tests. As an example, a 2160p frame encoded in the BC1 format could be delivered with a total end-to-end latency of under 10 ms, although it would require a 10 Gbit/s network and a high-core-count CPU. The achieved bandwidths were 31.7 Gbit/s and 25.3 Gbit/s for the proposed TCP and UDP implementations, respectively. As the latency is roughly proportional to the frame size and inversely proportional to the network bandwidth, using a format with a higher compression ratio or more bandwidth could easily bring the 4320p performance to a level similar to the 2160p results.
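The relationship between frame size, bandwidth, and latency can be illustrated with a rough calculation. The following is a minimal sketch, not taken from the thesis: it assumes only the standard BC1 layout (each 4x4 pixel block stored in 8 bytes) and estimates the time a single frame spends on the wire at a nominal link rate. The function names are hypothetical, and the estimate ignores protocol overhead as well as encoding, decoding, and network stack processing time, so it is a lower bound on the transfer component rather than the end-to-end latency.

```python
import math


def bc1_frame_bytes(width: int, height: int) -> int:
    """Size of one BC1-compressed frame in bytes.

    BC1 stores every 4x4 pixel block in 8 bytes (4 bits per pixel).
    Dimensions that are not multiples of 4 are rounded up to whole blocks.
    """
    blocks_x = math.ceil(width / 4)
    blocks_y = math.ceil(height / 4)
    return blocks_x * blocks_y * 8


def transfer_time_ms(num_bytes: int, link_gbit_per_s: float) -> float:
    """Idealized wire time in milliseconds (assumption: no protocol overhead)."""
    bits = num_bytes * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds * 1e3


if __name__ == "__main__":
    # Illustrative resolutions and link rate; the 10 Gbit/s figure matches
    # the network assumed in the abstract's 2160p example.
    for label, (w, h) in {"2160p": (3840, 2160), "4320p": (7680, 4320)}.items():
        size = bc1_frame_bytes(w, h)
        print(f"{label}: {size / 1e6:.2f} MB per frame, "
              f"{transfer_time_ms(size, 10.0):.2f} ms at 10 Gbit/s")
```

Under these assumptions, a 2160p BC1 frame occupies about 4.15 MB and needs roughly 3.3 ms of wire time at 10 Gbit/s, while a 4320p frame is four times larger and needs roughly 13.3 ms. This is why quadrupling the bandwidth, or using a format with a higher compression ratio, would be needed to bring the 4320p case to a comparable level.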
The use of texture compression in video delivery was concluded to be on the edge of viability for the aforementioned low-power systems in wireless networks. The limiting factors are the network performance and the CPU overhead of the Linux network stack. While significant improvements in device compute performance are unlikely in the near future, advancements in the networking capabilities of consumer hardware could be enough to make texture-compressed video delivery a reality.