
On the OpenCL Support for Streaming Fixed-Function Accelerators on Embedded SoC FPGAs

Mousouliotis, Panagiotis; Leppänen, Topi; Jääskeläinen, Pekka; Petrellis, Nikos; Christakos, Panagiotis; Keramidas, Georgios; Antonopoulos, Christos; Voros, Nikolaos (2023)

 




This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
doi:10.1007/978-3-031-42921-7_4
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tuni-2023122111169

Description

Peer reviewed
Abstract
OpenCL is used in contemporary FPGA High-Level Synthesis (HLS) design tools to develop the host-side code that controls data transfer between the processing system and the FPGA design. High-performance FPGA designs on embedded SoC FPGAs often use data movers with streaming capabilities to transfer data directly between the host's main memory and the local memory of the FPGA accelerator. The OpenCL memory model, however, does not currently support streaming data movement between the host system and the FPGA accelerator. Earlier work has shown up to an 8x latency improvement in data transfer when streaming data movement is used. To address this issue, this work extends the Portable Computing Language (PoCL) OpenCL framework to support direct streaming data movement between the host's main memory and the accelerator's local memory. Furthermore, this work uses the CNN-Grinder workflow to map the execution of a traffic sign recognition Convolutional Neural Network (CNN) onto the SqueezeJet-3 FPGA accelerator, in order to showcase the details of controlling the SqueezeJet-3 streaming accelerator from a PoCL application. The results show that OpenCL, augmented with direct streaming data transfer between the host and the kernel, can efficiently control an FPGA streaming accelerator on an embedded SoC FPGA and achieve high-performance accelerator execution.
Collections
  • TUNICRIS-julkaisut [20247]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Privacy | Accessibility statement
 

 
