From run-time reconfigurable coarse-grain arrays to application-specific accelerator design
Garzia, F. (2009)
Garzia, F.
Tampere University of Technology
2009
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tty-200911207152
Abstract
This thesis focuses on accelerating different applications using a run-time reconfigurable array. For the cases under study, the author analyzes the main causes of performance degradation and classifies them as external or internal.
The author proposes different solutions to reduce the impact of external causes. The adoption of mechanisms to reduce the external overheads gives improvements of 10X on average. The logic to reduce the communication overhead occupies 0.6% of the total area but gives a 15X speed-up in transfer speed and a 3X reduction in the overall cost of the transfers. The new reconfiguration infrastructure gives an 8% improvement in the maximum working frequency. Dynamic reconfiguration allows the cost of reconfiguration to be hidden behind the CREMA processing activity. The same consideration applies to the control operations.
In addition, the author considers the internal causes of performance degradation and proposes a new model that can be easily adapted to a chosen application. For this purpose, the author presents a template called CREMA that can be tailored to the application requirements while retaining the ability to share its internal resources through run-time reconfiguration. This new model can be used as a method to realize application-specific accelerators. The new design yields an application-specific accelerator that is 3X-4.5X smaller than the previous general-purpose device and 1.5X-5X faster. Mapping of SDR kernels shows figures that approach the real-time specifications.
Collections
- Doctoral dissertations [4985]