A Compiler Framework for a Coarse-Grained Reconfigurable Array
Valderas Rodríguez, Leticia Trinidad (2015)
Valderas Rodríguez, Leticia Trinidad
2015
Master's Degree Programme in Electrical Engineering
Tieto- ja sähkötekniikan tiedekunta - Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Hyväksymispäivämäärä
2015-08-12
Julkaisun pysyvä osoite on
https://urn.fi/URN:NBN:fi:tty-201507291448
https://urn.fi/URN:NBN:fi:tty-201507291448
Tiivistelmä
The number of transistors on a chip is increasing with time giving rise to multiple design challenges. In this context, reconfigurable architectures have emerged to provide high flexibility, less power/energy consumption yet while delivering high performance levels. The success of an embedded architecture depends on powerful compiler support. Current studies are focused on developing compilers to reduce the designer’s effort by introducing many automation related features. In this thesis work, a compiler framework is presented for a scalable Coarse-Grained Reconfigurable Array (CGRA) called SCREMA.
The compiler framework presented in this thesis replaces the exiting GUI compiler with an added feature of automatic placement and routing. The compiler receives a Reverse Polish Notation (RPN) description of the target algorithm by the user. It extracts the computational information from the RPN description and performs placement and routing over the CGRA template. The first configuration stream generated by the compiler is the main processing context. Furthermore, if additional configuration patterns have to be designed, the compiler framework gives the possibility to implement them in two different design paradigms: a preprocessing context and a canonical context. Pre-processing context is used to align the data into a CGRA to facilitate post-processing. Canonical context allows the user to perform additions in sum-of-products related algorithms.
The compiler framework has been tested by implementing real integer Matrix-Vector Multiplication (MVM) algorithms. Specifically, the tested MVM orders are 4th, 8th, 16th and 32nd on the CGRA sizes of 4x4, 4x8, 4x16 and 4x32 PEs, respectively. All the implementation are based on the RPN description of 4th-order MVM. Other than implementing 4th-order MVM, the rest of tested MVM algorithms need preprocessing and canonical contexts to be designed and implemented. The user effort which was needed to Place and Route (P&R) an algorithm manually on SCREMA is now reduced by using this compiler framework as it provides an automatic P&R mechanism.
The compiler framework presented in this thesis replaces the exiting GUI compiler with an added feature of automatic placement and routing. The compiler receives a Reverse Polish Notation (RPN) description of the target algorithm by the user. It extracts the computational information from the RPN description and performs placement and routing over the CGRA template. The first configuration stream generated by the compiler is the main processing context. Furthermore, if additional configuration patterns have to be designed, the compiler framework gives the possibility to implement them in two different design paradigms: a preprocessing context and a canonical context. Pre-processing context is used to align the data into a CGRA to facilitate post-processing. Canonical context allows the user to perform additions in sum-of-products related algorithms.
The compiler framework has been tested by implementing real integer Matrix-Vector Multiplication (MVM) algorithms. Specifically, the tested MVM orders are 4th, 8th, 16th and 32nd on the CGRA sizes of 4x4, 4x8, 4x16 and 4x32 PEs, respectively. All the implementation are based on the RPN description of 4th-order MVM. Other than implementing 4th-order MVM, the rest of tested MVM algorithms need preprocessing and canonical contexts to be designed and implemented. The user effort which was needed to Place and Route (P&R) an algorithm manually on SCREMA is now reduced by using this compiler framework as it provides an automatic P&R mechanism.