Design and implementation of parallel memory architectures
Aho, E. (2006)
Aho, E.
Tampere University of Technology
2006
Department of Information Technology
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tty-200810021038
Abstract
A persistent concern in parallel processing is how to supply all the processing nodes with data. Many applications favor particular data patterns that can be accessed in parallel. Parallel memories exploit this: several memory modules working in parallel increase memory bandwidth and feed the processor only the data it needs.
Traditional parallel memories are application-specific and support only fixed data access requirements. In this Thesis, memory flexibility is increased to support several algorithms by adding run-time configurability to parallel memories. A multitude of data access templates and module assignment functions can be used within a single hardware implementation, which has not been possible in prior embedded parallel memory systems. Design reusability also improves, since the same memory system is applicable in several separate implementations.
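To make the terms concrete, the following is a minimal sketch of how a module assignment function and a data access template interact. It is an illustration under stated assumptions, not the architecture from the Thesis: the linear skewing function `linear_skew`, the template names, and the helper `is_conflict_free` are all hypothetical, chosen because linear skewing is a commonly cited textbook scheme.

```python
from functools import partial
from itertools import product

def linear_skew(i, j, n_modules, a=1, b=1):
    """Hypothetical module assignment function: the memory module
    that stores array element (i, j) under linear skewing."""
    return (a * i + b * j) % n_modules

def template_offsets(kind, n):
    """Hypothetical access templates, given as (di, dj) offsets
    from an anchor position; each template touches n elements."""
    if kind == "row":
        return [(0, k) for k in range(n)]
    if kind == "column":
        return [(k, 0) for k in range(n)]
    if kind == "diagonal":
        return [(k, k) for k in range(n)]
    raise ValueError(f"unknown template: {kind}")

def is_conflict_free(offsets, assign, n_modules, size=8):
    """True if, for every anchor in a size x size region, all elements
    of the template fall in distinct modules (one access per module)."""
    for i0, j0 in product(range(size), repeat=2):
        modules = {assign(i0 + di, j0 + dj, n_modules) for di, dj in offsets}
        if len(modules) != len(offsets):
            return False
    return True

if __name__ == "__main__":
    n = 4  # four memory modules, matching the case study size
    for a, b in [(1, 1), (1, 2)]:
        assign = partial(linear_skew, a=a, b=b)
        for kind in ("row", "column", "diagonal"):
            ok = is_conflict_free(template_offsets(kind, n), assign, n)
            print(f"a={a} b={b} {kind}: conflict-free={ok}")
```

In this sketch, no single choice of `(a, b)` serves all three templates with four modules: `(1, 1)` handles rows and columns but not diagonals, while `(1, 2)` handles columns and diagonals but not rows. That is one way to see why selecting the assignment function at run time could be useful.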
Three novel parallel memory architectures are presented in this Thesis: one traditional application-specific type and two with run-time configurability. The results show that run-time configurability can be included in parallel memories at a reasonable cost. In a case study with four memory modules, the normalized complexity of the proposed configurable parallel memories is 63-80% less than that of the conventional type of parallel memory. Moreover, in configurable parallel memories, the growth in permutation network complexity is shown to become the most critical factor as the memory module count increases. According to the evaluations, the permutation networks consume up to 79% of the total parallel memory gate count, excluding memory cells.
The results of this Thesis can be used to design a flexible memory system for real-time multimedia applications that demand high data throughput and data-parallel computation.
Collections
- Doctoral dissertations [4608]