CGPA: Coarse-Grained Pipelined Accelerators
Feng Liu, Soumyadeep Ghosh, Nick P. Johnson, and David I. August
The Design Automation Conference (DAC), June 2014.

High-level synthesis (HLS) tools dramatically reduce the nonrecurring engineering cost of creating specialized hardware accelerators. Existing HLS tools are successful in synthesizing efficient accelerators for program kernels with regular memory accesses and simple control flows. For other programs, however, these tools yield poor performance because they invoke computation units for instructions sequentially, without exploiting parallelism. To address this problem, this paper proposes Coarse-Grained Pipelined Accelerators (CGPA), an HLS framework that utilizes coarse-grained pipeline parallelism techniques to synthesize efficient specialized accelerator modules from irregular C/C++ programs without requiring any annotations. Compared to the sequential method, CGPA shows speedups of 3.0x–3.8x for 5 kernels from programs in different domains.
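As a rough software analogy for the idea of coarse-grained pipeline parallelism mentioned in the abstract, the sketch below splits an irregular loop (a linked-list traversal) into stages connected by bounded FIFO queues: a sequential traversal stage, replicated compute workers, and a reduction stage. This is only an illustration under stated assumptions, not the CGPA framework or its generated hardware. All names here (`Node`, `work`, `pipelined`, `num_workers`, `depth`) are hypothetical, and Python threads model only the structure of the pipeline, not its performance.

```python
import queue
import threading

SENTINEL = object()  # end-of-stream marker passed between stages


class Node:
    """Linked-list node: a stand-in for an irregular, pointer-chasing structure."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def build_list(values):
    """Build a singly linked list holding `values` in order."""
    head = None
    for v in reversed(list(values)):
        head = Node(v, head)
    return head


def work(x):
    """Hypothetical per-node computation."""
    return x * x + 1


def sequential(head):
    """Baseline: traverse and compute one node at a time."""
    total = 0
    node = head
    while node is not None:
        total += work(node.value)
        node = node.next
    return total


def pipelined(head, num_workers=2, depth=4):
    """Same loop, split into traverse -> compute (replicated) -> reduce."""
    q_in = queue.Queue(maxsize=depth)   # bounded FIFO between stages 1 and 2
    q_out = queue.Queue(maxsize=depth)  # bounded FIFO between stages 2 and 3
    result = []

    def traverse():
        # Stage 1 (sequential): pointer chasing cannot be parallelized.
        node = head
        while node is not None:
            q_in.put(node.value)
            node = node.next
        for _ in range(num_workers):
            q_in.put(SENTINEL)  # one end marker per worker

    def compute():
        # Stage 2 (replicated): independent per-node work.
        while True:
            item = q_in.get()
            if item is SENTINEL:
                q_out.put(SENTINEL)
                return
            q_out.put(work(item))

    def reduce_stage():
        # Stage 3 (sequential): accumulate results in any arrival order.
        total, finished = 0, 0
        while finished < num_workers:
            item = q_out.get()
            if item is SENTINEL:
                finished += 1
            else:
                total += item
        result.append(total)

    threads = [threading.Thread(target=traverse)]
    threads += [threading.Thread(target=compute) for _ in range(num_workers)]
    threads.append(threading.Thread(target=reduce_stage))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]
```

Because the reduction here is an integer sum, the order in which workers deliver results does not change the answer, so `pipelined(build_list(range(10)))` and `sequential(build_list(range(10)))` return the same value. The bounded queues play the role that FIFO buffers between stages would play in a hardware pipeline.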