Edge-centric modulo scheduling for coarse-grained reconfigurable architectures
Hyunchul Park, Kevin Fan, Scott A. Mahlke, Taewook Oh, Heeseok Kim, and Hong-seok Kim
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2008.
Coarse-grained reconfigurable architectures (CGRAs) present an appealing
hardware platform by providing the potential for high computation throughput,
scalability, low cost, and energy efficiency. CGRAs consist of an array of
function units and register files, often organized as a two-dimensional grid. The
most difficult challenge in deploying CGRAs is compiler scheduling technology
that can efficiently map software implementations of compute-intensive loops
onto the array. Traditional schedulers focus on the placement of operations in
time and space. With CGRAs, the challenge of placement is compounded by the need
to explicitly route operands from producers to consumers. To systematically
attack this problem, we take an edge-centric approach to modulo scheduling that
focuses on the routing problem as its primary objective. With edge-centric
modulo scheduling (EMS), placement is a by-product of the routing process, and
the schedule is developed by routing each edge in the dataflow graph. Routing
cost metrics provide the scheduler with a global perspective to guide selection.
Experiments on a wide variety of compute-intensive loops from the multimedia
domain show that EMS improves throughput by 25% over traditional iterative
modulo scheduling, and achieves 98% of the throughput of simulated annealing
techniques at a fraction of the compilation time.
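
The idea that placement falls out of routing can be illustrated with a toy example. The sketch below is not the EMS algorithm from the paper; it is a minimal illustration under stated assumptions: a small mesh of function units, one cycle per hop, a modulo reservation table stored as a set of (unit, cycle mod II) slots, and a plain breadth-first search in place of the paper's routing cost metrics. All names (route_edge, can_host, commit) are hypothetical.

```python
from collections import deque

def neighbors(fu, rows, cols):
    """Yield the unit itself (hold the value one cycle) and its mesh neighbours."""
    r, c = fu
    yield fu
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield (nr, nc)

def route_edge(src_fu, src_time, can_host, occupied, ii, rows, cols):
    """Route one dataflow edge from its producer through the time-expanded array.

    occupied holds (fu, time % ii) slots already taken, including the producer's.
    can_host(fu) says whether a unit can execute the consumer operation.
    Returns the list of (fu, time) slots from producer to consumer, or None.
    The last slot of the route is where the consumer ends up being placed.
    Assumption: routes are capped at ii cycles, so a route can never collide
    with itself modulo ii.
    """
    start = (src_fu, src_time)
    parent = {start: None}
    queue = deque([start])
    while queue:
        fu, t = queue.popleft()
        if t - src_time >= ii:
            continue
        for nxt in neighbors(fu, rows, cols):
            slot = (nxt, t + 1)
            if slot in parent or (nxt, (t + 1) % ii) in occupied:
                continue
            parent[slot] = (fu, t)
            if can_host(nxt):
                # Walk back to the producer to recover the full route.
                path = [slot]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(slot)
    return None

def commit(path, occupied, ii):
    """Reserve every slot on the route except the producer's own slot."""
    for fu, t in path[1:]:
        occupied.add((fu, t % ii))

# 2x2 array, II = 2, producer on unit (0, 0) at cycle 0;
# assume only unit (1, 1) can execute the consumer.
occupied = {((0, 0), 0)}
path = route_edge((0, 0), 0, lambda fu: fu == (1, 1), occupied, 2, 2, 2)
if path is not None:
    commit(path, occupied, 2)
```

In this toy setting the consumer is never placed directly: its slot is simply the endpoint of the first legal route found. A real scheduler would replace the breadth-first search with a cost-guided search and would need to handle edges with several consumers, which this sketch ignores.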