Speculative Decoupled Software Pipelining
Neil Vachharajani, Ram Rangan, Easwaran Raman, Matthew J. Bridges, Guilherme Ottoni, and David I. August
Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2007.
In recent years, microprocessor manufacturers have
shifted their focus from single-core to multicore processors. To
avoid burdening programmers with the responsibility of parallelizing
their applications, some researchers have advocated automatic thread
extraction. A recently proposed technique, Decoupled Software
Pipelining (DSWP), has demonstrated promise by partitioning loops into
long-running, fine-grained threads organized into a pipeline. Using a
pipeline organization and execution decoupled by inter-core
communication queues, DSWP offers increased execution efficiency that
is largely independent of inter-core communication latency.

This paper proposes adding speculation to DSWP and evaluates an
automatic approach for its implementation. By speculating past
infrequent dependences, the benefit of DSWP is increased by making it
applicable to more loops, facilitating better balanced threads, and
enabling parallelized loops to be run on more cores. Unlike prior
speculative threading proposals, speculative DSWP focuses on breaking
dependence recurrences. By speculatively breaking these recurrences,
instructions that were formerly restricted to a single thread to
ensure decoupling are now free to span multiple threads. Using an
initial automatic compiler implementation and a validated processor
model, this paper demonstrates significant gains using speculation for
4-core chip multiprocessor models running a variety of codes.
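
The pipeline organization described in the abstract can be illustrated with a minimal sketch. This is not the paper's compiler implementation, only a hand-written, non-speculative, two-stage example under stated assumptions: the names traverse, work, run_pipeline, and queue_depth are hypothetical, the per-item work (squaring) is a placeholder, and a Python queue stands in for the inter-core communication queue. Python threads illustrate the structure only and are not expected to show a multicore speedup.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker passed between stages


def traverse(nodes, out_q):
    """Stage 1: the loop-carried traversal, kept entirely on one thread."""
    for node in nodes:
        out_q.put(node)      # blocks only when the queue is full
    out_q.put(SENTINEL)


def work(in_q, results):
    """Stage 2: per-item work, decoupled from stage 1 by the queue."""
    while True:
        item = in_q.get()    # blocks only when the queue is empty
        if item is SENTINEL:
            break
        results.append(item * item)  # placeholder for the loop body


def run_pipeline(nodes, queue_depth=32):
    """Run the two stages as long-running threads joined by one queue."""
    q = queue.Queue(maxsize=queue_depth)
    results = []
    stage1 = threading.Thread(target=traverse, args=(nodes, q))
    stage2 = threading.Thread(target=work, args=(q, results))
    stage1.start()
    stage2.start()
    stage1.join()
    stage2.join()
    return results
```

Data flows in one direction only, from stage 1 to stage 2, so a delay in the queue postpones when stage 2 starts but does not stall stage 1 on every iteration. A dependence from stage 2 back to stage 1 would force both stages into one thread; that is the kind of recurrence the abstract proposes to break speculatively.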