Global Multi-Threaded Instruction Scheduling [abstract] (IEEE Xplore, PDF)
Guilherme Ottoni and David I. August
Proceedings of the 40th IEEE/ACM International Symposium on
Microarchitecture (MICRO), December 2007.
Recently, the microprocessor industry has moved toward chip
multiprocessor (CMP) designs as a means of utilizing the increasing
transistor counts in the face of physical and micro-architectural
limitations. Despite this move, CMPs do not directly improve the
performance of single-threaded codes, a characteristic of most
applications. In order to support parallelization of general-purpose
applications, computer architects have proposed CMPs with light-weight
scalar communication mechanisms. Despite such support, most existing
compiler multi-threading techniques have generally demonstrated little
effectiveness in extracting parallelism from non-scientific
applications. The main reason for this is that such techniques are
mostly restricted to extracting parallelism within straight-line
regions of code.
In this paper, we first propose a framework that enables global
multi-threaded instruction scheduling in general. We then describe
GREMIO, a scheduler built using this framework. GREMIO operates at a
global scope, at the procedure level, and uses control dependence
analysis to extract non-speculative thread-level parallelism from
sequential codes. Using a fully automatic compiler implementation of
GREMIO and a validated processor model, this paper demonstrates gains
for a dual-core CMP model running a variety of codes. Our experiments
demonstrate the advantage of exploiting global scheduling for
multi-threaded architectures, and present gains in a detailed
comparison with the Decoupled Software Pipelining (DSWP)
multi-threading technique. Furthermore, our experiments show that
adding GREMIO to a compiler with DSWP improves the average speedup
from 16.5% to 32.8% for important benchmark functions when utilizing
two cores, indicating the importance of this technique in making
compilers extract threads effectively.