Trace-Based Data Layout Optimizations for Multi-Core Processors
Olga Golovanevsky, Alon Dayan, Ayal Zaks, and David Edelsohn
Proceedings of the 5th International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC), January 2010.
Winner of the Best Paper Award.
The focus of this paper is on cache-conscious data layout
optimizations. Although these optimizations have already been adopted
by industrial compilers, they were shown to be inefficient for multi-process
applications on multi-core platforms. Such factors as asymmetric
distribution of processes over hardware resources (cores, CPUs, or hardware
threads), along with their temporal migrations, unpredictably influence
optimization results. Herein we present a new methodology that
extends classical data layout optimizations to support multi-core architectures.
Based on data trace collection that reflects actual interleaving
of data accesses, this method aims to improve spatial locality of the data,
while mitigating potential false sharing events. Introducing architectural
characteristics into the analysis phase further increases the accuracy
of data affinity estimation. A feasibility study of this method, applied to
the multi-process web server lighttpd on a Power5 machine, not only showed
a performance improvement but also proved its suitability for incorporation
into an industrial compiler.
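
To make the idea of trace-based affinity estimation concrete, here is a minimal sketch in Python. It is not the paper's algorithm. It assumes a hypothetical trace format of (process id, field name, is-write) tuples recorded in their observed interleaved order, and it uses a simple sliding window. Fields touched close together by the same process gain affinity (candidates for sharing a cache line), while pairs touched close together by different processes with at least one write count as potential false sharing (candidates for separation). The names analyze_trace and layout_score, the window size, and the penalty weight are all illustrative assumptions.

```python
from collections import defaultdict


def analyze_trace(trace, window=2):
    """Count per-pair affinity and potential false-sharing conflicts.

    trace: list of (pid, field, is_write) tuples in interleaved order
           (a hypothetical format chosen for this sketch).
    window: how many subsequent accesses count as "close in time".
    """
    affinity = defaultdict(int)   # same process touches both fields
    conflict = defaultdict(int)   # different processes, at least one write
    for i, (pid_a, field_a, write_a) in enumerate(trace):
        for pid_b, field_b, write_b in trace[i + 1 : i + 1 + window]:
            if field_a == field_b:
                continue  # only pairs of distinct fields affect the layout
            pair = frozenset((field_a, field_b))
            if pid_a == pid_b:
                affinity[pair] += 1
            elif write_a or write_b:
                conflict[pair] += 1
    return affinity, conflict


def layout_score(pair, affinity, conflict, penalty=2):
    """Positive: place the pair together; negative: keep it apart."""
    return affinity[pair] - penalty * conflict[pair]


# Process 0 reads "head" and "len" together; process 1 updates "stats".
trace = [
    (0, "head", False), (0, "len", False), (1, "stats", True),
    (0, "head", False), (0, "len", False), (1, "stats", True),
]
affinity, conflict = analyze_trace(trace)
head_len = frozenset(("head", "len"))
head_stats = frozenset(("head", "stats"))
print(layout_score(head_len, affinity, conflict))
print(layout_score(head_stats, affinity, conflict))
```

In this toy trace, "head" and "len" receive a positive score and "head" and "stats" a negative one, so a layout pass built on these scores would group the first pair and place "stats" on a different cache line. A real implementation would also need the architectural characteristics the abstract mentions, such as cache-line size and the mapping of processes to cores, which this sketch leaves out.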