Liberty Queues for EPIC Architectures
Thomas B. Jablin, Yun Zhang, James A. Jablin, Jialu Huang, Hanjun Kim, and David I. August
Proceedings of the Eighth Workshop on Explicitly Parallel
Instruction Computer Architectures and Compiler Technology (EPIC), April 2010.
Core-to-core communication bandwidth is critical for
high-performance pipeline-parallel programs. Hardware communication queues are
unlikely to be implemented and are perhaps unnecessary. This paper presents
Liberty Queues, a high-performance lock-free software-only ring buffer, and
describes the porting effort from the original x86-64 implementation to
IA-64. Liberty Queues achieve a bandwidth of 500 MB/s between unrelated
processors on a first-generation Itanium 2, compared with 281 MB/s on modern
Opterons and 430 MB/s on modern Xeons claimed by related work. Bandwidth
results are presented for seven different multicore and multiprocessor systems,
as well as a sensitivity analysis.
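
For readers unfamiliar with the technique the abstract names, the following is a minimal sketch of a generic single-producer, single-consumer lock-free ring buffer in C11. It is not the Liberty Queues implementation, which the abstract does not detail. The names `spsc_ring`, `ring_init`, `ring_try_push`, and `ring_try_pop`, the capacity of 1024 slots, and the 64-byte cache-line alignment are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; a power of two so indices wrap with a bit mask. */
#define RING_CAPACITY 1024u
static_assert((RING_CAPACITY & (RING_CAPACITY - 1u)) == 0,
              "capacity must be a power of two");

typedef struct {
    /* Assumed 64-byte cache lines: keep the two indices on separate lines
       so the producer and consumer do not falsely share one. */
    _Alignas(64) atomic_size_t head;   /* next slot to write; producer only */
    _Alignas(64) atomic_size_t tail;   /* next slot to read; consumer only  */
    _Alignas(64) uint64_t slots[RING_CAPACITY];
} spsc_ring;

static void ring_init(spsc_ring *r) {
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side: returns false if the ring is full. */
static bool ring_try_push(spsc_ring *r, uint64_t value) {
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_CAPACITY) {
        return false;
    }
    r->slots[head & (RING_CAPACITY - 1u)] = value;
    /* Release: the slot write becomes visible before the new head does. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool ring_try_pop(spsc_ring *r, uint64_t *out) {
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail) {
        return false;
    }
    *out = r->slots[tail & (RING_CAPACITY - 1u)];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

In this sketch, no locks are needed because each index has exactly one writer: only the producer advances `head` and only the consumer advances `tail`. The acquire and release orderings are what a port between architectures with different memory models would need to preserve.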