Sprint: Speculative Prefetching of Remote Data
Arun Raman, Greta Yorsh, Martin Vechev, and Eran Yahav
Proceedings of the 26th Annual ACM SIGPLAN Conference on
Object-Oriented Programming, Systems, Languages, and
Applications (OOPSLA), October 2011.
Based on work done during an internship at IBM Research in
summer 2010.
Remote data access latency is a significant performance bottleneck
in many modern programs that use remote databases and web services.
We present Sprint, a run-time system that optimizes such programs
by prefetching and caching data from remote sources in parallel with
the execution of the original program. Sprint separates the concern
of exposing potentially independent data accesses from the mechanism
for executing them efficiently in parallel or in a batch. In
contrast to prior work, Sprint can efficiently prefetch data in
the presence of irregular or input-dependent access patterns,
while preserving the semantics of the original program.
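
To make the general idea of prefetching and caching concrete, here is a minimal Python sketch. It is not Sprint's implementation or API: the class PrefetchCache, its methods, and the stand-in function slow_lookup are hypothetical names chosen for illustration. The sketch assumes that fetches are read-only, so issuing them early cannot change program behavior; it does not attempt to show how Sprint discovers accesses or preserves semantics.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class PrefetchCache:
    """Illustrative cache that lets slow fetches be started ahead of use."""

    def __init__(self, fetch, max_workers=4):
        self._fetch = fetch  # hypothetical slow, read-only remote call
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}   # key -> Future holding the fetched value
        self._lock = threading.Lock()

    def prefetch(self, key):
        """Start fetching key in the background unless a fetch already exists."""
        with self._lock:
            if key not in self._futures:
                self._futures[key] = self._pool.submit(self._fetch, key)
            return self._futures[key]

    def get(self, key):
        """Return the value for key, blocking only if it has not arrived yet."""
        return self.prefetch(key).result()

    def close(self):
        """Wait for outstanding fetches and release the worker threads."""
        self._pool.shutdown(wait=True)


def slow_lookup(key):
    # Stand-in for a remote database or web-service request (assumption).
    time.sleep(0.05)
    return key * 2


if __name__ == "__main__":
    cache = PrefetchCache(slow_lookup)
    for key in range(4):
        cache.prefetch(key)  # requests overlap instead of running one by one
    print([cache.get(key) for key in range(4)])
    cache.close()
```

In this sketch, the calls to prefetch overlap the remote latency of independent requests, and a later get either finds the value already cached or falls back to fetching it on demand.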
We used Sprint to automatically improve the performance of several
real-world Java programs that access remote databases (MySQL, DB2)
and web services (Facebook, IBM's Yellow Pages). Sprint achieves
speedups ranging from 2.4x to 15.8x over sequential execution, which are
comparable to those achieved by manually modifying the program for
asynchronous and batch execution of data accesses. Sprint provides a
simple interface that allows a programmer to plug in support for
additional data sources without modifying the client program.