Automatic Exploitation of Input Parallelism [abstract] (PDF)
Taewook Oh
Ph.D. Thesis, Department of Computer Science, Princeton University, September 2015.

Parallelism may reside in the input of a program rather than the program itself. A script interpreter, for example, is hard to parallelize because its dynamic behavior is unpredictable until an input script is given. Once the interpreter is combined with the script, the resulting program becomes predictable, and even parallelizable if the input script contains parallelism. Despite recent progress in automatic parallelization research, however, existing techniques cannot take advantage of the parallelism within program inputs, even when the inputs remain fixed across multiple executions of the program.

This dissertation shows that the automatic exploitation of parallelism within fixed program inputs can be achieved by coupling program specialization with automatic parallelization techniques. Program specialization marries a program with the values that remain invariant across the program execution, including fixed inputs, and creates a program that is highly optimized for the invariants. The proposed technique exploits program specialization as an enabling transformation for automatic parallelization; through specialization, the parallelism within the fixed program inputs can be materialized within the specialized program.

First, this dissertation presents Invariant-induced Pattern-based Loop Specialization (IPLS). IPLS folds the parallelism within the program invariants into the specialized program, thereby creating a more complete and predictable program that is easier to parallelize. Second, this dissertation applies automatic speculative parallelization techniques to specialized programs to exploit parallelism in inputs. As existing techniques fail to extract parallelism from complex programs such as IPLS specialized programs, context-sensitive speculation and optimized design of the speculation run-time system are proposed to improve the applicability and minimize the execution overhead of the parallelized program.

A prototype of the proposed technique is evaluated against two widely-used open-source script interpreters. Experimental results demonstrate the effectiveness of the proposed techniques.