The VELOCITY Project

The VELOCITY Compiler Project aims to address computer architecture problems with a new approach to compiler organization. This compiler organization, embodied in the VELOCITY Compiler (and derivative run-time optimizers), enables true whole-program scope, practical iterative compilation, and smarter memory analysis. These properties make VELOCITY better at extracting threads, improving reliability, and enhancing security.

The focus of the VELOCITY compiler is the removal of impediments to optimization, whether they come from procedures formed for software-engineering instead of optimization concerns, optimizations and optimization orders that don't achieve optimal performance on all or even most codes, or a complex series of memory operations that leads to overly conservative analysis.

To avoid procedure limitations, VELOCITY allows optimizations to choose the scope at which they operate by removing procedure boundaries that are the limits at which most traditional optimizations operate. Optimizations then operate on regions of arbitrary code formed for the purpose of optimization rather than software-engineering. Even with this freedom, which optimizations and the order in which optimizations should be applied is not always obvious, as there are many dependencies between optimizations and their effectiveness varies based on properties of the code. VELOCITY deals with this by exploring the space of optimizations available, called optimization space exploration, and choosing the best set of optimizations and their order for each piece of code compiled.

VELOCITY is also designed to remove data depencences in code that hinder advanced optimizations, particularly memory dependences. Through the use of aggressive pointer and shape analysis, many memory dependences which traditionally inhibit optimizations are removed. Pointer analysis computes the set of memory locations each instruction may access. The result of this analysis is used to compute dependences between instructions and to provide valuable information for other analyses. Because our pointer analysis represents dynamically-allocated heap locations according to their static allocation sites, it cannot distinguish between elements of linked data structures. To do this, a more advanced memory analysis, shape analysis, is performed. It determines the "shape" properties of these data structures, e.g. whether the data structure is a tree or a list, whether it contains cycle, etc. Our shape analysis encodes the shape properties using recursive predicates written in separation logic, and leverages a machine learning technique to automatically synthesize these predicates.

When static analysis fails, VELOCITY can utilize profiling information to speculatively remove dependences, particularly memory dependences. Memory profilers provide useful information on the dynamic memory behavior of programs that static memory analysis can not provide. Velocity provides conventional memory profiling that reasons about individual loads and stores. However, this is often insufficient, as advanced optimizations need knowledge about how entire data structures are accessed and modified. SHARP (Shape Analysis using Runtime Profiling) is a memory profiler that identifies the recursive data structures created when a program is executed and provides useful information including the shape of the RDS (list, tree etc), the size of the data structure, the traversal pattern and the average access time of the data structure. The information provided by Sharp can be used to prefetch recursive data structures and potentially other optimizations that manipulate entire data structures.

Finally, VELOCITY is being used to advanced the state of automatic multi-threading. Our mutli-threading optimization, Decoupled Software Pipelining (DSWP), works by creating multiple threads while respecting dependences. Traditional automatic multi-threading tends to look for loops free of dependence recurences and would not operate on these codes. VELOCITY, on the other hand, embraces these recurences when looking for code to parallelize, allowing it to apply multi-threading optimizations to a broader variety of codes.

VELOCITY has also explored the removal of impediments to porting of a compiler to a new platform. Particularly, VELOCITY can reverse engineer the structural hazards from a processor to create a better scheduler without human intervention or reading of processor manuals.