Front Page

The Liberty Research Group

The Compiler Foundations Project

[Prakash Prabhu is responsible for this page]

Compiler Research to address emerging issues by eliminating conventional limitations

New challenges in computer architecture demand new roles for compilation systems. We have demonstrated that these systems can effectively address problems of reliability, security, and scalability, problems they were not originally designed to handle. Doing so required overcoming limitations fundamental to their design by leveraging our research in the understanding and manipulation of program dataflow [MICRO 2000], control flow [ISCA 1998, ISCA 1999, IEEE 2001, IEEE 1997, IJPP 1999, Book 2005, ESA 2004, ISSS 2001], and memory behavior [CGO 2005, CGO 2004, MSP 2005, PLDI 2007]. Modern compilers still use a decades old conventional compiler organization, even though it limits the effectiveness of most compiler optimizations. The research described in this section provides compiler optimizations with greater freedom and increased effectiveness in the presence of other, potentially interfering, optimizations. This research not only make compilers more effective, but also provides compiler researchers with a means to more thoroughly understand the strength of their work.

Smart Memory Analysis, Profiling, and Intelligent Speculation

One of the most important factors that hinder advanced compiler optimizations is spurious and infrequently occuring memory dependences. Traditionally, pointer analysis has been applied to remove memory dependences that inhibit optimizations. However, conventional pointer analysis works once on a high level intermediate representation (IR), and its results are conservatively maintained by all subsequent code transformations. This imposes a phase-ordering constraint, causing the precision of alias information to be diluted by code transformations. To overcome these limitations, we proposed Low-Level Pointer Analysis (VLLPA) [CGO 2005] that operates at the assembly level, yet is context sensitive and partially flow sensitive, and handles all features of all features of C, such as unions. A comparison with a state-of-the-art high-level-IR based pointer analysis [] shows that VLLPA is less precise on only 0.3% of memory operations and more precise on 26.8% of memory operations. Pointer analysis, while disambiguating pointers to scalar variables, cannot distinguish between elements of linked data structures. In order to do this, we designed and implemented shape analysis [PLDI 2007], a more advanced memory analysis technique that determines the "shape" properties of these data structures, e.g. whether the data structure is a tree or a list, whether it contains cycle, etc. Our shape analysis encodes the shape properties using recursive predicates written in separation logic, and leverages a machine learning technique to automatically synthesize these predicates.

Both pointer and shape analysis are static analysis solutions, which are necessarily conservative to produce sound results across all inputs to a program. However, emerging applications written in languages like C/C++ frequently contain pointer-based code marked by irregular accesses to memory and irregular control flow, whose runtime behavior varies widely based on input. In such cases, static analysis fails and we use memory profiling to provide useful information on the dynamic memory behavior of programs that static memory analysis can not provide. We have developed efficient profiling techniques that have minimal runtime overheads [CC 2006, MSP 2005, CGO 2004]. In particular these techniques rely on heuristics strategically select key memory dependences to profile, and, in the common case where aliasing does not occur, enable aggressive optimizations. In addition to profiling based on loads and stores, we have developed profiling techniques that identifies the recursive data structures created when a program is executed and provides useful information including the shape and size of the data structure, and the traversal pattern [MSP 2005, CGO 2004]. This information can be used to perform optimizations that manipulate entire data structures. In our work, we leverage this profiling information to speculatively remove memory dependences [Book 2005]. Speculation is performed intelligently, by removing only those memory dependences that inhibit particular optimizations of interest to a client.

Procedure Boundary Elimination

A particularly limiting feature of the conventional compilation framework is its procedure-based nature. This limits code transformations to within a single procedure's boundaries, which may not present an ideal scope for optimization. Although aggressive inlining and certain interprocedural optimization routines can alleviate this problem, these can be applied only sparingly, the former because it causes excessive code growth, and the latter because their compile-time requirements scale poorly with program size. A solution to this problem, is a new compiler approach called Procedure Boundary Elimination(PBE) [PLDI 2006].

PBE expands the scope of optimization by unifying an entire application's code into an optimizable program-wide representation, called the whole-program control-flow graph(WCFG). By employing an expanded version of Hank's region formation, the compiler can then divide the WCFG into more manageable compilation units. These new compilation units are chosen according to the needs of optimization, regardless of the original procedure boundaries. Since code duplication, such as that caused by inlining, can often increase optimization opportunities, PBE also employs a targeted code specialization phase, which causes code growth only where it is likely to produce actual benefits. Of course, the central challenge that this scheme faces is that subsequent optimization and analysis routines must be able to operate on arbitrary portions of the WCFG. This problem was solved by extending and generalizing interprocedural analysis algorithms and by properly encapsulating regions. Results confirm that PBE delivers, with reasonable compile times, the performance improvement of aggressive inlining and interprocedural optimization (10-50%), without its code growth (30-200%). Since PBE changes the way transformations interact with the code, we expect it to teach us quite a bit about existing compiler techniques and to inspire new ones.

Optimization-Space Exploration

With performance, reliability, security, power, and other demands placed on them, optimizing compilers must incorporate an ever larger number of increasingly complex transformation algorithms. Since code transformations may lead to performance pitfalls or interfere with subsequent transformations, compilers must employ predictive heuristics for optimization guidance. The unpredictability of optimization interaction and the irregularity of modern machines severely limits the accuracy of these heuristics. As a result, during heuristic and optimization tuning, compiler writers temper high variance optimizations, limit their use, or may even exclude them entirely. While this results in compilers with good average code quality, opportunities are missed in individual code segments. We developed an alternative to the traditionally tuned compilation process, called optimization-space exploration (OSE). An OSE compiler considers many promising transformation alternatives for each hot code segment and selects the best one according to a performance model. By evaluating transformation sequences a posteriori, OSE reduces the dependence on predictive heuristics. Unlike previous iterative compilation attempts, which suffer from unrealistic compile times (days or years) and focus on limited portions of the optimization process, OSE is the first iterative compilation technique fast enough and general enough for widespread use in general-purpose compilation. To achieve this performance, OSE narrows down the space of optimization options explored through judicious use of heuristics. With help from Intel, OSE was implemented and proven in Intel's Electron production compiler. OSE achieves significant performance improvement on Intel's Itanium, more than 20% in some cases over best prior results, with only a 50\% increase compile time [JILP 2005, CGO 2003].

Project Ph.D. Graduates

Selected Project Publications

All Project Publications

LAMPVIEW: A Loop-Aware Toolset for Facilitating Parallelization [abstract] (PDF)
Thomas Rorie Mason
Master's Thesis, Department of Electrical Engineering, Princeton University, August 2009.

Speculation
Neil Vachharajani and David I. August
Encyclopedia of Computer Science and Engineering (ISBN: 978-0471383932)
Edited by Benjamin W. Wah. John Wiley & Sons, Inc., January 2009.

Shape Analysis with Inductive Recursion Synthesis [abstract] (PDF)
Bolei Guo
Ph.D. Thesis, Department of Computer Science, Princeton University, June 2008.

Optimizations for the Memory Hierarchy
Easwaran Raman and David I. August
Compiler Design Handbook (ISBN: 978-1420043822)
Edited by Y. N. Srikant and Priti Shankar. CRC Press, December 2007.

Shape Analysis with Inductive Recursion Synthesis [abstract] (ACM DL, PDF)
Bolei Guo, Neil Vachharajani, and David I. August
Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation (PLDI), June 2007.
Accept Rate: 25% (45/178).

Automatic Instruction Scheduler Retargeting by Reverse-Engineering [abstract] (ACM DL, PDF)
Matthew J. Bridges, Neil Vachharajani, Guilherme Ottoni, and David I. August
Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2006.
Accept Rate: 21% (36/169).

A Framework for Unrestricted Whole-Program Optimization [abstract] (ACM DL, PDF)
Spyridon Triantafyllis, Matthew J. Bridges, Easwaran Raman, Guilherme Ottoni, and David I. August
Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2006.
Accept Rate: 21% (36/169).

Selective Runtime Memory Disambiguation in a Dynamic Binary Translator [abstract] (PDF)
Bolei Guo, Youfeng Wu, Cheng Wang, Matthew J. Bridges, Guilherme Ottoni, Neil Vachharajani, Jonathan Chang, and David I. August
Proceedings of the 15th International Conference on Compiler Construction (CC), March 2006.
Accept Rate: 23% (17/72).

Recursive Data Structure Profiling [abstract] (ACM DL, PDF)
Easwaran Raman and David I. August
Proceedings of the Third Annual ACM SIGPLAN Workshop on Memory Systems Performance (MSP), June 2005.
Accept Rate: 33% (8/24).

Branch Predication
David I. August
Speculative Execution in High Performance Computer Architectures (ISBN: 978-1584884477)
Edited by D. Kaeli and P.-C. Yew. CRC Press, May 2005.

Practical and Accurate Low-Level Pointer Analysis [abstract] (IEEE Xplore, PDF)
Bolei Guo, Matthew J. Bridges, Spyridon Triantafyllis, Guilherme Ottoni, Easwaran Raman, and David I. August
Proceedings of the Third International Symposium on Code Generation and Optimization (CGO), March 2005.
Accept Rate: 33% (25/75).

Compiler Optimization-Space Exploration [abstract] (PDF)
Spyridon Triantafyllis, Manish Vachharajani, and David I. August
The Journal of Instruction-level Parallelism (JILP), February 2005.
Accept Rate: 15%

Finding Dominators in Practice
Loukas Georgiadis, Renato F. Werneck, Robert E. Tarjan, Spyridon Triantafyllis, and David I. August
Proceedings of the 12th European Symposium on Algorithms, September 2004.

Exposing Memory Access Regularities Using Object-Relative Memory Profiling [abstract] (ACM DL, PDF)
Qiang Wu, Artem Pyatakov, Alexey N. Spiridonov, Easwaran Raman, Douglas W. Clark, and David I. August
Proceedings of the Second International Symposium on Code Generation and Optimization (CGO), March 2004.
Accept Rate: 31% (25/79).

Compiler Optimization-Space Exploration [abstract] (IEEE Xplore, PDF)
Spyridon Triantafyllis, Manish Vachharajani, Neil Vachharajani, and David I. August
Proceedings of the 2003 International Symposium on Code Generation and Optimization (CGO), March 2003.
Accept Rate: 35% (29/82).
Winner Best Paper Award.

Procedure Boundary Elimination for EPIC Compilers [abstract] (CiteSeerX, PDF)
Spyridon Triantafyllis, Manish Vachharajani, and David I. August
Proceedings of the Second Workshop on Explicitly Parallel Instruction Computer Architectures and Compiler Technology (EPIC), November 2002.

Program Decision Logic Optimization using Predication and Control Speculation [abstract] (IEEE Xplore, PDF, PostScript)
Wen-mei W. Hwu, David I. August, and John W. Sias
Proceedings of the IEEE, Volume 89, Number 11, November 2001.

Accurate and Efficient Predicate Analysis with Binary Decision Diagrams [abstract] (ACM DL, PDF, PostScript)
John W. Sias, Wen-mei W. Hwu, and David I. August
Proceedings of the 33rd International Symposium on Microarchitecture (MICRO), December 2000.
Accept Rate: 28% (31/110).

Systematic Compilation for Predicated Execution
David I. August
Ph.D. Thesis, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, May 2000.

The Partial Reverse If-Conversion Framework for Balancing Control Flow and Predication [abstract]
David I. August, Wen-mei W. Hwu, and Scott A. Mahlke
International Journal of Parallel Programming (IJPP), Volume 27, Number 5, October 1999. Invited.
Special issue composed of "outstanding papers" selected by the Program Committee of the 30th Annual ACM/IEEE International Symposium on Microarchitecture.

The Program Decision Logic Approach to Predicated Execution [abstract] (ACM DL, PDF, PostScript)
David I. August, John W. Sias, Jean-Michel Puiatti, Scott A. Mahlke, Daniel A. Connors, Kevin M. Crozier, and Wen-mei W. Hwu
Proceedings of the 26th International Symposium on Computer Architecture (ISCA), May 1999.
Accept Rate: 19% (26/135).

Integrated Predicated and Speculative Execution in the IMPACT EPIC Archtecture [abstract] (ACM DL, PDF, PostScript)
David I. August, Daniel A. Connors, Scott A. Mahlke, John W. Sias, Kevin M. Crozier, Ben-Chung Cheng, Patrick R. Eaton, Qudus B. Olaniran, and Wen-mei W. Hwu
Proceedings of the 25th International Symposium on Computer Architecture (ISCA), July 1998.
Accept Rate: 21% (33/156).

The IMPACT EPIC 1.0 Architecture and Instruction Set Reference Manual
David I. August, Daniel A. Connors, John W. Sias, Kevin M. Crozier, Patrick R. Eaton, Qudus B. Olaniran, and Wen-mei W. Hwu
University of Illinois at Urbana/Champaign Center for Reliable and High Performance Computing Technical Report CRHC-98-04, February 1998.

A Framework for Balancing Control Flow and Predication [abstract] (IEEE Xplore, PDF, PostScript)
David I. August, Wen-mei W. Hwu, and Scott A. Mahlke
Proceedings of the 30th International Symposium on Microarchitecture (MICRO), December 1997.
Accept Rate: 33% (35/103).
Selected as an "outstanding paper" for inclusion in a special issue of the International Journal of Parallel Programming (IJPP) by the Program Committee.

Architectural Support Compiler-Synthesized Dynamic Branch Prediction Strategies: Rationale and Initial Results [abstract] (IEEE Xplore, PDF, PostScript)
David I. August, Daniel A. Connors, John C. Gyllenhaal, and Wen-mei W. Hwu
Proceedings of the Third International Symposium on High-Performance Computer Architecture (HPCA), February 1997.
Accept Rate: 19% (30/152).

Compiler Technology for Future Microprocessors [abstract] (IEEE Xplore, PDF, PostScript)
Wen-mei W. Hwu, Richard E. Hank, David M. Gallagher, Scott A. Mahlke, Daniel M. Lavery, Grant E. Haab, John C. Gyllenhaal, and David I. August
Proceedings of the IEEE, Volume 83, Number 12, December 1995.

A Comparison of Full and Partial Predicated Execution Support for ILP Processors [abstract] (ACM DL, PDF, PostScript)
Scott A. Mahlke, Rick E. Hank, James E. McCormick, David I. August, and Wen-mei W. Hwu
Proceedings of the 22nd International Symposium on Computer Architecture (ISCA), June 1995.
Accept Rate: 20% (37/180).

Sentinel Scheduling with Recovery Blocks
David I. August, Brian L. Deitrich, and Scott A. Mahlke
University of Illinois at Urbana/Champaign Center for Reliable and High Performance Computing Technical Report CRHC-95-05, February 1995.