Parallelization Project

All Project Publications

The Parallel-Semantics Program Dependence Graph for Parallel Optimization [abstract]
Yian Su, Brian Homerding, Haocheng Gao, Federico Sossai, Yebin Chon, David I. August, and Simone Campanoni
Proceedings of the 2026 International Symposium on Code Generation and Optimization (CGO), February 2026.
Accept Rate: 35% (45/128).

Revisiting Computation for Research: Practices and Trends [abstract] (PDF)
Jeremiah Giordani*, Ziyang Xu*, Ella Colby, August Ning, Bhargav Reddy Godala, Ishita Chaturvedi, Shaowei Zhu, Yebin Chon, Greg Chan, Zujun Tan, Galen Collier, Jonathan D. Halverson, Enrico Armenio Deiana, Jasper Liang, Federico Sossai, Yian Su, Atmn Patel, Bangyen Pham, Nathan Greiner, Simone Campanoni, and David I. August
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), November 2024.
Accept Rate: 22% (102/449).
(*Co-first authors)

PROMPT: A Fast and Extensible Memory Profiling Framework [abstract] (PDF)
Ziyang Xu, Yebin Chon, Yian Su, Zujun Tan, Sotiris Apostolakis, Simone Campanoni, and David I. August
Proceedings of the ACM on Programming Languages, Volume 8, Issue OOPSLA (OOPSLA), October 2024.
Accept Rate: 30% (38/123).

PROMPT: A Fast and Extensible Memory Profiling Framework
Ziyang Xu, Yebin Chon, Yian Su, Zujun Tan, Sotiris Apostolakis, Simone Campanoni, and David I. August
Available at https://github.com/PrincetonUniversity/PROMPT/, January 2024.

Transpilation Utilizing Language-Agnostic IR and Interactivity for Parallelization [abstract] (PDF)
Zujun Tan
Ph.D. Thesis, Department of Computer Science, Princeton University, 2024.

SPLENDID: Supporting Parallel LLVM-IR Enhanced Natural Decompilation for Interactive Development [abstract] (PDF)
Zujun Tan, Yebin Chon, Michael Kruse, Johannes Doerfert, Ziyang Xu, Brian Homerding, Simone Campanoni, and David I. August
Proceedings of the Twenty-Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2023.
Accept Rate: 26% (72/270).
Awarded all top ACM Reproducibility Badges offered by the Artifact Evaluation Committee.

LAMP: A Practical Loop-Aware Memory Dependence Profiler [abstract]
Yebin Chon, Ziyang Xu, Zujun Tan, and David I. August
The Fourth Young Architect Workshop (YArch), March 2022.

NOELLE Offers Empowering LLVM Extensions [abstract] (PDF)
Angelo Matni, Enrico Armenio Deiana, Yian Su, Lukas Gross, Souradip Ghosh, Sotiris Apostolakis, Ziyang Xu, Zujun Tan, Ishita Chaturvedi, Brian Homerding, Tommy McMichen, David I. August, and Simone Campanoni
Proceedings of the 2022 International Symposium on Code Generation and Optimization (CGO), February 2022.
Accept Rate: 27% (27/99).
Awarded ACM Artifact Available, Artifact Functional, and Result Reproduced Badges.

Improving Instruction Cache Performance For Modern Processors With Growing Workloads [abstract] (PDF)
Nayana Prasad Nagendra
Ph.D. Thesis, Department of Computer Science, Princeton University, September 2021.

NOELLE Offers Empowering LLVM Extensions [abstract] (arXiv, PDF)
Angelo Matni, Enrico Armenio Deiana, Yian Su, Lukas Gross, Souradip Ghosh, Sotiris Apostolakis, Ziyang Xu, Zujun Tan, Ishita Chaturvedi, David I. August, and Simone Campanoni
arXiv:2102.05081 [cs.PL], February 2021.

A Sensible Approach to Speculative Automatic Parallelization [abstract] (PDF)
Sotiris Apostolakis
Ph.D. Thesis, Department of Computer Science, Princeton University, January 2021.

SCAF: A Speculation-Aware Collaborative Dependence Analysis Framework (ACM DL)
Sotiris Apostolakis, Ziyang Xu, Zujun Tan, Greg Chan, Simone Campanoni, and David I. August
Available at https://liberty.princeton.edu/Projects/AutoPar/SCAF/, December 2020.

NOELLE Offers Empowering LLVM Extensions
Sotiris Apostolakis, Ziyang Xu, Zujun Tan, David I. August, and Simone Campanoni
Available at https://liberty.princeton.edu/Projects/NOELLE/, December 2020.

Perspective: A Sensible Approach to Speculative Automatic Parallelization
Sotiris Apostolakis, Ziyang Xu, Zujun Tan, Greg Chan, Simone Campanoni, and David I. August
Available at https://liberty.princeton.edu/Projects/AutoPar/Perspective/, December 2020.

SCAF: A Speculation-Aware Collaborative Dependence Analysis Framework [abstract] (ACM DL, PDF)
Sotiris Apostolakis, Ziyang Xu, Zujun Tan, Greg Chan, Simone Campanoni, and David I. August
Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2020.
Accept Rate: 22% (77/341).
Awarded all top ACM Reproducibility Badges offered by the Artifact Evaluation Committee.

Automatic and Speculative Parallel-Stage Decoupled Software Pipelining [abstract]
Zujun Tan, Greg Chan, Ziyang Xu, Sotiris Apostolakis, and David I. August
The Second Young Architect Workshop (YArch), March 2020.

Perspective: A Sensible Approach to Speculative Automatic Parallelization [abstract] (ACM DL, PDF)
Sotiris Apostolakis, Ziyang Xu, Greg Chan, Simone Campanoni, and David I. August
Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2020.
Accept Rate: 18% (86/476).
Awarded all top ACM Reproducibility Badges offered by the Artifact Evaluation Committee.

Hardware MultiThreaded Transactions: Enabling Speculative MultiThreaded Pipeline Parallelization For Complex Programs [abstract] (PDF)
Jordan Fix
Ph.D. Thesis, Department of Computer Science, Princeton University, 2020.

Collaborative Parallelization Framework [abstract]
Ziyang Xu, Greg Chan, Sotiris Apostolakis, and David I. August
The First Young Architect Workshop (YArch), February 2019.

MemoDyn: Exploiting Weakly Consistent Data Structures for Dynamic Parallel Memoization [abstract] (ACM DL, PDF)
Prakash Prabhu, Stephen R. Beard, Sotiris Apostolakis, Ayal Zaks, and David I. August
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques (PACT), November 2018.
Accept Rate: 28% (36/126).

Hardware MultiThreaded Transactions [abstract] (ACM DL, PDF)
Jordan Fix, Nayana P. Nagendra, Sotiris Apostolakis, Hansen Zhang, Sophie Qiu, and David I. August
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2018.
Accept Rate: 17% (56/319).

A Collaborative Dependence Analysis Framework [abstract] (ACM DL, PDF)
Nick P. Johnson, Jordan Fix, Taewook Oh, Stephen R. Beard, Thomas B. Jablin, and David I. August
Proceedings of the 2017 International Symposium on Code Generation and Optimization (CGO), February 2017.
Accept Rate: 22% (26/114).
Highest ranked paper in double-blind review process.

Static Dependence Analysis in an Infrastructure for Automatic Parallelization [abstract] (PDF)
Nick P. Johnson
Ph.D. Thesis, Department of Computer Science, Princeton University, September 2015.

Automatic Exploitation of Input Parallelism [abstract] (PDF)
Taewook Oh
Ph.D. Thesis, Department of Computer Science, Princeton University, September 2015.

Automatically Exploiting Cross-Invocation Parallelism Using Runtime Information [abstract] (PDF)
Jialu Huang
Ph.D. Thesis, Department of Computer Science, Princeton University, September 2013.

Semantic Language Extensions for Implicit Parallel Programming [abstract] (PDF)
Prakash Prabhu
Ph.D. Thesis, Department of Computer Science, Princeton University, September 2013.

ASAP: Automatic Speculative Acyclic Parallelization for Clusters [abstract] (PDF)
Hanjun Kim
Ph.D. Thesis, Department of Computer Science, Princeton University, September 2013.

Automatic Parallelization for GPUs [abstract] (PDF)
Thomas B. Jablin
Ph.D. Thesis, Department of Computer Science, Princeton University, April 2013.

Automatically Exploiting Cross-Invocation Parallelism Using Runtime Information [abstract] (PDF)
Jialu Huang, Thomas B. Jablin, Stephen R. Beard, Nick P. Johnson, and David I. August
Proceedings of the 2013 International Symposium on Code Generation and Optimization (CGO), February 2013.
Accept Rate: 28% (33/117).

Parcae: A System for Flexible Parallel Execution [abstract] (PDF)
Arun Raman, Ayal Zaks, Jae W. Lee, and David I. August
Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2012.
Accept Rate: 18% (48/255).

Speculative Separation for Privatization and Reductions [abstract] (ACM DL, PDF)
Nick P. Johnson, Hanjun Kim, Prakash Prabhu, Ayal Zaks, and David I. August
Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2012.
Accept Rate: 18% (48/255).

Dynamically Managed Data for CPU-GPU Architectures [abstract] (PDF)
Thomas B. Jablin, James A. Jablin, Prakash Prabhu, Feng Liu, and David I. August
Proceedings of the 2012 International Symposium on Code Generation and Optimization (CGO), March 2012.
Accept Rate: 28% (26/90).

Automatic Speculative DOALL for Clusters [abstract] (PDF)
Hanjun Kim, Nick P. Johnson, Jae W. Lee, Scott A. Mahlke, and David I. August
Proceedings of the 2012 International Symposium on Code Generation and Optimization (CGO), March 2012.
Accept Rate: 28% (26/90).

A System for Flexible Parallel Execution [abstract] (PDF)
Arun Raman
Ph.D. Thesis, Department of Electrical Engineering, Princeton University, December 2011.

Automatic Extraction of Parallelism from Sequential Code
David I. August, Jialu Huang, Thomas B. Jablin, Hanjun Kim, Thomas R. Mason, Prakash Prabhu, Arun Raman, and Yun Zhang
Fundamentals of Multicore Software Development (ISBN: 978-1439812730)
Edited by Ali-Reza Adl-Tabatabai, Victor Pankratius, and Walter Tichy. Chapman & Hall / CRC Press, December 2011.

A Survey of the Practice of Computational Science [abstract] (ACM DL, PDF)
Prakash Prabhu, Thomas B. Jablin, Arun Raman, Yun Zhang, Jialu Huang, Hanjun Kim, Nick P. Johnson, Feng Liu, Soumyadeep Ghosh, Stephen Beard, Taewook Oh, Matthew Zoufaly, David Walker, and David I. August
Proceedings of the 24th ACM/IEEE Conference on High Performance Computing, Networking, Storage and Analysis (SC), November 2011.

Commutative Set: A Language Extension for Implicit Parallel Programming [abstract] (ACM DL, PDF)
Prakash Prabhu, Soumyadeep Ghosh, Yun Zhang, Nick P. Johnson, and David I. August
Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2011.
Accept Rate: 23% (55/236).

Automatic CPU-GPU Communication Management and Optimization [abstract] (ACM DL, PDF)
Thomas B. Jablin, Prakash Prabhu, James A. Jablin, Nick P. Johnson, Stephen R. Beard, and David I. August
Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2011.
Accept Rate: 23% (55/236).

Parallelism Orchestration using DoPE: the Degree of Parallelism Executive [abstract] (ACM DL, PDF)
Arun Raman, Hanjun Kim, Taewook Oh, Jae W. Lee, and David I. August
Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2011.
Accept Rate: 23% (55/236).

Scalable Speculative Parallelization on Commodity Clusters [abstract] (ACM DL, PDF)
Hanjun Kim, Arun Raman, Feng Liu, Jae W. Lee, and David I. August
Proceedings of the 43rd IEEE/ACM International Symposium on Microarchitecture (MICRO), December 2010.
Accept Rate: 18% (45/248).
Highest ranked paper in double-blind review process.

Programming Multicores: Do Applications Programmers Need to Write Explicitly Parallel Programs? [abstract] (IEEE Xplore, PDF)
Arvind, David I. August, Keshav Pingali, Derek Chiou, Resit Sendag, and Joshua J. Yi
IEEE Micro, Volume 30, Number 3, May 2010.

Decoupled Software Pipelining Creates Parallelization Opportunities [abstract] (ACM DL, PDF)
Jialu Huang, Arun Raman, Yun Zhang, Thomas B. Jablin, Tzu-Han Hung, and David I. August
Proceedings of the 2010 International Symposium on Code Generation and Optimization (CGO), April 2010.
Accept Rate: 41% (29/70).

Liberty Queues for EPIC Architectures [abstract] (PDF)
Thomas B. Jablin, Yun Zhang, James A. Jablin, Jialu Huang, Hanjun Kim, and David I. August
Proceedings of the Eighth Workshop on Explicitly Parallel Instruction Computer Architectures and Compiler Technology (EPIC), April 2010.

Speculative Parallelization Using Software Multi-threaded Transactions [abstract] (ACM DL, PDF)
Arun Raman, Hanjun Kim, Thomas R. Mason, Thomas B. Jablin, and David I. August
Proceedings of the Fifteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2010.
Accept Rate: 17% (32/181).

Compilation Strategies and Challenges for Multicore Signal Processing [abstract] (IEEE Xplore, PDF)
Mojtaba Mehrara, Thomas B. Jablin, Dan Upton, David I. August, Kim Hazelwood, and Scott Mahlke
IEEE Signal Processing Magazine, November 2009.

LAMPVIEW: A Loop-Aware Toolset for Facilitating Parallelization [abstract] (PDF)
Thomas Rorie Mason
Master's Thesis, Department of Electrical Engineering, Princeton University, August 2009.

Parallelization Techniques with Improved Dependence Handling [abstract] (PDF)
Easwaran Raman
Ph.D. Thesis, Department of Computer Science, Princeton University, June 2009.

Intelligent Speculation for Pipelined Multithreading [abstract] (PDF)
Neil Amar Vachharajani
Ph.D. Thesis, Department of Computer Science, Princeton University, November 2008.

The VELOCITY Compiler: Extracting Efficient Multicore Execution from Legacy Sequential Programs [abstract] (PDF)
Matthew John Bridges
Ph.D. Thesis, Department of Computer Science, Princeton University, November 2008.

Global Instruction Scheduling for Multi-Threaded Architectures [abstract] (PDF)
Guilherme de Lima Ottoni
Ph.D. Thesis, Department of Computer Science, Princeton University, September 2008.

Performance Scalability of Decoupled Software Pipelining [abstract] (ACM DL, PDF)
Ram Rangan, Neil Vachharajani, Guilherme Ottoni, and David I. August
ACM Transactions on Architecture and Code Optimization (TACO), Volume 5, Number 2, August 2008.

Spice: Speculative Parallel Iteration Chunk Execution [abstract] (ACM DL, PDF)
Easwaran Raman, Neil Vachharajani, Ram Rangan, and David I. August
Proceedings of the 2008 International Symposium on Code Generation and Optimization (CGO), April 2008.
Accept Rate: 31% (21/66).

Parallel-Stage Decoupled Software Pipelining [abstract] (ACM DL, PDF)
Easwaran Raman, Guilherme Ottoni, Arun Raman, Matthew Bridges, and David I. August
Proceedings of the 2008 International Symposium on Code Generation and Optimization (CGO), April 2008.
Accept Rate: 31% (21/66).

Communication Optimizations for Global Multi-Threaded Instruction Scheduling [abstract] (ACM DL, PDF)
Guilherme Ottoni and David I. August
Proceedings of the 13th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2008.
Accept Rate: 24% (31/127).

Revisiting the Sequential Programming Model for the Multicore Era [abstract] (Original Full Paper, IEEE Xplore, PDF)
Matthew J. Bridges, Neil Vachharajani, Yun Zhang, Thomas B. Jablin, and David I. August
IEEE Micro, Volume 28, Number 1, January 2008.
Accept Rate: 14% (10/70).
Published in IEEE Micro's "Top Picks" special issue for papers "most relevant to industry and significant in contribution to the field of computer architecture" in 2007.

Revisiting the Sequential Programming Model for Multi-Core [abstract] (IEEE Xplore, PDF, Top Picks Version)
Matthew J. Bridges, Neil Vachharajani, Yun Zhang, Thomas B. Jablin, and David I. August
Proceedings of the 40th IEEE/ACM International Symposium on Microarchitecture (MICRO), December 2007.
Accept Rate: 21% (35/166).
Selected for IEEE Micro's "Top Picks" special issue for papers "most relevant to industry and significant in contribution to the field of computer architecture" in 2007.

Global Multi-Threaded Instruction Scheduling [abstract] (IEEE Xplore, PDF)
Guilherme Ottoni and David I. August
Proceedings of the 40th IEEE/ACM International Symposium on Microarchitecture (MICRO), December 2007.
Accept Rate: 21% (35/166).

Speculative Decoupled Software Pipelining [abstract] (IEEE CS, PDF)
Neil Vachharajani, Ram Rangan, Easwaran Raman, Matthew J. Bridges, Guilherme Ottoni, and David I. August
Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2007.
Accept Rate: 19% (34/175).

Pipelined Multithreading Transformations and Support Mechanisms [abstract] (PDF)
Ram Rangan
Ph.D. Thesis, Department of Computer Science, Princeton University, June 2007.

Global Multi-Threaded Instruction Scheduling: Technique and Initial Results [abstract] (CiteSeerX, PDF)
Guilherme Ottoni and David I. August
Proceedings of the Sixth Workshop on Explicitly Parallel Instruction Computer Architectures and Compiler Technology (EPIC), March 2007.

Eliminating Scope and Selection Restrictions in Compiler Optimizations [abstract] (PDF)
Spyridon Triantafyllis
Ph.D. Thesis, Department of Computer Science, Princeton University, September 2006.

Amortizing Software Queue Overhead for Pipelined Inter-Thread Communication [abstract] (PDF)
Ram Rangan and David I. August
Proceedings of the Workshop on Programming Models for Ubiquitous Parallelism (PMUP), September 2006.
Accept Rate: 41% (10/24).

Automatic Thread Extraction with Decoupled Software Pipelining [abstract] (IEEE Xplore, PDF)
Guilherme Ottoni, Ram Rangan, Adam Stoler, and David I. August
Proceedings of the 38th IEEE/ACM International Symposium on Microarchitecture (MICRO), November 2005.
Accept Rate: 18% (27/143).
One of five papers nominated for the Best Paper Award by the Program Committee.

A New Approach to Thread Extraction for General-Purpose Programs [abstract] (PDF)
Guilherme Ottoni, Ram Rangan, Adam Stoler, and David I. August
Proceedings of the 2nd Watson Conference on Interaction between Architecture, Circuits, and Compilers (PAC2), September 2005.
Accept Rate: 44% (21/47).

From Sequential Programs to Concurrent Threads [abstract] (IEEE Xplore, PDF)
Guilherme Ottoni, Ram Rangan, Adam Stoler, Matthew J. Bridges, and David I. August
IEEE Computer Architecture Letters (CAL), June 2005.
Accept Rate: 20%.

Decoupled Software Pipelining: A Promising Technique to Exploit Thread-Level Parallelism [abstract]
Guilherme Ottoni, Ram Rangan, Neil Vachharajani, and David I. August
Proceedings of the Fourth Workshop on Explicitly Parallel Instruction Computer Architectures and Compiler Technology (EPIC), March 2005.

Decoupled Software Pipelining with the Synchronization Array [abstract] (IEEE Xplore, PDF)
Ram Rangan, Neil Vachharajani, Manish Vachharajani, and David I. August
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2004.
Accept Rate: 18% (23/122).
Highest ranked paper in double-blind review process.