Selecting Software Phase Markers with Code Structure Analysis

Jeremy Lau, Erez Perelman, and Brad Calder

In Proceedings of the International Symposium on Code Generation and Optimization (CGO 2006).

Abstract:

Most programs are repetitive, where similar behavior can be seen at different execution times. Algorithms have been proposed that automatically group similar portions of a program's execution into phases, where samples of execution in the same phase have homogeneous behavior and similar resource requirements. In this paper, we present an automated profiling approach to identify code locations whose executions correlate with phase changes. These ``software phase markers'' can be used to easily detect phase changes across different inputs to a program without hardware support.

Our approach builds a combined hierarchical procedure call and loop graph to represent a program's execution, where each edge also tracks the max, average, and standard deviation in hierarchical execution variability on paths from that edge. We search this annotated call-loop graph for instructions in the binary that accurately identify the start of unique stable behaviors across different inputs.

We show that our phase markers can be used to accurately partition execution into units of repeating homogeneous behavior by counting execution cycles and data cache hits. We also compare the use of our software markers to prior work on guiding data cache reconfiguration using data-reuse markers. Finally, we show that the phase markers can be used to partition the program's execution at code transitions to pick accurately simulation points for SimPoint. When simulation points are defined in terms of phase markers, they can potentially be re-used across inputs, compiler optimizations, and different instruction set architectures for the same source code.