|




| |
The program and scripts perform both clustering and
single SimPoint analysis on already generated basic block vectors. This is
the algorithm used to generate the simulation points in the
PACT 2003 paper.
Download SimPoint 2.0 (May 2004)
What's Provided
 |
Performs Off-line Phase Classification from a BBV file to guide program
analysis and optimization |
 |
Selects for each phase a representative Simulation Point.
These simulation points taken together accurately represent the complete
execution of the program. The simulation points can then be used
to guide targeted simulation, phase analysis and optimization. |
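As a rough illustration of the selection step, the sketch below picks one interval per phase. It assumes that "representative" means the interval whose basic block vector lies closest (in Euclidean distance) to the centroid of its cluster; the function names and toy data are hypothetical and are not taken from the SimPoint sources.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def pick_simulation_points(vectors, labels):
    """Map each cluster label to the index of its member nearest the centroid.

    Assumption: 'representative' is interpreted as 'closest to the centroid'.
    """
    members = {}
    for idx, label in enumerate(labels):
        members.setdefault(label, []).append(idx)
    points = {}
    for label, idxs in members.items():
        center = centroid([vectors[i] for i in idxs])
        points[label] = min(idxs, key=lambda i: sq_dist(vectors[i], center))
    return points

# Toy data: six intervals, already classified into two phases.
bbvs = [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1],
        [0.0, 1.0], [0.1, 0.9], [0.4, 0.6]]
phases = [0, 0, 0, 1, 1, 1]
simpoints = pick_simulation_points(bbvs, phases)
print(simpoints)
```

Only the chosen intervals would then need detailed simulation, with each one standing in for its whole phase.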
 |
Fixes and Enhancements over
SimPoint 1.1
 |
Prevents empty clusters
from being created |
 |
The random projection is
created only once when running all of the different values of k
for k-means, which significantly reduces the time to create
simulation points. |
 |
New robust initialization
for k-means called furthest-first. |
 |
Allows specification of
the size of the vector file to avoid preprocessing of the vector
file. |
|
 |
New Early SimPoint Algorithm
 |
Gives preference to Simulation Points that occur earlier in the program's
execution but are still representative of their phase. This benefits
simulation environments that rely on fast-forwarding to reach
the start of each simulation point. |
|
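One simple way to express such a preference is sketched below. It is an illustration only and not necessarily how SimPoint implements the Early algorithm: among the intervals of a phase whose distance to the centroid is within a tolerance of the best distance, it returns the earliest one. The `slack` parameter and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pick_early_point(vectors, member_idxs, center, slack=0.1):
    """Return the earliest member that is nearly as central as the best one.

    Assumptions: interval indices follow execution order, and 'still
    representative' means within `slack` (a hypothetical absolute tolerance)
    of the smallest distance to the centroid.
    """
    dists = {i: sq_dist(vectors[i], center) ** 0.5 for i in member_idxs}
    best = min(dists.values())
    return min(i for i in member_idxs if dists[i] <= best + slack)

# Toy phase: interval 2 is the most central, interval 0 is almost as good.
vectors = [[1.05], [3.0], [1.0], [1.02]]
early = pick_early_point(vectors, [0, 1, 2, 3], center=[1.0], slack=0.1)
strict = pick_early_point(vectors, [0, 1, 2, 3], center=[1.0], slack=0.0)
print(early, strict)
```

A larger tolerance shortens fast-forwarding at the cost of a slightly less central simulation point.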
 |
New Variance SimPoint Algorithm
 |
Combines Statistical Sampling with SimPoint to provide estimated
confidence and errors for a single set of simulation points for a specific
architecture configuration. |
 |
The main purpose of this approach is
to let both the similarity of Basic Block Vectors and Statistical
Sampling guide Phase Classification and the choice of Simulation Points.
The clusters created must first satisfy the architecture-independent code
vectors and then the statistical confidence bounds for a given architecture,
which provide additional information to the clustering algorithm. |
|
Some Additional Details on Changes:
 |
SimPoint now uses a new, more robust
initialization of k-means (called furthest-first). The old-style
initialization, called sampling, is still available via a command line
option ("-initkm samp"). The new initialization ensures that the k-means
centers are spread more thoroughly throughout the basic block vector space
at the beginning of clustering. |
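The idea behind furthest-first initialization can be sketched as follows. This is a minimal version of the generic furthest-first traversal, assuming the first center is picked at random and each later center is the vector farthest from all centers chosen so far; SimPoint's own implementation may differ in its details, and the names here are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def furthest_first_centers(vectors, k, seed=0):
    """Choose k initial centers that are spread out over the data."""
    rng = random.Random(seed)
    centers = [vectors[rng.randrange(len(vectors))]]  # assumed random start
    # nearest[i] = squared distance from vectors[i] to its closest center
    nearest = [sq_dist(v, centers[0]) for v in vectors]
    while len(centers) < k:
        far_idx = max(range(len(vectors)), key=lambda i: nearest[i])
        centers.append(vectors[far_idx])
        nearest = [min(d, sq_dist(v, centers[-1]))
                   for d, v in zip(nearest, vectors)]
    return centers

# Three tight groups of vectors; k = 3 should yield one center per group.
data = [[0.0, 0.0], [0.1, 0.0], [10.0, 0.0],
        [10.1, 0.0], [0.0, 10.0], [0.0, 10.1]]
print(furthest_first_centers(data, 3))
```

By contrast, picking all k starting centers by random sampling can place several of them in the same dense region.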
 |
SimPoint now allows you to pre-specify
the size of the bbfile (number of vectors, and maximum dimension) for faster
loading of a basic block vector file. See the command line options -numBBVs
and -numBBIDs (both are necessary). |
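To see why supplying the sizes up front can save time, consider the kind of pre-pass a loader would otherwise need in order to size its data structures. The sketch below assumes a text format in which each vector is one line starting with `T`, followed by `:id:count` pairs; both the format details and the function name are assumptions made for illustration.

```python
def scan_bbv_sizes(lines):
    """Pre-pass: count the vectors and find the largest basic block ID.

    Assumed line format (illustrative): 'T:12:340 :7:15'
    """
    num_vectors, max_id = 0, 0
    for line in lines:
        if not line.startswith("T"):
            continue  # skip anything that is not a vector line
        num_vectors += 1
        for pair in line[1:].split():
            bb_id = int(pair.split(":")[1])  # ':12:340' -> 12
            max_id = max(max_id, bb_id)
    return num_vectors, max_id

sample = ["T:12:340 :7:15", "# not a vector line", "T:3:9"]
sizes = scan_bbv_sizes(sample)
print(sizes)
```

When the number of vectors and the maximum dimension are already known, a full extra read of a large file like this becomes unnecessary.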
 |
The script bin/runsimpoint now
pre-projects the input data for faster runs of SimPoint over multiple values
of k. The projected data is placed in the user-specified output directory
(already an argument to the script) and is deleted before the script
finishes. This avoids re-projecting the data for every value of k for
which k-means is run. |
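The saving comes from hoisting the projection out of the loop over k, as the sketch below shows. It assumes a dense random projection matrix with entries drawn uniformly from [-1, 1] and takes the clustering routine as a caller-supplied function; all names here, including `cluster_fn`, are hypothetical stand-ins rather than SimPoint's API.

```python
import random

def make_projection(in_dim, out_dim, seed=0):
    """Random in_dim x out_dim matrix (assumed uniform entries in [-1, 1])."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(out_dim)]
            for _ in range(in_dim)]

def project(vectors, matrix):
    """Multiply each vector by the projection matrix."""
    out_dim = len(matrix[0])
    return [[sum(v[i] * matrix[i][j] for i in range(len(v)))
             for j in range(out_dim)]
            for v in vectors]

def run_for_all_k(vectors, k_values, out_dim, cluster_fn):
    """Project once, then reuse the projected data for every k."""
    matrix = make_projection(len(vectors[0]), out_dim)
    projected = project(vectors, matrix)  # done a single time
    return {k: cluster_fn(projected, k) for k in k_values}

# Stand-in clustering function that only reports what it was given.
report = run_for_all_k([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], [2, 3, 4], 2,
                       lambda data, k: (k, len(data), len(data[0])))
print(report)
```

Because basic block vectors can have very many dimensions, the projection is a natural step to perform once and share across all runs.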
 |
SimPoint now post-processes
the clustering to prune empty clusters (or clusters smaller than some
specified size). The user can specify the largest cluster size to prune via
the -pruneClusterSize command line option. Clusters of this size or smaller
are removed after k-means finishes, and any points in those clusters
are redistributed among the remaining clusters. This also fixes a bug
(largely harmless in SimPoint 1.1) where empty clusters could cause SimPoint
to choose the same vector as the SimPoint for multiple (empty) clusters.
|
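The pruning step can be sketched as below. The text does not say how points are redistributed, so this version assumes each orphaned point goes to the nearest remaining center and that the surviving clusters are renumbered consecutively; the function name and the `prune_size` parameter are illustrative, not SimPoint's code.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def prune_small_clusters(vectors, labels, centers, prune_size=0):
    """Drop clusters with at most prune_size members and reassign their points.

    Assumption: orphaned points move to the nearest surviving center.
    Returns (new_labels, surviving_centers), with labels renumbered from 0.
    """
    counts = [0] * len(centers)
    for label in labels:
        counts[label] += 1
    kept = [c for c in range(len(centers)) if counts[c] > prune_size]
    if not kept:
        raise ValueError("pruning would remove every cluster")
    new_labels = []
    for v, label in zip(vectors, labels):
        if counts[label] > prune_size:
            target = label
        else:
            target = min(kept, key=lambda c: sq_dist(v, centers[c]))
        new_labels.append(kept.index(target))
    return new_labels, [centers[c] for c in kept]

# Cluster 1 has a single member and cluster 3 is empty.
vecs = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [10.0, 10.0], [10.1, 10.0]]
labs = [0, 0, 1, 2, 2]
ctrs = [[0.05, 0.0], [5.0, 5.0], [10.05, 10.0], [50.0, 50.0]]
pruned_labels, pruned_centers = prune_small_clusters(vecs, labs, ctrs, prune_size=1)
print(pruned_labels, pruned_centers)
```

With `prune_size=0` only the empty cluster is dropped, which corresponds to the empty-cluster case described above.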
Requires
 |
Generated Basic Block Vectors (BBV) for a given program/input run |
| |
|