SimPoint 2.0
 


 

 

The program and scripts perform both clustering and single SimPoint analysis on already generated basic block vectors. This is the algorithm used to generate the simulation points in the PACT 2003 paper.

Download SimPoint 2.0 (May 2004)

What's Provided

Performs Off-line Phase Classification from a BBV file to guide program analysis and optimization

Selects a representative Simulation Point for each phase. Taken together, these simulation points accurately represent the complete execution of the program. They can then be used to guide targeted simulation, phase analysis and optimization.

Fixes and Enhancements over SimPoint 1.1

Prevents empty clusters from being created

The random projection is created only once for all of the values of k tried for k-means, which significantly reduces the time to create simulation points.

New robust initialization for k-means called furthest-first.

Allows specification of the size of the vector file to avoid preprocessing of the vector file.

New Early SimPoint Algorithm

Gives preference to Simulation Points that occur earlier in the program's execution but are still representative of their phase. This benefits simulation environments that rely on fast-forwarding to get to the start of each simulation point.

New Variance SimPoint Algorithm

Combines Statistical Sampling with SimPoint to provide estimated confidence and errors for a single set of simulation points for a specific architecture configuration.

The main purpose of this approach is to guide Phase Classification and the choice of Simulation Points using both the similarity of Basic Block Vectors and Statistical Sampling. The clusters created must first satisfy the architecture-independent code vectors and then the statistical confidence bounds for a given architecture, which gives the clustering algorithm additional information.
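
The furthest-first initialization listed among the fixes above can be illustrated with a short sketch. This is not SimPoint's own code: it is the textbook furthest-first traversal, and the function name furthest_first_centers, the seed parameter, and the sample vectors are all hypothetical. It assumes the first center is picked at random and that distance is Euclidean.

```python
import math
import random


def furthest_first_centers(points, k, seed=0):
    """Pick k initial centers: a random first one, then repeatedly the
    point that is furthest from its nearest already-chosen center."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    first = rng.randrange(len(points))
    chosen = [first]
    # dist_to_nearest[i] = distance from point i to its closest chosen center
    dist_to_nearest = [math.dist(p, points[first]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=dist_to_nearest.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            dist_to_nearest[i] = min(dist_to_nearest[i],
                                     math.dist(p, points[nxt]))
    return [points[i] for i in chosen]


# Hypothetical 2-D "vectors" forming three well-separated groups.
vectors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 9.0)]
centers = furthest_first_centers(vectors, k=3)
```

Because each new center is the point worst served by the centers chosen so far, the three centers in this toy input land in three different groups regardless of which point is picked first.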

Some Additional Details on Changes:

SimPoint now uses a new, more robust initialization of k-means (called furthest-first). The old-style initialization, called sampling, is still available via a command line option ("-initkm samp"). The new initialization ensures that the k-means centers are spread more thoroughly throughout the bbvector space at the beginning of clustering.

SimPoint now allows you to pre-specify the size of the bbfile (number of vectors, and maximum dimension) for faster loading of a basic block vector file. See the command line options -numBBVs and -numBBIDs (both are necessary).

The script bin/runsimpoint now pre-projects the input data for faster runs of SimPoint over multiple values of k. The projected data is placed in the user-specified output directory (already an argument to the script) and is deleted before the script finishes. This avoids re-projecting the data for every value of k on which k-means is run.

SimPoint now post-processes the clustering to prune empty clusters (or clusters smaller than some specified size). The user can specify the largest cluster size to prune via the -pruneClusterSize command line option. Clusters of this size or smaller are removed after k-means finishes, and any points in those clusters are redistributed among the remaining clusters. This also fixes a bug (somewhat harmless in SimPoint 1.1) where empty clusters could cause SimPoint to choose the same vector as the simulation point for multiple (empty) clusters.
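
The pruning step described above might look roughly like the following sketch. It is an illustration only, not SimPoint's implementation: the function name prune_small_clusters and its parameters are hypothetical, and it assumes that points from a pruned cluster are reassigned to the nearest surviving center by Euclidean distance, a detail the text above does not specify.

```python
import math
from collections import Counter


def prune_small_clusters(points, labels, centers, prune_size=0):
    """Drop clusters with prune_size or fewer members and move their
    points to the nearest surviving center (assumed reassignment rule)."""
    sizes = Counter(labels)
    keep = [c for c in range(len(centers)) if sizes.get(c, 0) > prune_size]
    if not keep:
        raise ValueError("pruning would remove every cluster")
    # Renumber surviving clusters 0..len(keep)-1.
    new_id = {old: new for new, old in enumerate(keep)}
    new_labels = []
    for p, lab in zip(points, labels):
        if lab not in new_id:
            lab = min(keep, key=lambda c: math.dist(p, centers[c]))
        new_labels.append(new_id[lab])
    return new_labels, [centers[c] for c in keep]


# Hypothetical clustering: cluster 2 is empty, cluster 3 has one point.
points = [(0.0, 0.0), (0.2, 0.0), (4.0, 4.0), (4.2, 4.0), (1.0, 1.0)]
labels = [0, 0, 1, 1, 3]
centers = [(0.1, 0.0), (4.1, 4.0), (9.0, 9.0), (1.0, 1.0)]
new_labels, new_centers = prune_small_clusters(points, labels, centers,
                                               prune_size=1)
```

With prune_size=0 only empty clusters are removed. The sketch leaves the surviving centers unchanged after reassignment; whether they should be recomputed is a design choice it does not address.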
 

Requires

Generated Basic Block Vectors (BBV) for a given program/input run
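
As a rough illustration of what such a BBV file holds, the sketch below scans one and reports the two sizes mentioned earlier: the number of vectors and the largest dimension seen. The file layout is an assumption not stated on this page: each vector is taken to be one line starting with "T", followed by ":id:count" pairs. The helper name bbv_size and the sample data are hypothetical, and how these counts map onto the exact values expected by -numBBVs and -numBBIDs is not specified here.

```python
import io
import re

# Assumed layout: "T:1:20 :5:300" = one vector with two (id, count) pairs.
PAIR = re.compile(r":(\d+):(\d+)")


def bbv_size(lines):
    """Return (number of vectors, largest basic block id seen)."""
    num_vectors = 0
    max_block_id = 0
    for line in lines:
        if not line.startswith("T"):
            continue  # skip anything that is not a vector line
        num_vectors += 1
        for block_id, _count in PAIR.findall(line):
            max_block_id = max(max_block_id, int(block_id))
    return num_vectors, max_block_id


# Hypothetical two-vector file held in memory instead of on disk.
sample = io.StringIO("T:1:20 :5:300\nT:2:7 :9:41 :5:12\n")
num_vectors, max_block_id = bbv_size(sample)
```

An open file object can be passed in place of the in-memory sample.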

 


Send mail to calder@cs.ucsd.edu with questions or comments about this web site.
Last modified: 06/22/05