In this project you will implement a dataflow analysis
framework, and then you will use your framework to implement
several analyses and optimizations.
Dataflow Analysis Framework
First, you will implement an intra-procedural dataflow analysis
To run an analysis your framework will need to take
the following as input:
- The lattice
- Flow functions
- The information to use for the beginning of the function
- A representation of the program
Your framework should compute one piece of information (an
element of the lattice) for the beginning of each basic block,
and one piece of information for the beginning of each
Your framework will run a forward optimistic iterative dataflow
analysis that computes a piece of information (an element of the
lattice) at the beginning of each instruction. In particular,
your framework should initialize all information to bottom,
except for the information at the beginning of the function,
which is initialized to the values provided by the programmer
for that purpose. Then your framework should run an worklist
algorithm to find a fixed-point solution to the provided flow
Note that in class we stored one piece of information for each
edge in the CFG. There are two conceptual differences between
what we did in class, and how LLVM is setup: (1) the CFG in LLVM
is based on basic blocks and (2) in LLVM there is no explicit
representation of edges. Because of this, the best way to store
information while doing your fixed point computation is to store
one piece of information for the beginning of each basic
block. However, your framework should not only return the
information at the beginning of each basic block, but also the
information at the beginning of each instruction.
From a Software Engineering point of view, there are many ways
to design the interface that your framework will provide to
clients. This flexibility is intended: you will have to think
carefully to create an interface that is simple, clear,
easy-to-use and very well documented. If you design your
interface well to begin with, you will be able to work on the
implementation of your framework and the clients of the
framework in parallel. To guide you in your interface design,
you will be required to present your interface and discuss it
with the instructor in a short meeting on Friday May 9th.
Clients of your Dataflow Analysis Framework
You will now implement several analyses in your framework, and
several optimizations that will uses the results to perform some
||Use of Analysis Results
||Optimization: Constant Folding, Branch Folding
||Optimization: Common Sub-expression Elminiation
||Bug finder: Warn programmer if can't show array access in bounds
|Intra-Procedural Pointer Analysis
||Catalyst: Improve all other analyses using this information
Write a ~10 page report describing your design and implementation.
This is an open-ended project, so there isn't a single format we
require, but here is a (non-exhaustive) list of guidelines:
- As in Project 1, start with an overview that discusses the high
level goals of your work, along with any interesting/key findings.
- Describe your interface design. Discuss what alternative designs
you may have also considered, and explain the tradeoffs that
ultimately led to your choice.
- For each analysis, define your lattice and flow functions in
mathematical notation (i.e., in the style used in the lecture
notes and on the midterm). Then discuss how you actually went about
implementing the lattice and flow functions. For instance, some
interesting questions are: what data structure(s) did you use, and
how do you represent potentially infinite sets? How are input facts
passed to your flow functions, and how do output facts get
- Make sure to explain assumptions you make about the code you
analyze. For instance, for pointer analysis, you may have made some
assumptions about the aliasing information known about input
parameters. Explain those assumptions and why they are reasonable.
- Part of this project is to come up with a useful set of benchmarks
on which to test and improve your analysis. Discuss why you chose
those benchmarks, and what makes them interesting. If your
implementation fails on some benchmarks (there's no shame in it!),
then explain why and how the analysis might be improved.
- As this project is significantly more exploratory than the
first, we want you to tell us what you found particularly
interesting/challenging/frustrating. What extensions to the project
did you attempt? (for instance, did you try combining the results of
analyses, or did you try your hand at interprocedural analysis?).
What aspects of LLVM made it easy/hard to implement your design? If
you could redo your project with what you know now, what changes
would you make?
Finally, submit via email a
archive that contains:
- Your project report,
MEMBERS file that lists your group members in the
same format as in Project 1.
- Your benchmark suite.
- The source code for your implementation. Do not submit the entire
LLVM source, only submit the files you have added/modified. If you
have added or modified any files outside of the
llvm/src/lib/CSE231 directory we have provided, list
their locations in a file called
If your code requires special build instructions, include a file
BUILD_INSTRUCTIONS that lists those instructions.