Completed Technical Report with Measurements
Due Date: May 10th, 11:59PM
For the third checkpoint, you’ll test the performance of your full Raft implementation in a variety of ways, and then write up a technical report containing graphs of your measurements and explanations of your observations. How you gather and compose your findings is up to you, though you should make sure that you can reproduce any conclusions described in your writeup. You should end up with a writeup that is about 6 pages.
As the technical report is the primary method we’re using to evaluate your implementation, please be sure to clearly describe any implementation choices, technical limitations, or other necessary information that impacts your system’s performance. We do not expect you to have an absolutely perfect implementation, but we do expect that you can accurately describe your system and justify the outcomes of your experiments.
The technical report isn’t about justifying the design of your version of Raft. We’re looking for a report that applies the understanding you have of your design to a set of experiments in order to create a document describing how and why your system performs the way it does. It’s very likely that you will notice trends in your experiments that lead you to new ideas on how you could improve or change your system. We encourage you to write about these findings in the report. But because you only have a week for the report, we’d rather you spend more time testing and describing your system than iterating on new code that could improve your results.
The first part of your report should cover some basic information about your specific implementation and any qualifying design decisions you made that could affect the results of your experiments. We’ve provided some basic questions that you should answer below. Think not only about the answers to these questions, but also about answers to other questions that we or someone else reading your report might ask.
- What language did you use for your implementation?
- What AWS VMs did you test on? How many processes/servers were running on each VM?
- Did you run your client tests on an AWS VM, or locally?
- How many concurrent operations can your system support?
- How many server processes are running concurrently?
- What modifications did you make to Raft, if any? Describe any specific implementation details that could affect your system’s performance.
- What are the “default” configuration values for your experiments? This includes things like key/value sizes, RPC timeouts, etc.
- How do you define/run your test cases?
- How do you measure performance and gather results for each experiment?
- How rapidly can/did you change the connection matrix in your servers?
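When answering the questions above about how you measure performance, it may help to show the actual harness you used. As one illustration, here is a minimal Python sketch of a timing loop; `run_trial` and the stand-in operation are hypothetical names of our own, not something we provide, and you would substitute a real client call against your Raft cluster:

```python
import time
import statistics

def run_trial(client_op, n_ops):
    """Run n_ops operations sequentially, recording per-operation latency."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n_ops):
        t0 = time.perf_counter()
        client_op()  # e.g. a Put/Get issued through your Raft client
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    xs = sorted(latencies)
    return {
        "throughput_ops_per_sec": n_ops / elapsed,
        "mean_latency_s": statistics.mean(latencies),
        "p99_latency_s": xs[int(0.99 * (len(xs) - 1))],
    }

# Stand-in operation for illustration only; replace with a real request.
stats = run_trial(lambda: time.sleep(0.001), n_ops=100)
```

Whatever form your harness takes, describing it concretely in the report makes it much easier for the reader to interpret (and for you to reproduce) your numbers.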
The next, and most significant, part of your report should walk through each experiment you performed, present the results of that experiment, and describe any conclusions you draw from those results.
When performing and writing about an experiment, make sure that you think about some of the questions listed above. If an experiment requires changing more than one variable, make sure you clearly define what is being changed, and why. Depending on your implementation, you may want to consider answering some of the questions listed above before each experiment in order to provide understanding to the reader and to support your conclusions.
Please make sure that you’re gathering results in a way that can be reproduced. You should be fully prepared for us to ask you to replicate a result if we find it strange/interesting. It is common for the result of an experiment to beget additional investigation or inquiry, so you should be capable of replicating data or slightly changing a test in order to supply further information.
When presenting results for a test, you should almost always have at least one independent variable presented along the x-axis of a graph, and the dependent result along the y-axis. Please avoid presenting just a single data point; it provides very little information or insight into your system.
We don’t care how you create graphs of your data, but matplotlib and MATLAB are good options that we’d recommend. If you need help graphing your data, please let us know; we don’t want you to spend large amounts of time figuring out how to graph your data, so we just want each group to have a solution that works for them.
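If you go the matplotlib route, a basic line plot is only a few lines of code. The values below are placeholders we made up purely to show the shape of the script; substitute your own measurements:

```python
import matplotlib
matplotlib.use("Agg")  # render straight to a file; no display required
import matplotlib.pyplot as plt

# Placeholder data -- replace with your own measurements.
num_servers = [3, 5, 7, 9]
throughput = [850, 720, 610, 540]  # ops/sec (hypothetical)

plt.figure(figsize=(6, 4))
plt.plot(num_servers, throughput, marker="o")
plt.xlabel("Number of servers")
plt.ylabel("Throughput (ops/sec)")
plt.title("Throughput vs. cluster size")
plt.grid(True)
plt.savefig("throughput_vs_servers.png", dpi=150)
```

Labeling both axes (with units) and titling each figure goes a long way toward making your graphs readable without hunting through the surrounding text.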
Here’s a few tests that we expect every group to perform. This is not an exhaustive list, however; you should look at the Raft paper and determine some additional tests to run on top of these, and think of some tests of your own that you think would be interesting.
- How many client operations can you perform per second?
- What is the latency of each operation?
- How does the number of running servers affect this?
- How do different key/value sizes affect this?
- How do node failures/disconnections affect this?
- How long does it take to reconnect after cutting off one node with the chaos monkey?
- How long does it take to reconnect after partitioning the network with the chaos monkey?
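Several of the tests above call for reporting latency, and a single mean rarely tells the whole story; percentiles expose tail behavior caused by elections or retries. As one possible approach, here is a small Python sketch that aggregates raw per-operation latencies into report-ready statistics; `summarize_latencies` and the sample values are our own illustration:

```python
import statistics

def summarize_latencies(latencies_ms):
    """Aggregate raw per-operation latencies (ms) into summary statistics."""
    xs = sorted(latencies_ms)

    def pct(p):
        # Nearest-rank style percentile, clamped to the last element.
        return xs[min(len(xs) - 1, int(p / 100 * len(xs)))]

    return {
        "count": len(xs),
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "p95": pct(95),
        "p99": pct(99),
        "max": xs[-1],
    }

# Made-up sample latencies (ms); the large outliers mimic operations
# that straddled a leader election.
summary = summarize_latencies([12.0, 9.5, 11.2, 48.7, 10.1, 9.9, 10.4, 250.3])
```

Reporting median alongside p95/p99 (rather than the mean alone) makes it much easier to argue about what your system does during failures versus steady state.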
The most important aspect of the technical report is drawing conclusions about your system based on your observations of the results. This is where it becomes important to combine your understanding of your system with a hypothesis for each test in order to derive interesting commentary on why your system behaves the way it does.
You may notice that design decisions you made in your system cause it to perform suboptimally in certain scenarios. This is okay! The point of the report is to describe how and why your system performs in these cases. Just convince the reader (us!) that this is how your system actually performs.