The project spec’s grading rubric has been updated.
Some notes on implementing part 2:
When a client sends a command to the leader, the leader logs that command in its local log, then issues a two-phase commit operation to its followers. Once a majority of those followers approve the update, the leader can commit the transaction locally and then respond to the client. After responding to the client, the leader needs to tell the followers that the transaction was committed. It is fine to call into them immediately with the updated commit index.
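The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the starter code: the `Follower` and `Leader` classes, their method names, and the in-process calls (standing in for RPCs) are all hypothetical. It also assumes that "majority" is counted over the whole cluster with the leader voting for itself, which is one possible reading of the description.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    """Hypothetical in-process stand-in for a follower reached over RPC."""
    crashed: bool = False
    log: list = field(default_factory=list)
    commit_index: int = -1

    def append_entry(self, index, command):
        """Phase 1: approve an entry. A crashed or lagging follower refuses."""
        if self.crashed or index != len(self.log):
            return False
        self.log.append(command)
        return True

    def commit(self, commit_index):
        """Phase 2: learn the leader's commit index."""
        if self.crashed:
            return False
        self.commit_index = min(commit_index, len(self.log) - 1)
        return True


class Leader:
    def __init__(self, followers):
        self.followers = followers
        self.log = []
        self.commit_index = -1

    def handle_command(self, command):
        """Return True if the command was committed and the client can be told so."""
        # 1. Log the command locally.
        self.log.append(command)
        index = len(self.log) - 1
        # 2. Phase 1: ask every follower to approve the entry.
        approvals = 1 + sum(f.append_entry(index, command) for f in self.followers)
        # 3. Assumption: a strict majority of the whole cluster is required.
        cluster_size = len(self.followers) + 1
        if 2 * approvals <= cluster_size:
            return False
        # 4. Commit locally; the caller can now respond to the client.
        self.commit_index = index
        return True

    def notify_commit(self):
        """Phase 2: after responding to the client, push the new commit index."""
        for f in self.followers:
            f.commit(self.commit_index)
```

In this sketch the caller would invoke `handle_command()`, reply to the client, and then call `notify_commit()`.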
Now, what happens if a follower is in a crashed state? The leader should attempt to bring it up to date every 500ms, meaning that every half second the leader should call into the follower with updated information.
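One way to picture the 500ms retry is a background loop on the leader. The sketch below is a simplification under assumptions: the `Follower` class, `sync_once`, and `run_sync_loop` are hypothetical names, the call into the follower is a plain method call rather than an RPC, and the leader resends its whole log each time (a real implementation would more likely send only the missing entries).

```python
import threading


class Follower:
    """Hypothetical stand-in for a follower; `crashed` simulates the crashed state."""

    def __init__(self):
        self.crashed = False
        self.log = []
        self.commit_index = -1

    def sync(self, entries, commit_index):
        """Accept the leader's state unless crashed; return whether it was applied."""
        if self.crashed:
            return False
        self.log = list(entries)
        self.commit_index = commit_index
        return True


def sync_once(leader_log, commit_index, followers):
    """Push the leader's state to every follower; return how many accepted it."""
    return sum(f.sync(leader_log, commit_index) for f in followers)


def run_sync_loop(get_state, followers, stop_event, interval=0.5):
    """Call sync_once every `interval` seconds until stop_event is set.

    get_state is a callable returning (log, commit_index) for the leader.
    """
    # Event.wait returns False on timeout, so the body runs once per interval.
    while not stop_event.wait(interval):
        log, commit_index = get_state()
        sync_once(log, commit_index, followers)
```

A leader could run `run_sync_loop` in a daemon `threading.Thread` and set the event on shutdown. A follower that was crashed picks up the current state on the first tick after it is uncrashed.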
Some hints on testing part 2
To test part 2, we are (in part) going to do the following:
- Start up your servers
- Update a number of files
- We will then “crash” one or more of the followers (but never more than half)
- We’ll then continue to update files
- Your service should continue to work while the followers are crashed, so that as far as the client is concerned, nothing appears to have failed
- During the time that one or more of your followers is crashed, we’ll call into its ReadFile() API call to ensure that its state is not being updated. In other words, we’ll ensure that it is falling behind the rest of the system
- We’ll then “uncrash” the follower(s) and wait a short time (e.g., 5 seconds). Then we’ll check to make sure that those followers have “caught up” to the rest of the system and have the updated information
- This may happen multiple times.
I hope that this testing strategy can help you exercise your code.
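If you want to rehearse this sequence before wiring up real RPCs, a toy in-memory model may help. Everything here is hypothetical and much simpler than the real services: `Server`, `Cluster`, and `run_scenario` are made-up names, a file is reduced to a version number, and `catch_up()` stands in for the leader's periodic sync. It does not model the majority check.

```python
class Server:
    """Hypothetical in-memory stand-in for one metadata server."""

    def __init__(self):
        self.crashed = False
        self.files = {}  # filename -> version number

    def read_file(self, name):
        """Return the version this server knows about (0 if unknown)."""
        return self.files.get(name, 0)


class Cluster:
    def __init__(self, num_followers):
        self.leader = Server()
        self.followers = [Server() for _ in range(num_followers)]

    def update_file(self, name):
        """Bump the version on the leader and on every follower that is not crashed."""
        version = self.leader.files.get(name, 0) + 1
        self.leader.files[name] = version
        for f in self.followers:
            if not f.crashed:
                f.files[name] = version
        return version

    def catch_up(self):
        """Stand-in for the periodic sync: copy leader state to live followers."""
        for f in self.followers:
            if not f.crashed:
                f.files = dict(self.leader.files)


def run_scenario(cluster):
    """Walk through the steps above; return what the follower saw before and after."""
    cluster.update_file("a.txt")      # update a file while everyone is up
    victim = cluster.followers[0]
    victim.crashed = True             # "crash" one follower
    cluster.update_file("a.txt")      # keep updating files
    stale = victim.read_file("a.txt")  # should still be the old version
    victim.crashed = False            # "uncrash" it
    cluster.catch_up()                # the real system would do this within the wait
    fresh = victim.read_file("a.txt")  # should now match the leader
    return stale, fresh
```

With two followers, `run_scenario(Cluster(2))` returns the stale and caught-up versions seen by the crashed follower, which you can compare against the leader's view.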
Storing replicated logs
You do not need to write out the logs to disk; it is fine to keep them in memory. This is true for the followers and for the leader.
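As a rough idea of what an in-memory log could look like, here is a minimal sketch. The `LogEntry` and `InMemoryLog` names and fields are assumptions made for illustration; the project does not prescribe this layout.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogEntry:
    index: int
    command: str


@dataclass
class InMemoryLog:
    """Hypothetical log kept entirely in memory; nothing is written to disk."""
    entries: list = field(default_factory=list)
    commit_index: int = -1  # -1 means nothing has been committed yet

    def append(self, command):
        """Add a new, not yet committed entry and return it."""
        entry = LogEntry(len(self.entries), command)
        self.entries.append(entry)
        return entry

    def commit_through(self, index):
        """Mark every entry up to and including `index` as committed."""
        if index >= len(self.entries):
            raise IndexError("cannot commit an entry that was never appended")
        self.commit_index = max(self.commit_index, index)

    def committed(self):
        """Return the committed prefix of the log."""
        return self.entries[: self.commit_index + 1]
```

The same structure could serve both the leader and the followers, since both only need the entries and the current commit index.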
I’ve put together a four-part tutorial on how to get started with SurfStore, including how to implement the BlockStore service. It is located at this link
The link to accept the GitHub Classroom repo is here: https://classroom.github.com/g/KNN3awwz
The starter code (for Java) is now available here: https://github.com/gmporter/cse124-project2. We’ll be aiming to post a Python version soon.
Submission instructions will come a bit later.
You are going to implement a BlockStore service, one or more Metadata services, and the Client. We are going to have our own version of each of these services that we’ll use to test your project. For this reason, make sure to stick to the .proto file we provide, so that our services can inter-operate with yours. You may extend these RPC calls and the messages (and in fact, you will need to for part 2), but please don’t change the interfaces provided.
Please start early!
Project 2 is due on Friday Dec 8 at 5pm.