Welcome to Lab 1. The goal of this lab is to implement a key-value storage service that can be called via Remote Procedure Call (RPC). In particular you need to:
- Implement a server that wraps a trib.Storage interface object and takes HTTP RPC requests from the network.
- Implement a client that satisfies the trib.Storage interface and relays all of its requests back to the server.
More specifically, you need to implement two entry functions that are defined in the triblab/lab1.go file: ServeBack() and NewClient(). Presently, they are both implemented with panic("todo").
As mentioned earlier, we will send out an invitation link on Piazza/Canvas. Once available, access it and follow the instructions provided in the repository to get started.
Distributed systems are by nature distributed, and to learn how to construct them you will write code that runs on more than just your local machine. We will be using Amazon Web Services (AWS) as a platform to build our distributed systems, both because it allows us to deploy globally across over 50 data centers and because it is emblematic of most datacenters today.
To build distributed systems on AWS we first need to obtain compute resources from it. We will be running our code in virtual machines (VMs). To get a virtual machine on AWS, follow the short tutorial supplied in your lab 1 starter code, ec2-setup.md. This tutorial will have you make an account on AWS using your UCSD credentials.
Virtual machines cost real money to run, and their pricing is determined by how many resources are given to the virtual machine. You are not going to need high-powered computation for these assignments, so stick to small virtual machines like t2.micro, which only costs a few cents a day to run. Stopping a VM on the AWS web console will stop the VM from charging your account money. Every student will get $50 worth of AWS credits, so if you launch a lot of VMs for testing, make sure to turn them off when you are done.
Once you have a virtual machine provisioned, try to ssh into it from your own machine. Use the public IPv4 DNS address to ssh using the hostname you configured: hostname@public-dns. We recommend adding entries to your .ssh/config file so that you can use shorthand to run commands like ssh and scp quickly once you are properly configured. Note that your .pem private key will need to be specified on the command line to authenticate yourself when logging in.
Setting up your VM is the same as setting up any other Linux environment. You will want to get your lab1 repository first. This will require you to install git (sudo apt install git). Run git clone as you would anywhere else to clone the repository. We've included an install script with the lab1 starter code which will install Golang and other necessary utilities in your VM.
Working with code on remote machines can be really tricky if you've never done it before. There are a variety of ways to do this. One nice way is to edit files using VS Code's remote editing features. If you configure your .ssh/config correctly, you should be able to ssh to a remote machine in the editor and make changes to remote files as if they were local.
When deploying software to run on many virtual machines take care to ensure
that each is running the same version of your code.
The goal of Lab 1 is to wrap a key-value pair interface with RPC. You don't need to implement the key-value pair storage by yourself, but you need to use it extensively in later labs, so it will be good for you to understand the service semantics here.
The data structure and interfaces for the key-value pair service are defined in the trib/kv.go file (in the trib directory). The main interface is trib.Storage, which consists of three logical parts.
First is the key-string pair part, which is its own interface.
// Key-value pair interfaces
// Default value for all keys is empty string
type KeyString interface {
	// Gets a value. Empty string by default.
	Get(key string, value *string) error

	// Set kv.Key to kv.Value. Set succ to true when no error.
	Set(kv *KeyValue, succ *bool) error

	// List all the keys of non-empty pairs where the key matches the given
	// pattern.
	Keys(p *Pattern, list *List) error
}
Pattern is a (prefix, suffix) tuple. It has a Match(string) function that returns true when the string has both the prefix and the suffix of the pattern.
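As a rough illustration of the matching rule, the sketch below declares a local stand-in for the pattern type. The field names Prefix and Suffix are assumptions made for this example; the real definition lives in trib/kv.go and should be treated as authoritative.

```go
package main

import (
	"fmt"
	"strings"
)

// Pattern is a local stand-in for trib.Pattern. The field names are
// assumptions made for this sketch.
type Pattern struct {
	Prefix string
	Suffix string
}

// Match reports whether k starts with p.Prefix and ends with p.Suffix.
func (p *Pattern) Match(k string) bool {
	return strings.HasPrefix(k, p.Prefix) && strings.HasSuffix(k, p.Suffix)
}

func main() {
	p := &Pattern{Prefix: "fo"}
	for _, k := range []string{"foo", "fob", "bar"} {
		fmt.Println(k, p.Match(k))
	}
}
```

Under this sketch, a pattern with an empty prefix and an empty suffix matches every key, since every string trivially starts and ends with the empty string.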
The second part is the key-list pair interface that handles list-valued key-value pairs.
// Key-list interfaces.
// Default value for all lists is an empty list.
// After the call, list.L should never be nil.
type KeyList interface {
	// Get the list associated with 'key'.
	ListGet(key string, list *List) error

	// Append a string to the list. Set succ to true when no error.
	ListAppend(kv *KeyValue, succ *bool) error

	// Removes all elements that are equal to kv.Value in the list kv.Key.
	// n is set to the number of elements removed.
	ListRemove(kv *KeyValue, n *int) error

	// List all the keys of non-empty lists, where the key matches
	// the given pattern.
	ListKeys(p *Pattern, list *List) error
}
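To make the list semantics concrete, here is a minimal in-memory sketch of three of these methods. The memLists type is hypothetical and not thread-safe, and KeyValue and List are local stand-ins for the types in trib/kv.go (their fields follow the kv.Key, kv.Value, and list.L names used in the comments above). You do not need to write a store like this for the lab; it only illustrates the behavior you should expect from the provided one.

```go
package main

import "fmt"

// KeyValue and List are local stand-ins for the types in trib/kv.go.
type KeyValue struct{ Key, Value string }
type List struct{ L []string }

// memLists is a hypothetical, non-thread-safe in-memory key-list store.
type memLists struct{ lists map[string][]string }

func newMemLists() *memLists {
	return &memLists{lists: make(map[string][]string)}
}

// ListGet copies the stored list into a fresh slice, so list.L is
// never nil, even for a key that was never written.
func (m *memLists) ListGet(key string, list *List) error {
	list.L = append([]string{}, m.lists[key]...)
	return nil
}

// ListAppend adds kv.Value to the end of the list stored at kv.Key.
func (m *memLists) ListAppend(kv *KeyValue, succ *bool) error {
	m.lists[kv.Key] = append(m.lists[kv.Key], kv.Value)
	*succ = true
	return nil
}

// ListRemove drops every element equal to kv.Value and reports how
// many were removed through n.
func (m *memLists) ListRemove(kv *KeyValue, n *int) error {
	kept := []string{}
	removed := 0
	for _, v := range m.lists[kv.Key] {
		if v == kv.Value {
			removed++
		} else {
			kept = append(kept, v)
		}
	}
	if len(kept) == 0 {
		delete(m.lists, kv.Key) // an empty list is the default value
	} else {
		m.lists[kv.Key] = kept
	}
	*n = removed
	return nil
}

func main() {
	m := newMemLists()
	var ok bool
	for _, v := range []string{"a", "b", "a"} {
		m.ListAppend(&KeyValue{Key: "foo", Value: v}, &ok)
	}
	var n int
	m.ListRemove(&KeyValue{Key: "foo", Value: "a"}, &n)
	var got List
	m.ListGet("foo", &got)
	fmt.Println(n, got.L)
}
```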
The Storage interface glues these two interfaces together, and also includes an auto-incrementing clock feature:
type Storage interface {
	// Returns the value of an auto-incrementing clock. The return value will be
	// no smaller than atLeast, and it will be strictly larger than the value
	// returned last time the function was called, unless it was math.MaxUint64.
	Clock(atLeast uint64, ret *uint64) error

	KeyString
	KeyList
}
Note that the function signatures of these methods are already RPC-friendly. You should implement the RPC interface with Go's rpc package. By doing this, another person's client that speaks the same protocol will be able to talk to your server as well.
Because of how the simple key-value store works, all the methods will always return a nil error when executed locally. Thus all errors you see from this interface will be communication errors. You can assume that each call (on the same key) is an atomic transaction; two concurrent writes won't give the key a weird value that came from nowhere. However, when an error occurs, the caller won't know if the transaction committed or not, because the error might have occurred before or after the transaction executed on the server.
These are the two entry functions you need to implement for this Lab. This is how other people's code (and your own code in later labs) will use your code.
func ServeBack(b *trib.Back) error
This function creates an instance of a back-end server based on configuration b *trib.Back. Structure trib.Back is defined in the trib/config.go file. The struct has several fields:
- Addr is the address the server should listen on, in the form of <host>:<port>. Go uses this address in its net package, so you should be able to use it directly when opening connections.
- Store is the storage device you will use for storing data. You should not store persistent data anywhere else. Store will never be nil.
- Ready is a channel for notifying the other parts of the program that the server is ready to accept RPC calls from the network (indicated by the server sending the value true) or that the setup failed (indicated by sending false). Ready might be nil, which means the caller does not care about when the server is ready.
This function should be a blocking call. It does not return until it experiences an error (like the network shutting down).
Note that you don't need to (and should not) implement the key-value pair storage service yourself. You only need to wrap the given Store with RPC, so that a remote client can access it via the network.
func NewClient(addr string) trib.Storage
This function takes addr in the form of <host>:<port>, and connects to this address for an HTTP RPC server. It returns an implementation of trib.Storage, which will provide the interface and forward all calls as RPCs to the server. You can assume that addr will always be a valid TCP address.
Note that when NewClient() is called, the server may not have started yet. While it is okay to try to connect to the server at this time, you should not report any error if your attempt fails. It might be best to wait to establish the connection until you need it to perform your first RPC function call.
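One possible shape for deferring the connection is sketched below. The names lazyClient, newLazyClient, and call are hypothetical, and only Get is shown; this is an illustration of the idea under those assumptions, not the required design. The constructor only records the address, so it cannot fail when the server is down, and any dial error surfaces from the first RPC instead.

```go
package main

import (
	"fmt"
	"net/rpc"
	"sync"
)

// lazyClient is a hypothetical client that stores the server address
// and dials only when the first call is made.
type lazyClient struct {
	addr string
	mu   sync.Mutex
	conn *rpc.Client // nil until the first successful dial
}

// newLazyClient never touches the network, so it cannot fail even if
// the server has not started yet.
func newLazyClient(addr string) *lazyClient {
	return &lazyClient{addr: addr}
}

// call dials on first use and reuses the connection afterwards.
func (c *lazyClient) call(method string, args, reply interface{}) error {
	c.mu.Lock()
	if c.conn == nil {
		conn, err := rpc.DialHTTP("tcp", c.addr)
		if err != nil {
			c.mu.Unlock()
			return err
		}
		c.conn = conn
	}
	conn := c.conn
	c.mu.Unlock()

	err := conn.Call(method, args, reply)
	if err == rpc.ErrShutdown {
		// The cached connection is dead; forget it so that the next
		// call dials again.
		c.mu.Lock()
		if c.conn == conn {
			c.conn = nil
		}
		c.mu.Unlock()
	}
	return err
}

// Get forwards to the remote object registered under the name Storage.
func (c *lazyClient) Get(key string, value *string) error {
	return c.call("Storage.Get", key, value)
}

func main() {
	// Creating the client succeeds even though nothing is expected to
	// be listening on this address.
	c := newLazyClient("127.0.0.1:1")
	var v string
	err := c.Get("hello", &v)
	fmt.Println("first call failed:", err != nil)
}
```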
Go comes with its own net/rpc package in the standard library, and you will use that to complete this assignment. Note that the trib.Storage interface is already in "RPC friendly" form.
Your RPC needs to use one of the encodings, listen on the given address, and serve as an HTTP RPC server. The server needs to register the back-end key-value pair object under the name Storage.
Both the trib and triblab directories come with a makefile with some handy command line shorthands, and also some basic testing code.
Under the trib directory, if you type make test, you should see the tests run and all of them pass.
Under the triblab directory, if you type make test-lab1, you will see the tests fail with a "todo panic" if you have not completed Lab 1 yet.
For Lab 1, we have also provided a comprehensive test suite under the tribgrd directory, which you can run with make run under that directory. If you pass these tests, you will get full credit for Lab 1 (assuming you're not cheating somehow).
However, this will not be the case for later labs, and the tests that come with the repository are fairly basic. Though you're not required to, you should consider writing more test cases to make sure your implementation matches the specification.
For more information on writing test cases in Go, please read the testing package documentation.
While you are free to do the project in your own way as long as it fits the specification, matches the interfaces, and passes the tests, here are some suggested first steps.
First, create a client.go file under the triblab directory, and declare a new struct called client:
package triblab

type client struct {
	// your private fields will go here
}
Then add method functions to this new client type so that it matches the trib.Storage interface. For example, for the Get() function:
func (self *client) Get(key string, value *string) error {
	panic("todo")
}
After you've added all of the functions, you can add a line to force the compiler to check if all of the functions in the interface have been implemented:
var _ trib.Storage = new(client)
This creates a zero-filled client and assigns it to an anonymous variable of type trib.Storage. Your code will thus only compile when your client satisfies the interface. (Since this zero-filled variable is anonymous and nobody can access it, it will be removed as dead code by the compiler's optimizer and hence has no negative effect on the run-time execution.)
Next, add a field into client called addr, which will save the server address. Now client looks like this:
type client struct {
	addr string
}
Now that we have a client type that satisfies trib.Storage, we can return this type in our entry function NewClient(). Remove the panic("todo") line in NewClient(), and replace it by returning a new client object. Now the NewClient() function should look something like this:
func NewClient(addr string) trib.Storage {
	return &client{addr: addr}
}
Now all you need to do for the client half is to fill in the code skeleton with the correct RPC logic.
To do an RPC call, we need to import the rpc package, so at the start of the client.go file, let's import rpc after the package name statement.
import (
	"net/rpc"
)
The examples in the rpc package show how to write the basic RPC client logic. Following their example, you might create a Get() method that looks something like this:
func (self *client) Get(key string, value *string) error {
	// connect to the server
	conn, e := rpc.DialHTTP("tcp", self.addr)
	if e != nil {
		return e
	}

	// perform the call
	e = conn.Call("Storage.Get", key, value)
	if e != nil {
		conn.Close()
		return e
	}

	// close the connection
	return conn.Close()
}
However, if you do it this way, you will open a new HTTP connection for every RPC call. This approach is acceptable but obviously not the most efficient way available to you. We leave it to you to figure out how to maintain a persistent RPC connection, if it's something you want to tackle.
Once you've completed the client side, you also need to wrap the server side in the ServeBack() function using the same rpc library. This should be pretty straightforward if you follow the example server in the RPC documentation. You do this by creating an RPC server, registering the Store member field in the b *trib.Back parameter under the name Storage, and creating and starting an HTTP server. Just remember that you need to register as Storage and also need to send a true over the Ready channel when the service is ready (when Ready is not nil), and send a false when you encounter any error on starting your service.
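The steps above might be sketched roughly as follows. To keep the example self-contained, it does not use the trib package: serve is a hypothetical function that takes the address, store, and ready channel as plain parameters in place of the fields of trib.Back, and echoStore is a made-up stand-in for the real Store. Treat it as one possible arrangement rather than the expected solution.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/rpc"
)

// echoStore is a hypothetical stand-in for b.Store with a single
// RPC-friendly method.
type echoStore struct{}

func (echoStore) Get(key string, value *string) error {
	*value = "value-of-" + key
	return nil
}

// serve registers store under the name "Storage", listens on addr,
// reports readiness on ready (when it is not nil), and then blocks
// serving HTTP RPC requests until an error occurs.
func serve(addr string, store interface{}, ready chan<- bool) error {
	notify := func(ok bool) {
		if ready != nil {
			ready <- ok
		}
	}

	server := rpc.NewServer()
	if err := server.RegisterName("Storage", store); err != nil {
		notify(false)
		return err
	}

	l, err := net.Listen("tcp", addr)
	if err != nil {
		notify(false)
		return err
	}

	// The listener is open, so clients can connect from this point on.
	notify(true)

	// *rpc.Server implements http.Handler, so it can be served directly.
	return http.Serve(l, server)
}

func main() {
	// Exercise the failure path: the address has no port, so listening
	// fails and false is sent on the ready channel.
	ready := make(chan bool, 1)
	err := serve("not-a-valid-address", echoStore{}, ready)
	fmt.Println("ready:", <-ready, "error:", err != nil)
}
```

Using a dedicated rpc.NewServer() rather than the package-level default server is a design choice in this sketch: it lets several back-ends run in one process, each registering its own object under the same name Storage.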
When all of these changes are done, you should pass the test cases written in the back_test.go file. It calls the CheckStorage() function defined in the trib/tribtest package, and performs some basic checks to see if an RPC client and a server (that runs on the same host) will satisfy the specification of a key-value pair service (as a local trib/store.Storage does without RPC).
To do some simple testing with your own implementation, you can use the kv-client and kv-server command line utilities.
First make sure your code compiles.
Then run the server.
$ kv-server
(You might need to add $GOPATH/bin to your $PATH to run this.)
You should see an address print out (e.g. localhost:12086). By default, the server will choose an address of the form localhost:rand. If desired, you can override this setting with a command line flag.
Now you can play with your server via the kv-client program. For example:
$ kv-client localhost:12086 get hello
$ kv-client localhost:12086 set foo value
true
$ kv-client localhost:12086 get foo
value
$ kv-client localhost:12086 keys fo
foo
$ kv-client localhost:12086 list-get hello
$ kv-client localhost:12086 list-get foo
$ kv-client localhost:12086 list-append foo something
true
$ kv-client localhost:12086 list-get foo
something
$ kv-client localhost:12086 clock
0
$ kv-client localhost:12086 clock
1
$ kv-client localhost:12086 clock
2
$ kv-client localhost:12086 clock 200
200
Instructions for turning in the assignment are provided in the lab repository.
Last updated: 2021-05-18 07:43:47 -0700