Welcome to Lab 2. The goal of this lab is to use the RPC key-value
store we built in Lab 1 to implement scalable Tribbler front-ends and
back-ends. In particular, you first need to wrap the key-value
server into a distributed key-value service called
bin storage, where your implementation should fit the trib.BinStorage
interface, and then, on top of this storage layer,
you need to implement a stateless Tribbler front-end type
that fits the trib.Server
interface.
First we need to update the trib
repo:
$ cd ~/gopath/src/trib
$ git pull /classes/cse223b/sp14/labs/trib lab2
If you have not changed anything in the trib
repo, this should be
painless. However, if you have changed things, you will need to merge
the changes.
Also update the triblab
repo to merge branch lab2
. There will be
several changes:

- makefile
- lab2.go
- server_test.go
- bins_test.go

If you have not touched those files and have not created files called
server_test.go
or bins_test.go
yourself, the merge should be
painless:
$ cd ~/gopath/src/triblab
$ git pull /classes/cse223b/sp14/labs/triblab lab2
If you have made changes to those files, then you need to merge the changes properly.
If you have trouble merging the changes (on any of the repos) and don't know how to resolve the conflicts, please ask the TA for help.
The overall system architecture looks like this:
As you can see from the figure, our system is split into two parts,
where the upper part is the Tribbler service, and the lower part is a
general purpose distributed key-value store called "Bin Storage". The
Tribbler front-ends take AJAX calls from the users and translate them
into bin storage calls; the bin storage client then issues
these calls to the "dumb" key-value back-ends via RPC. Each back-end
here is a simple independent key-value store (which we just
implemented in the previous lab). There is also a keeper instance
running in the background that maintains the bin storage in a coherent
state. Both the bin storage clients and the keeper use the same RPC
interface to communicate with the back-ends. The interface semantics
defined in trib.Storage
should be sufficient to perform all the bin
storage tasks.
Bin Storage is a distributed key-value pair storage that combines
a bunch of trib.Storage
back-ends and presents a single and
scalable storage service over the network. Bin storage service is
provided via the trib.BinStorage
interface.
On one hand, conceptually, bin storage combines an infinite set of
separate trib.Storage
instances called bins. Each bin has a
different name, and a caller can fetch a bin via the
BinStorage.Bin()
function -- the only function in the
trib.BinStorage
interface. Initially, all bins are empty.
On the other hand, physically, a bin storage is built on top of a
finite set of key-value storage back-ends (which we built in the last
lab), where a bin (a virtual KV store) will be mapped to one of the
key-value back-ends (physical KV stores). As a result, multiple bins
might share the same back-end by splitting its key space with prefixes
(or suffixes) that encode the bin names. As shown in the figure
below, an example bin storage hashes 6 bins into 3 back-ends, storing
each bin with different prefixes inside. When a caller calls
Get("some-key")
on the bin "alice"
for example, the bin storage
client will translate that call into a Get("alice::some-key")
RPC
call on back-end 0.
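Concretely, the translation is just a key-prefix rewrite. The sketch below uses the "::" separator from the example above; a real implementation must also escape the bin name and the key (the trib/colon package covered later handles this) so that names containing ':' cannot collide:

```go
package main

import "fmt"

// binKey translates a per-bin key into the flat key used on the
// physical back-end by prefixing the bin name. The "::" separator
// matches the example above; without escaping, a bin named "a::b"
// could collide with another bin, which is why trib/colon exists.
func binKey(bin, key string) string {
	return bin + "::" + key
}

func main() {
	// A Get("some-key") on bin "alice" becomes a Get on this key:
	fmt.Println(binKey("alice", "some-key")) // alice::some-key
}
```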
In addition to key-value stores in bins, a bin storage also coarsely
synchronizes the logical clock across all the bins, so that the
clocks won't be offset too much over time. Since the back-ends are
"dumb" and they do not talk to each other, this clock synchronization
is done by a background keeper process. Note that the clocks are NOT
real-time synchronized (which would be too expensive for a distributed
system), but only coarsely synchronized. In particular, this means that
if a Clock()
call is issued at least 3 seconds after another Clock()
call, no matter which bins they are issued in, bin storage
guarantees that the later call returns a clock value no smaller
than the earlier one's. Note that, by definition, a bin storage with
one single back-end automatically satisfies this requirement, but
when there are multiple back-ends, the keeper must do some maintenance
from time to time.
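One maintenance round the keeper might run is sketched below. It assumes the Lab 1 semantics that Clock(atLeast, ret) returns a value no smaller than atLeast (and no smaller than previously returned values); the fakeClock type is only a stand-in for a real back-end:

```go
package main

import "fmt"

// clocker is the one slice of trib.Storage the keeper needs here
// (assumed signature, matching the Lab 1 Clock semantics).
type clocker interface {
	Clock(atLeast uint64, ret *uint64) error
}

// syncClocks runs one synchronization round: read every back-end's
// clock, then push the maximum back to all of them, so back-ends
// cannot drift apart by more than one round.
func syncClocks(backs []clocker) error {
	var max uint64
	for _, b := range backs {
		var c uint64
		if err := b.Clock(0, &c); err != nil {
			return err
		}
		if c > max {
			max = c
		}
	}
	for _, b := range backs {
		var c uint64
		if err := b.Clock(max, &c); err != nil {
			return err
		}
	}
	return nil
}

// fakeClock is an in-memory stand-in for one back-end's clock.
type fakeClock struct{ c uint64 }

func (f *fakeClock) Clock(atLeast uint64, ret *uint64) error {
	if atLeast > f.c {
		f.c = atLeast
	}
	*ret = f.c
	f.c++ // each call also advances the clock
	return nil
}

func main() {
	backs := []clocker{&fakeClock{c: 3}, &fakeClock{c: 42}, &fakeClock{c: 7}}
	if err := syncClocks(backs); err != nil {
		panic(err)
	}
	for _, b := range backs {
		var c uint64
		b.Clock(0, &c)
		fmt.Println(c) // every back-end is now at least 42
	}
}
```

A real keeper would run such a round periodically (for example, once per second) against the RPC clients for the back-ends listed in its configuration.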
As noted, we already implemented the back-ends in Lab 1, along with the APIs for both the server and the client, so we will reuse them without changes (assuming that you implemented them correctly). Both the bin storage client and the keeper will communicate with the "dumb" back-ends via the RPC calls we implemented, and together they present a coherent bin storage interface to upper-layer applications.
Before we look into how the Tribbler service works, let's first define what a tribble is.
A Tribble is a structure type that has 4 fields:
type Trib struct {
User string // who posted this trib
Message string // the content of the trib
Time time.Time // the physical timestamp
Clock uint64 // the logical clock
}
Time
is what the front-end claims when this tribble is created, by
reading the front-end's own physical clock on the machine when Post()
in a trib.Server
is called. However, to sort tribbles in a globally
consistent and reasonable order, we cannot sort the tribbles only by
this timestamp, because different front-ends will have different
physical time readings. For sorting, the Tribbler service needs to
maintain a distributed logical Clock
in uint64
.
When sorting many tribbles into a single timeline, you should sort by these fields, in this priority order:

1. Clock: the logical timestamp.
2. Time: the physical timestamp.
3. User: the user id.
4. Message: the message content.

We call this the Tribble Order.
The Tribbler service logic is defined by the trib.Server
interface
(in trib/trib.go
file). This is how the Web user interface interacts
with a Tribbler server.
SignUp(user string) error
Creates a new user. After a user is created, it will never disappear in the system.
A valid user name must be no longer than trib.MaxUsernameLen=15
characters but not empty, must start with a lower-case letter, and can
only contain lower-case letters or numbers. There is a helper
function called trib.IsValidUsername(string)
which you can use to
check if a username is valid.
Returns error when the username is invalid or the user already exists. Concurrent sign-ups might both succeed.
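The username rules above can be made concrete with a small checker. trib.IsValidUsername already exists in the trib package; the function below is only an illustration of the stated rules:

```go
package main

import "fmt"

const maxUsernameLen = 15 // mirrors trib.MaxUsernameLen

// isValidUsername restates the rules above: non-empty, at most 15
// characters, starts with a lower-case letter, and contains only
// lower-case letters or digits. Use trib.IsValidUsername in real code.
func isValidUsername(u string) bool {
	if len(u) == 0 || len(u) > maxUsernameLen {
		return false
	}
	for i, r := range u {
		switch {
		case r >= 'a' && r <= 'z':
			// lower-case letters are fine anywhere
		case r >= '0' && r <= '9':
			if i == 0 {
				return false // must start with a letter
			}
		default:
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isValidUsername("h8liu")) // true
	fmt.Println(isValidUsername("8liu"))  // false: starts with a digit
	fmt.Println(isValidUsername("Alice")) // false: upper-case letter
}
```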
ListUsers() ([]string, error)
Lists at least trib.MinListUser=20
different registered users. When
fewer than 20 users have ever signed up, list all of
them. The returned usernames should be sorted in alphabetical order.
This is just for showing some users on the front page; this is not for listing all the users that have ever signed up, because that would be too expensive in a scalable system.
Post(who, post string, clock uint64) error
Posts a tribble. clock
is the maximum clock value this user client
has ever seen so far by reading tribbles (via Home()
and Tribs()
).
It returns an error when the user does not exist or the post is too long
(longer than trib.MaxTribLen=140
).
Tribs(user string) ([]*Trib, error)
Lists the most recent trib.MaxTribFetch=100
tribbles that a user posted.
Tribbles need to be sorted in the Tribble Order, and that order must be
consistent with the order in which the user posted the tribbles.
Follow(who, whom string) error
Unfollow(who, whom string) error
IsFollowing(who, whom string) (bool, error)
Following(who string) ([]string, error)
These are the functions to follow/unfollow a user, to check whether one
user follows another, and to list all the users that a user follows.
A user can never follow or unfollow himself: when who
equals whom
, the functions return an error. When the user does not exist, the
functions also return an error. If a user performs Follow()
on another user from multiple
clients concurrently, only one Follow()
should succeed with no error.
A user can follow at most trib.MaxFollowing=2000
users (not
including himself). When a user tries to follow more than
trib.MaxFollowing=2000
users, Follow()
should return an error.
Home(user string) ([]*Trib, error)
Lists the most recent trib.MaxTribFetch=100
tribbles that are posted on
the user's following timeline, in Tribble Order. In addition, the
ordering should always satisfy the following:
if tribble A is posted after tribble B (that is, A's Post()
is called after B's Post()
returned), then A appears after B in the timeline.
It returns an error when the user does not exist.
In addition to normal errors, these calls might also return IO errors if the implementation needs to communicate with a remote part. Returning a nil error means that the call was successfully executed; returning a non-nil error means that the call may or may not have been executed.
Note that the trib/ref
package contains a reference implementation of
trib.Server
(which we used in the lab setup).
Your job for Lab 2 is twofold:

- Implement the bin storage service: the keeper (the ServeKeeper()
function) and the bin storage client logic (the
NewBinClient()
function).
- Implement the Tribbler front-end (the NewFront()
function).

You can find these entry functions in the lab2.go
file under the triblab
repo:
func NewBinClient(backs []string) trib.BinStorage
This function is similar to NewClient()
in lab1.go
but instead
returns a trib.BinStorage
interface. trib.BinStorage
has only one
function called Bin()
, which takes a bin name and returns a
trib.Storage
. A bin storage provides another layer of mapping: the
caller first gets a key-value storage (the bin) by bin
name, and then performs key-value function calls on the returned
storage. Different bin names should logically return completely
separate key-value storage spaces, but note that multiple bins might
share a single underlying physical storage by encoding the bin name
as a prefix (or a suffix) in the keys.
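One way Bin() might implement this mapping is a thin wrapper that prefixes every key with the (escaped) bin name. The sketch below shrinks the storage interface to Get/Set and uses a toy in-memory back-end; both are illustrative stand-ins, not the lab interfaces:

```go
package main

import "fmt"

// store is a cut-down stand-in for trib.Storage (Get/Set only).
type store interface {
	Get(key string, value *string) error
	Set(key, value string, succ *bool) error
}

// prefixed wraps a physical back-end and namespaces every key
// with the bin name, so different bins never collide.
type prefixed struct {
	prefix string // escaped bin name plus separator
	back   store
}

func (p *prefixed) Get(key string, value *string) error {
	return p.back.Get(p.prefix+key, value)
}

func (p *prefixed) Set(key, value string, succ *bool) error {
	return p.back.Set(p.prefix+key, value, succ)
}

// memStore is a toy in-memory back-end for demonstration.
type memStore map[string]string

func (m memStore) Get(key string, value *string) error {
	*value = m[key]
	return nil
}

func (m memStore) Set(key, value string, succ *bool) error {
	m[key] = value
	*succ = true
	return nil
}

func main() {
	back := memStore{}
	alice := &prefixed{prefix: "alice::", back: back}
	bob := &prefixed{prefix: "bob::", back: back}

	var succ bool
	alice.Set("k", "v", &succ)

	var got string
	bob.Get("k", &got)
	fmt.Printf("bob sees %q\n", got) // empty: bins are separated

	alice.Get("k", &got)
	fmt.Printf("alice sees %q\n", got) // "v"
}
```

The same wrapper idea carries over directly when the back-end is an RPC client instead of an in-memory map.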
For your convenience, we provide a small new package called trib/colon
which you can use to escape all the colons in a string. The
escaping rule is simple: all '|'
runes are escaped into "||"
, and
all ':'
runes are escaped into "|:"
. Please use this package
wisely.
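The rule itself is two rune replacements, and their order matters ('|' first, then ':'). The trib/colon package already provides this; the function below only restates the rule:

```go
package main

import (
	"fmt"
	"strings"
)

// escape applies the trib/colon rule: '|' becomes "||" and
// ':' becomes "|:". Since every ':' in the output is preceded
// by '|', a bare "::" separator can never appear inside an
// escaped bin name or key.
func escape(s string) string {
	s = strings.Replace(s, "|", "||", -1) // must run first
	return strings.Replace(s, ":", "|:", -1)
}

func main() {
	fmt.Println(escape("alice::key")) // alice|:|:key
}
```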
func ServeKeeper(b *trib.KeeperConfig) error
This function is a blocking function, similar to ServeBack()
. It
will spawn a keeper instance that maintains the distributed back-ends
in a consistent state. For Lab 2 there will be only one keeper, and the
keeper does not need to do much; in Lab 3, keepers will be responsible
for handling back-ends joining, leaving, and crashing, and the related
key migrations, and there will be multiple keepers for fault tolerance.
(Note: a blocking function here means that it never returns in normal
operation, and whenever it does return, it always returns an error. Some
students implemented ServeBack()
with go http.Serve(...)
or
return nil
in Lab 1; that was not correct. The right thing is to do
something like return http.Serve(...)
.)
The trib.KeeperConfig
structure contains all the back-end serving
addresses and also some peering information for the keepers which you
would need in Lab3:
- Backs []string: the addresses of the back-ends. These
are the back-ends that the keeper needs to maintain.
- Keepers []string: the addresses that the keepers might
listen on. Lab 2 will have only one keeper; Lab 3 will have at least
three keepers.
- This int: the index of this keeper (in the Keepers
list). For Lab 2, it will always be zero.
- Id int64: a non-zero incarnation identifier for this keeper,
usually derived from the system clock. For Lab 2, since the keeper will
never crash, this field does not matter.
- Ready: a ready-signal channel. It works in a way similar to how
Ready
works in trib.BackConfig
. When a true
is received on
this channel, the distributed bin storage should be ready to serve.
Therefore, if you need to initialize the physical back-ends in some
way, make sure you do it before you send the signal over Ready
, and
don't forget to send a false
to Ready
when the initialization
fails.

The keeper can do whatever it wants to do, but it should do no
more than maintaining the bin storage in a consistent state. It is
okay for a keeper to understand how a bin storage client translates the keys,
but it should not need to parse anything further in the keys or values.
This means that the NewBinClient()
, ServeBack()
(implemented in
Lab 1), and ServeKeeper()
calls should together provide a
general distributed key-value bin storage layer, one that can
work for any kind of service (not just Tribbler).
The keeper does not need to listen on the address given. The keeper address here serves more like a keeper configuration name; it tells the keeper launcher if a keeper should be running on the current machine.
func NewFront(s trib.BinStorage) trib.Server
This function takes a bin storage, and returns an implementation of
trib.Server
. The returned instance will then serve as a service
front-end that takes Tribbler service function calls and translates
them into key-value bin storage calls. This front-end should be
stateless, concurrency safe, and ready to be killed at any time. This
means that, at any time during its execution of any call, the back-end
key-value storage must stay in a consistent state.
Also note that one front-end might be taking multiple concurrent
requests from the Web, and there might be multiple front-ends talking
to the same back-end, so make sure you handle all the concurrency
issues correctly.
Also, be aware that the trib.BinStorage
parameter might not be the one
you just implemented in the previous entry functions, but as long
as it satisfies the bin storage interface specification, your Tribbler
server should work just fine. This means that you cannot rely on the
bin storage keeper to perform Tribbler-related garbage cleaning. The
front-ends might spawn background routines that do the garbage
cleaning themselves.
The first step is to set up the bin storage. Since we might have multiple parts
running at the same time, probably on different machines, we will have
a configuration file that specifies the serving addresses of the
back-ends and the keeper for the distributed bin storage. The config
file by default uses bins.rc
as its file name, where bins
stands for bin storage.
bins.rc
is saved in JSON format, marshalling an RC
structure type
(defined in the trib/rc.go
file). We have a utility program called
bins-mkrc
that can generate a bins.rc
file automatically.
Find a directory as your working directory (like triblab
).
$ bins-mkrc -local -nback=3
This will generate a file called bins.rc
under the current
directory, and also print the file content to stdout. -local
means
that all addresses will be on localhost
. -nback=3
means there
will be in total 3 back-ends. If you remove -local
, then it will
generate back-ends starting from 172.22.14.211
to 172.22.14.220
,
which are the IP addresses of our lab machines. For bins-mkrc
, there
can be at most 10 backends and 10 keepers (since we only have 10 lab
machines). However, you are free to create your own bins.rc
file
that has more back-ends and keepers. For this case, we use 3 back-ends
and 1 keeper (the default value for -nkeep
).
With this configuration file generated, we can now launch the back-ends:
$ bins-back
This will read and parse the bins.rc
file, and spawn all the
back-ends whose serving addresses are on this host. Since all the
back-ends we generate here are on localhost
, all 3 back-ends
are created in this case (in different goroutines). You should see
three log lines showing that three back-ends started, listening on
different ports. Aside from reading the configuration
file, bins-back
is not much different from the kv-serve
program. You can
also manually specify the back-ends you would like to start on the
command line. For example, you can run the following to start only the
first two back-ends:
$ bins-back 0 1
By the way, notice how this program starts several independent servers at
the same time in different goroutines within the same process;
you should now better understand why using rpc.DefaultServer
is
not the right thing to do in Lab 1.
After the back-ends are ready, we can now start the keeper.
$ bins-keeper
It should print a log message showing that the bin storage is ready to serve.
To play with this distributed bin storage, we have another toy client
program called bins-client
:
$ bins-client
(working on bin "")
> bin a
(working on bin "a")
> get a
> set a b
true
> get a
b
> bin t
(working on bin "t")
> get a b
> bin a
(working on bin "a")
> get a b
b
...
This program reads the back-end addresses from bins.rc
and can
switch between different bins with the bin
command. The default bin
is the bin whose name is the empty string.
Now with the bin storage working, we can finally launch our Tribbler front-end that uses this distributed storage:
$ trib-front -init -addr=:rand -lab
You have used this utility before. The only new thing here is the
-lab
flag, which tells it to read the bins.rc
file and use our lab
implementation. This will start a stateless front-end (which you
implemented in this lab) that connects to the back-end service
addresses specified in bins.rc
.
Again -init
will populate the service with some sample data.
Now you can open your browser, connect to the front-end machine and play with your own implementation.
If you want to use some other configuration file, use the -rc
flag.
It is supported in all the utilities above.
Again, when you complete this lab, it should be perfectly fine to have multiple front-ends that connect to the same set of back-ends. Also, both the front-ends and the back-ends should be scalable.
Here are some unrealistic assumptions you can make for Lab 2:

- The trib.Storage
used in the backend will return from every Clock()
call in less than 1 second.
- For the trib.Storage
used in the backend, all storage IOs
on a single backend are serialized (and hence provide sequential
consistency). Each key visit (checking whether a key exists, locating
its corresponding value, or one step of iterating over all keys) takes
less than 1 millisecond. Reading or writing 1MB of data in the
value part (in a list or a string) takes less than 1
millisecond. Note that Keys()
and ListKeys()
might take longer
to complete because they need to scan over all the keys.

Note that some of these assumptions won't stay true in Lab 3, so try not to rely on them too much if possible.
In addition to the requirements specified by the interfaces, your implementation should also satisfy the following requirements:
While you are free to build the system in your own way, here are some suggested hints:

- You might find hash/fnv
helpful for hashing.
- Watch out for concurrent Follow()
or Unfollow()
calls.
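The hash/fnv hint in code: a deterministic map from bin name to back-end index. The modulo scheme below is an assumption that works while the set of back-ends is fixed, as it is in this lab:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickBack deterministically maps a bin name onto one of n
// back-ends, so every client routes the same bin to the same
// physical store.
func pickBack(bin string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(bin))
	return int(h.Sum32()) % n
}

func main() {
	for _, bin := range []string{"alice", "bob", "t", ""} {
		fmt.Println(bin, "->", pickBack(bin, 3))
	}
}
```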
Here are some possible mistakes that a lazy and quick but incorrect implementation might make:
- Ignoring the clock argument of Post()
. Its exact semantics with respect to Home()
and
Tribs()
are not strictly defined (you might find them very hard to define).
Nonetheless, intuitively,
the clock argument tells the latest tribble a user has seen
(which might be 0 if the user has not seen any tribble yet), hence
the newly posted tribble should have a clock value
larger than the argument.

First, make sure that you have committed every piece of your code into
the repository triblab
. Then just type make turnin
under the root
of the repository. It will generate a turnin.zip
that contains
everything in your git repository, and will then copy the zip file to
a place that only the lab instructors can read.
Last updated: Sat Apr 26 19:31:03 -0700 2014