CSE 223B
Lab 4: A Simple Distributed File System
Due: 11:59pm (midnight), Sunday, May 2nd, 2004

Preliminaries

Distributed network file systems are commonplace today; we'll be reading about three such systems, Harp [LGG+91], Frangipani [TML97], and Cedar [Hag87], in class, but there are many others, such as Coda, Zebra, Locus, Ficus, and Deceit. Unlike traditional network file systems like NFS [SGK+95], in which all clients connect to the same server, a distributed file system spreads a single disk image across multiple servers, allowing clients to connect to any server. As you can imagine, careful placement of the disk image data across the server group may lead to increased performance, availability, resilience to failures, or some combination of the three. Unfortunately, this comes at a cost: consistency must be maintained across the servers so that all clients see a consistent view of the disk image.

When you think about it, this synchronization is not unlike that required for our chat server in the previous lab. In fact, if we choose to replicate the disk image at each server, the requirements are almost exactly the same: updates from any client must be broadcast to every server, and all updates must be applied in the same order at each server. We'll stick to this simple model for now; you might find it interesting to consider alternate data placement algorithms as part of your term project.
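To make the ordering requirement concrete, here is a minimal sketch (not part of the lab skeleton; the Update, Replica, and apply_in_order names are hypothetical) of a replica that holds back out-of-order updates until every earlier one has arrived, so that all replicas apply the same operations in the same order. How the sequence numbers get assigned is exactly the ordering problem from the previous lab.

// Sketch only: applying updates in a single global order at every replica.
// All names here are hypothetical illustrations, not part of the lab code.
#include <cstdint>
#include <map>
#include <string>

struct Update {
    uint64_t seqno;      // globally agreed sequence number
    std::string nfs_op;  // serialized NFS request (e.g., WRITE, CREATE)
};

class Replica {
    uint64_t next_to_apply_ = 0;          // next sequence number we expect
    std::map<uint64_t, Update> pending_;  // out-of-order updates, held back

    void apply_to_local_fs(const Update &u) { /* perform u.nfs_op on the scratch dir */ }

public:
    // Called whenever an update arrives (from a local client or a peer).
    // Updates are buffered until every earlier one has been applied, so
    // every replica applies the same operations in the same order.
    void apply_in_order(const Update &u) {
        pending_[u.seqno] = u;
        while (!pending_.empty() && pending_.begin()->first == next_to_apply_) {
            apply_to_local_fs(pending_.begin()->second);
            pending_.erase(pending_.begin());
            ++next_to_apply_;
        }
    }
};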

Implementing a file server is tricky business. In order to allow you to focus on the interesting distributed systems issues, and not UNIX file system peculiarities, you'll be provided with a working user-level server based on the Self-certifying File System (SFS). This will allow you to mount your file system on any machine with an SFS client, including all of the Active Web FreeBSD machines. Servers using the SFS framework need only implement the NFSv3 RPCs; everything else is taken care of for you. You won't need to know much about SFS or the loopback NFS toolkit it's based on, but those of you interested in reading more about it should check out the original SFS paper and the user-level file system paper.

Specification

Your server will read in a configuration file which specifies the set of peer servers. It will also be given a directory which it can use to store the files it's sharing---you're welcome to use some in-memory data structure if you'd like, but I think you'll find it easiest to simply keep a mirror of your distributed file system on the local file system. You need to make sure this directory has identical content at each server (the simplest way to do that initially is to make sure it's empty at startup).

We provide a sample server that will export the contents of the scratch directory, but it does not communicate with any other servers. Hence, the versions of the file system exported at each server will diverge as clients make modifications. Your task is to implement a communication protocol between the servers that incorporates any client's changes at all of them, so that the file systems stay consistent. Basically, you'll need to modify various NFS RPC functions to broadcast the requests to all servers before completing. Hopefully, you'll find that the protocol you implemented for Lab 3 is at least a partial solution.
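As a rough illustration of the broadcast step, here is a sketch that simply sends a serialized request to every peer. The real server does this through libasync and the UDP callback in peer.C; the broadcast_update helper below is hypothetical and uses plain BSD sockets only to show the idea.

// Sketch only: broadcasting one serialized NFS update to every peer over UDP.
#include <netinet/in.h>
#include <sys/socket.h>
#include <string>
#include <vector>

// Send one datagram per peer. Under the assumptions listed below (no packet
// loss, bounded delay), a single unacknowledged send per peer is enough;
// ordering the broadcasts consistently is the harder part.
void broadcast_update(int sock, const std::vector<sockaddr_in> &peers,
                      const std::string &msg) {
    for (const sockaddr_in &dst : peers) {
        sendto(sock, msg.data(), msg.size(), 0,
               reinterpret_cast<const sockaddr *>(&dst), sizeof(dst));
    }
}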

You are responsible for the following requirements:

  • You must implement strong consistency for all updates at the same server. Namely, writes from any client must be observed by subsequent reads at all other clients of that server. Note this only refers to requests actually dispatched to the server; client writes are cached locally and only written through periodically or on close. The server can obviously only affect the consistency of operations that are not served entirely by the client's local cache.
  • Clients must see updates at other servers if the updates occurred more than two seconds in the past.
  • The file system should continue to be available despite the failure of one or more servers. Any clients connected to a dead server may not be able to access the file system, but clients connected to live servers must continue to be able to use the file system and see any changes from other live clients. Open files (and those closed for less than two seconds) at a failed server may become inconsistent across servers.
  • Your server must work both when the servers run on the same machine (with different scratch directories) and when they run on different machines.
You are allowed to assume the following (non-realistic) things:
  • Every server will be provided with the location of all other servers at startup. No servers other than those specified on the command line will join the network.
  • All servers will be on-line before any client requests arrive.
  • A server may die, but will never return during the lifetime of the system. In other words, you can assume fail-stop behavior.
  • The maximum propagation delay between any two servers is 100 ms.
  • The delay between any client and its server is negligible.
  • The network never drops any packets.
Obviously, the last one is not likely to be true in any real network, including the Active Web cluster. However, the chances of a packet drop on the Active Web LAN are extremely small. The behavior of your servers in the face of packet loss is undefined.

Starting off

We have provided an initial skeleton directory to get you started. It is available as /home/classes/sp04/cse223b/labs/lab4.tgz on the Active Web machines. You should copy this file to your working directory. Unlike the previous labs, this code will only build and run on the Active Web cluster. The following sequence of commands should extract the files and build the initial (non-distributed) executable, sfsusrv:
% tar xzf lab4.tgz
% cd lab4
% gmake
There are several files in the tarball. The ones you'll be interested in are client.C and peer.C: client.C implements each of the NFS RPCs, and peer.C implements the UDP receive callback (which you'll need to extend to handle your peer communication protocol). sfsusrv is implemented using the libasync library, which provides all sorts of asynchronous I/O and event handling routines. You shouldn't need to know how it works to use it, but if you're curious, a tutorial is available online.

Before you run sfsusrv, you need to generate a public/private key pair for use by the server. Run the following command:
% /home/classes/sp04/cse223b/sfs/build/bin/sfskey gen -KP sfs_host_key
Creating new key: sfs_host_key#1 (Rabin)
       Key Label: sfs_host_key Press return
Note: seeing the following warning messages is normal when using sfskey:
/var/sfs/sockets/agent.sock: No such file or directory
sfskey: sfscd not running, limiting sources of entropy
You can pass several command line arguments to sfsusrv. The two you'll be interested in are:
  • -p port Listen on the specified port for connections. If port is zero, sfsusrv will bind to an unused port, and tell you what port it got. You probably want to use the zero option to avoid port conflicts with other students.
  • -f config-file Read additional options from the specified configuration file. The important options in the config file are the directory to export, the location of the server's public key (which you created using sfskey above), the port to use for communicating with peer servers, and the location of the peer servers. Here's an example (included in sample_config); a sketch of resolving the peer entries into UDP addresses follows the example:
    export /tmp/export-snoeren
    keyfile sfs_host_key
    peerport 7777
    peer trowel.ucsd.edu 7777
    
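If you're curious how a peer line like "peer trowel.ucsd.edu 7777" might be turned into an address you can hand to sendto, here is a minimal sketch. The resolve_peer helper is hypothetical; the skeleton already takes care of reading the configuration file for you.

// Sketch only: resolve a peer hostname and port (from a "peer host port"
// config line) to an IPv4 address suitable for sendto().
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cstdint>
#include <cstring>
#include <string>

// Returns false if the name cannot be resolved.
bool resolve_peer(const std::string &host, uint16_t port, sockaddr_in *out) {
    addrinfo hints{}, *res = nullptr;
    hints.ai_family = AF_INET;       // the lab machines speak IPv4
    hints.ai_socktype = SOCK_DGRAM;  // we'll be sending UDP datagrams
    if (getaddrinfo(host.c_str(), nullptr, &hints, &res) != 0 || res == nullptr)
        return false;
    std::memcpy(out, res->ai_addr, sizeof(sockaddr_in));
    out->sin_port = htons(port);
    freeaddrinfo(res);
    return true;
}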
You should edit the configuration file as appropriate. The first thing to do would be to change the export line to point to a directory you've created. You can then start up sfsusrv with the following command line:
% ./sfsusrv -f ./sample_config
sfsusrv: peer: trowel.ucsd.edu:7777
sfsusrv: version 0.7.2, pid 69766
sfsusrv: Listening for peers on port: 7777
sfsusrv: Now exporting directory: /tmp/export-snoeren
sfsusrv: serving /sfs/@trowel.ucsd.edu%2022,nd4uufan993fj8xdew6658rhdgugidsz
The last line shows the path name that you can use to access your server from an SFS client on any machine except the one running your server. (For technical reasons, you cannot mount a file system on the machine that is serving it.) Your path will be different, as it depends both on the sfs_host_key you generated and on the hostname and port the server is running on. All of the Active Web machines run an SFS client, so you can see the contents of your file system by typing the following on any other Active Web machine:
% ls /sfs/@trowel.ucsd.edu%2022,nd4uufan993fj8xdew6658rhdgugidsz
This should list the contents of your filesystem (the directory you specified in the sfsusrv configuration file). You can use this directory just like any other filesystem. Note that our sample code does not communicate any changes you make to peer servers.

Understanding NFS RPCs

Once you've gotten sfsusrv running, you can use it to trace NFS RPCs. Kill your existing sfsusrv with control-C, and start a new one with the ASRV_TRACE environment variable set to 10:
% env ASRV_TRACE=10 ./sfsusrv -f ./sample_config

Now, on another machine, browse the exported file system. You'll have to use the /sfs/... pathname printed out by the current instance of sfsusrv, since the port number will change each time. As you browse, the server will print out a complete trace of all NFS requests it receives. (Large structures may be truncated; if this is ever a problem, try higher values than 10.) You can capture this trace by piping the output of sfsusrv through tee to a file. For example, in a Bourne-style shell such as bash, you could use the following command line:

% env ASRV_TRACE=10 ./sfsusrv -f ./sample_config 2>&1 | tee nfs.trace

After setting up sfsusrv to trace NFS traffic, run the following commands (substituting the correct self-certifying pathname):

% cd /sfs/...
% rm junk
rm: junk: No such file or directory
% echo hello > junk
% cat junk
hello
% cat junk
hello
Now stop sfsusrv, and look at the RPCs in the trace file. You can see a summary of the RPCs using grep serve nfs.trace, though in order to see arguments and return values you'll have to look at the whole file. Answering the following questions will help you understand how NFS and sfsusrv interact.
  • Which RPCs correspond to the creation of junk?
  • Which to the first cat?
  • Which to the second cat?
  • Explain any differences between the RPCs caused by the two cat commands.

Collaboration policy

All code for this assignment must be written individually. You are not allowed to look at anyone else's solutions or solutions to similar assignments you may find for courses at UCSD or other institutions. You may discuss the assignment with fellow students, but all code you submit must be either yours alone or code that was provided to you as part of the assignment.

Turnin procedure

The turnin procedure is the same as for the previous labs. When you're ready to submit your code, you can execute the following command:
% gmake turnin
which will create a file called lab4-turnin.tgz. You should copy the file into your ~/223bturnin/ directory. I will consider the first file in the 223bturnin directory after the deadline to be your submission.
