Lab 3: Distibuted, Multi-Party Chat
Due: 11:00am, Tuesday, April 25th, 2006
Preliminaries
I'm sure you've all used some form of Internet chat program before,
perhaps AOL, Yahoo!, or even IRC. Have you ever stopped to think
about how they are designed? In this lab, you'll be implementing a
distributed, multi-party chat server. We'll provide you with a simple
client program; you'll only need to build the server. Each server can
handle multiple clients, but, for increased scalability, there will be
multiple servers as well.
The chief difficulty with multiple servers is enforcing an ordering on
the messages in the absence of synchronized clocks. In order not to confuse the members of the chat, the
participants' messages should appear in the same order at every
client. A naive implementation would likely have clients at different
servers see messages in a different order due to the variation in
network latencies between servers. Your task for this assignment is
to implement a protocol between the servers that ensures a uniform
ordering. We'll make several assumptions about the network to make
the problem more tractable, however. Doing it for real would be
considerably more difficult.
Specification
Your server will take a set of command-line arguments including its
ID number (which is guaranteed to be unique in the system), a port
number to use for network communication, and a list of hostnames and
ports for the other servers in the network. It will then forward
messages from any clients connected to it to any other
directly-connected clients and all other servers. Similarly, it
should forward any messages received from other servers to its own
clients.
We provide a sample client program that takes three command-line
parameters: a hostname and port of a server to connect to, and a name
or handle to use during the chat. The client supports Emacs line
editing commands through the GNU readline library, so you should find
it easy to use. The client communicates with its server over a TCP
connection using a very simple protocol. Note the client does
not echo messages locally---it sends messages entered at the
client on to the server and displays them only when the server sends
them back. This allows the server(s) to control the order in which
all messages are displayed at each client. You don't need to modify
the client for this lab, although you might find it useful to automate
the client for testing purposes.
The sample server we provide simply broadcasts any messages received
from a client to all the attached clients. You need to extend the
server so that it passes the message on to all other servers. It does
not need to forward messages to its clients immediately upon receipt,
and, most likely, should buffer them for some period of time to allow
reordering. How all of this happens is entirely up to you, but we
recommend you use UDP as a transport protocol to avoid any unexpected
delays due to TCP retransmissions or timeouts.
You are responsible for the following requirements:
- Servers can join or fail at any time.
- Messages from any particular client must be displayed in the order in which they were submitted.
- Each server must forward any received messages to all its (live)
clients.
- All (currently live) servers must forward all messages from a local client
to all other (currently live) servers in addition to locally attached
clients.
- The chat should continue despite the failure of one or more
servers. Any clients connected to a dead server may not be able to
send or receive further messages, but clients connected to live
servers must continue to be able to send and receive messages. What
happens to messages previously sent (but not yet displayed) by clients
of failed servers is up to you, but any clients that do display such
messages must all display them in the same order. In particular, you
are free to display messages from clients of dead servers more than two
seconds after they were submitted.
- Your server must work both with servers running on the same
machine (on a different port) and different machines.
- Every server must display a message within two seconds of its
original submission at an initial server---assuming the original server
is still alive. Servers who join after a message is submitted need not
display the message at their clients.
- While you may edit the client for testing purposes, I will grade
your server using my own client, so you must not depend on any client
behavior other than immediate, FIFO printing of received messages.
- You cannot depend on the local server clocks to be
synchornized.
You are allowed to assume the following (non-realistic) things:
- All servers will have a globally unique identifier, specified at
startup.
- Every server will be provided with the location of all other
servers at startup. No servers other than those specified on the
command line will join the network. Specified servers may join at
arbitrary points in time, however, and, indeed, may never actually join.
- A server may die, but will never return with the same identifier
during the lifetime of the system. In other words, you can assume
fail-stop behavior.
- The maximum propagation delay between any two servers is 100 ms.
- The delay between any client and its server is negligible.
- The network never drops any packets.
Obviously, the last one is not likely to be true in any real network,
including the class virtual cluster. However, the chances of a packet
drop on the virtual cluster network are extremely small. The behavior of your
servers in the face of packet loss is undefined.
Starting off
We have provided an initial skeleton directory to get you started. It
is available as /cse223b/labs/lab3.tgz on the
course machines. You should copy this file to your working
directory. The following sequence of commands should extract the
files and build the initial (useless) executable:
% tar xzf lab3.tgz
% cd lab3
% gmake
The tarball contains three files: chat-client.C, Makefile, and
chat-server.C. chat-server.C contains some initial code to get you
started. Currently, it accepts client connections and broadcasts any
received messages to all its attached clients. It does not yet
communicate with any other servers. You can run a server with the
following command line:
% ./chat-server 1 2255 localhost 2256
Where 1 is the globally unique ID of this server, and 2225 is the port
you want it to listen on. localhost and 2256 are the hostname and
port of another server in the network. You can have as many
hostname:port pairs as you'd like. During development you're likely
to want to use something other than 2225 so it doesn't conflict with
other students working on their projects. For the sake of discussion,
start another server in a separate window with a similar command:
% ./chat-server 2 2256 localhost 2255
You can connect to a server with the provided client from another
window in the following fashion:
% ./chat-client localhost 2255 Client1
In a fourth window, you should start up another client with a slightly
different command line:
% ./chat-client localhost 2255 Client2
Any messages typed at either client should appear at both. You can
exit from the provided client by typing CTRL-D by itself at the
prompt.
If, on the other hand, you connect the second client to the other
server, with the following command,
% ./chat-client localhost 2256 Client2
you'll see that the messages are not passed between the servers. For
your assignment, you'll need to fix that up. An obvious first step is
to simply have each server send any messages from its clients to all
other servers. The hard part is making sure they're displayed in the
same order at all clients.
Testing your chat server
In order to ease testing, we've made it possible for you to drive the
client from a file. You should pass the -i flag to the client when redirecting standard
input so that it disables line editing. Here's one possible way to test
your servers:
First, create a file, say foo, such as the
one below:
This is message one.
This is message two.
This is message three.
This is message four.
This is message five.
You can then use this file to drive two clients simultaneously, like so:
./chat-client -i localhost 2255 bot1 < foo > o1 &; ./chat-client -i localhost 2255 bot2 < foo > o2 &
This will cause two clients, called 'bot1' and 'bot2', to
simultaneously connect to the server at localhost:2255 and send the
set of messages contained in foo. The
output of each will go into files o1 and
o2, respectively.
If you try this, you'll see that the order is the same. Once you can
issue the same command, but connect the clients to different
servers, and the order still comes out the same, you're well on your
way. Keep in mind I'll be testing with a lot more than two servers,
and killing some off in the middle. You might find that reading [BJ87] provides you with significant
insight into this assignment.
Collaboration policy
All code for this assignment must be written individually. You
are not allowed to look at anyone else's solutions or solutions to
similar assignments you may find for courses at UCSD or other
institutions. You may discuss the assignment with fellow students,
but all code you submit must be either yours alone or code that was
provided to you as part of the assignment.
Turnin procedure
The turnin procedure is the same as for the previous labs. When you're
ready to submit your code, you can execute the following command:
% gmake turnin
which will copy your tarball to the submission directory.