The goal of this project is to build a simple web server that can receive requests, and send back responses, to web clients.
General learning objectives:
Specific learning objectives:
This project should be done individually. The due date is listed on the course syllabus.
At a high level, a web server listens for connections on a socket (bound to a specific port on a host machine). Clients connect to this socket and use a simple text-based protocol to retrieve files from the server. For example, you might try the following command from a UNIX machine:
$ telnet www.ucsd.edu 80
GET /index.html HTTP/1.1\n
Host: www.ucsd.edu\n
\n
(type two carriage returns after the "Host" header). This will return to you (on the command line) the html representing the "front page" of the UCSD web page:
HTTP/1.1 200 OK Date: Thu, 07 Jan 2016 04:09:11 GMT Server: Apache/2 Last-Modified: Wed, 06 Jan 2016 18:00:13 GMT ETag: "6a46-528ae20302540" Accept-Ranges: bytes Content-Length: 27206 Content-Type: text/html; charset=UTF-8 <!DOCTYPE html> <html lang="en"> (rest of html web page follows...)
For this assignment, you will need to support a (pretty small) subset of the HTTP 1.1 protocol. Your server will need to be able to serve out HTML files as well as in-line images (jpg and png).
One of the key things to keep in mind in building your web server is that the server is translating relative filenames (such as index.html) to absolute filenames in a local filesystem. For example, you might decide to keep all the files for your server in ~student/cse124/server/files/, which we call the document root. When your server gets a request for index.html, it will prepend the document root to the specified file and determine if the file exists, and if the proper permissions are set on the file (typically the file has to be world readable). If the file does not exist, a file not found error is returned. If a file is present but the proper permissions are not set, a permission denied error is returned. Otherwise, an HTTP OK message is returned along with the contents of a file.
You should also note that web servers typically translate "GET /" to "GET /index.html". That is, index.html is assumed to be the filename if no explicit filename is present. That is why the two URL's "http://www.cs.ucsd.edu" and "http://www.cs.ucsd.edu/index.html" return equivalent results.
When you type a URL into a web browser, it will retrieve the contents of the file. If the file is of type text/html, it will parse the html for embedded links (such as images) and then make separate connections to the web server to retrieve the embedded files. If a web page contains 4 images, a total of five separate connections will be made to the web server to retrieve the html and the four image files. The client handles this--your web server needs to only return one response at a time.
At a high level, your web server will be structured something like the following:
Initialize:
Take a port number and document root as a command-line argument.
$ ./http-server 8080 /var/lib/www/htdocs
Setup server socket and threading:
Create a TCP server socket, and arrange so that a thread is spawned
(or thread in a thread pool is retrieved) when a new connection comes in.
Loop:
Parse HTTP/1.1 request
Ensure it is well-formed (return error otherwise)
Determine which file needs to be accessed to deliver the requested
content, and check that file's permissions and other metadata (returning
an error if needed).
Construct the response, including response headers
Transmit the contents of the file to the client
Close the connection
If you do not attempt the extra credit, make sure to include a "Connection: close" header in your request/response
You may choose from C or C++ to build your web server but you must do it in a Unix-like environment with the sockets API we've been using in class (e.g., no HTTP libraries).
Your server should support:
Your server program will only need to support this subset of HTTP/1.1.
Your server program takes two command line arguments: (1) the port to listen for incoming connections, and (2) a path to the document root containing the files that you will serve out (e.g., HTML and image files)
There are a variety of resources online that can be of help:
Your server must serve out HTML files, Jpeg and PNG image files. We are providing the following sample files to get you going, however you should be able to support other files as well.
These files are available here.
You will submit your second project to your CSE 124 GitHub account. You should include a Makefile that will build your project, producing a binary executable should be called 'http-server'. You don't have to commit your binary, just the Makefile and associated source files.
<github_id>_cse124/ |-- project |-- proj2 |-- Makefile 2 directories, 1 filesAfter calling 'make', you should have at least these files:
<github_id>_cse124/ |-- project |-- proj2 |-- http-server |-- Makefile 2 directories, 2 files