HW 4: Writing a web server

2018/10/11

Overview

In this project, you are going to build a simple web server that implements TritonHTTP, a subset of the HTTP/1.1 protocol specification, defined here.

Project details

Basic web server functionality

At a high level, a web server listens for connections on a socket (bound to a specific address and port on a host machine). Clients connect to this socket and use the TritonHTTP protocol to retrieve files from the server. For this project, your server will need to be able to serve HTML files as well as images in jpg and png formats. You do not need to support server-side dynamic pages, Node.js, server-side CGI, etc.
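
As a small illustration of the three file types above, the sketch below maps a file extension to a MIME type. It assumes the server reports the type in a Content-Type header; the TritonHTTP spec is the authority on which headers are required. The names CONTENT_TYPES and content_type_for are hypothetical and not part of the starter code.

```python
import os

# Standard MIME types for the three formats this assignment requires.
CONTENT_TYPES = {
    ".html": "text/html",
    ".jpg": "image/jpeg",
    ".png": "image/png",
}

def content_type_for(path):
    """Return the MIME type for a file path, or None if the type is unsupported."""
    extension = os.path.splitext(path)[1].lower()
    return CONTENT_TYPES.get(extension)
```

Returning None for unknown extensions leaves the decision about how to respond to the caller, since that behavior depends on the spec.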

Mapping relative URLs to absolute file paths

Clients request files using a Uniform Resource Locator (URL), such as /images/crypto/enigma.jpg. One of the key things to keep in mind in building your web server is that the server must translate that relative URL into an absolute filename on the local filesystem. For example, you might decide to keep all the files for your server in ~aturing/cse101/server/www-files/, which we call the document root. When your server gets a request for the above-mentioned enigma.jpg file, it will prepend the document root to the requested path to get an absolute file name of ~aturing/cse101/server/www-files/images/crypto/enigma.jpg.

You need to ensure that malformed or malicious URLs cannot “escape” your document root to access other files. For example, if a client submits the URL /images/../../../../.ssh/id_dsa, they should not be able to download the ~aturing/.ssh/id_dsa file. If a client uses one or more .. directories in such a way that the server would “escape” the document root, you should return a 404 Not Found error to the client. Take a look at the realpath() function (os.path.realpath() in Python) for help in dealing with document roots.
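
The check described above can be sketched as follows. This is one possible approach, not the required one: it resolves both the document root and the requested path with os.path.realpath() and then confirms the result still lies under the root. The function name resolve_path is hypothetical.

```python
import os

def resolve_path(doc_root, url_path):
    """Map a request URL onto a file under doc_root.

    Returns the absolute path, or None if the URL escapes the document root
    (in which case the server would answer 404 Not Found).
    """
    root = os.path.realpath(doc_root)
    # Strip leading slashes so os.path.join() does not discard the root.
    candidate = os.path.realpath(os.path.join(root, url_path.lstrip("/")))
    # commonpath() compares whole path components, so a sibling directory
    # such as /www-files-evil is not mistaken for /www-files.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

Because realpath() also collapses trailing slashes and relative components, the same function accepts a document root given as “/var/home/htdocs”, “/var/home/htdocs/”, or “../../htdocs/”. Checking that the resolved file actually exists is left to the caller.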

Program structure

At a high level, your program will be structured as follows.

Initialize

We will provide you with starter code that handles command-line arguments, and will call into your Python code with a port and the document root. Note that the document root and port number will be parameters that are passed into your program. Do not hard code file paths or ports, as we will be testing your code against our own document root. Also do not assume that the files to serve are in the same directory as the web server. We will call your program with either an absolute or relative path to the document root that may or may not end in a final forward slash: e.g., “/var/home/htdocs”, “/var/home/htdocs/”, or “../../htdocs/”.

Setup server socket and threading

Create a TCP server socket, and arrange for a thread to be spawned (or a thread from a thread pool to be retrieved) when a new connection comes in. The use of multiprogramming via “fork” is OK too.
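
A minimal sketch of the thread-per-connection arrangement, using only the standard socket and threading modules. The function names are hypothetical, and handle_connection simply echoes bytes back as a placeholder; a real handler would parse TritonHTTP requests and send responses instead.

```python
import socket
import threading

def handle_connection(conn, addr):
    """Placeholder handler: echo one chunk back, then close the connection."""
    with conn:
        data = conn.recv(4096)
        if data:
            conn.sendall(data)

def make_server_socket(port, host=""):
    """Create a listening TCP socket bound to (host, port)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts without waiting for old connections to time out.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen()
    return server

def accept_loop(server):
    """Accept connections forever, spawning one thread per client."""
    while True:
        conn, addr = server.accept()
        worker = threading.Thread(
            target=handle_connection, args=(conn, addr), daemon=True
        )
        worker.start()

# Usage (blocks forever): accept_loop(make_server_socket(8080))
```

Spawning a thread per connection is the simplest of the options mentioned above; a thread pool bounds resource use under load at the cost of a little more code.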

Executable

Your server program should be called httpd.py and should take two arguments. The first should be the port number, and the second should be the doc-root (given as either an absolute or relative path, with or without the trailing ‘/’):

$ python3 httpd.py [port] [doc_root]

for example:

$ python3 httpd.py 8080 /var/www/html

Implementation

You should use Python3 to build your web server.

You must program the network directly with socket calls. You cannot use third-party web server or HTTP libraries.

Grading

Basic functionality for 200 status code responses (50 pts)

Basic functionality for non-200 error code responses (40 pts)

Concurrency (10 pts)

Autograder

Gradescope will run an autograder with its own htdocs directory filled with HTML, JPG, and PNG files (and subdirectories). Before the deadline, Gradescope will only provide you with a very basic sanity check that your code runs against a simple test. It is your responsibility to ensure that your code precisely follows the TritonHTTP spec. The final autograder will include test cases not included in the version provided to you before the deadline.

Starter code (New)

To get a copy of the starter code, please use this invitation.

Submitting your work

Log into gradescope.com and upload your code. This assignment is to be done individually, or in a group of 2. If you choose to be in a group of 2, your group must be the same as in HW 3.

Due date/time

Friday Oct 26, 5pm

Points

This assignment is worth 10 points

Assigned TA

Bhargav Heeraguppe Sridharan