In this project, you are going to build a simple webserver that implements a subset of the HTTP/1.1 protocol specification called TritonHTTP, defined here.
Basic web server functionality
At a high level, a web server listens for connections on a socket (bound to a specific address and port on a host machine). Clients connect to this socket and use the TritonHTTP protocol to retrieve files from the server. For this project, your server will need to be able to serve HTML files as well as images in jpg and png formats. You do not need to support server-side dynamic pages, Node.js, server-side CGI, etc.
Mapping relative URLs to absolute file paths
Clients request files using a Uniform Resource Locator (URL), such as
/images/crypto/enigma.jpg. One of the key things to keep in mind in building
your web server is that the server must translate that relative URL into an
absolute filename on the local filesystem. For example, you might decide to
keep all the files for your server in a single directory,
which we call the document root. When your server gets a request for the
above-mentioned enigma.jpg file, it will prepend the document root to the
specified file to get an absolute file name of
~aturing/cse101/server/www-files/images/crypto/enigma.jpg. You need to
ensure that malformed or malicious URLs cannot “escape” your document root to
access other files. For example, if a client submits the URL
/images/../../../.ssh/id_dsa, they should not be able to download the
~aturing/.ssh/id_dsa file. If a client uses one or more .. directories in
such a way that the server would “escape” the document root, you should return
a 404 Not Found error to the client. Take a look at the
system call for help in dealing with document roots.
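The escape check described above can be sketched in Python as follows. This is a minimal sketch, not the required implementation: the helper name resolve_path is hypothetical, and using os.path.realpath together with os.path.commonpath is one possible approach among several.

```python
import os

def resolve_path(doc_root, url_path):
    """Map a URL path to an absolute file path under doc_root.

    Returns None when the URL would escape the document root.
    (Hypothetical helper; the name and return convention are assumptions.)
    """
    root = os.path.realpath(doc_root)
    # Strip leading slashes so join() treats the URL as relative to the root.
    candidate = os.path.realpath(os.path.join(root, url_path.lstrip("/")))
    # commonpath equals root only if candidate is the root or lies beneath it.
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

A caller would answer with 404 Not Found whenever resolve_path returns None, and also when the returned path does not name an existing file.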
At a high level, your program will be structured as follows.
We will provide you with starter code that handles command-line arguments and calls into your Python code with a port and the document root. Note that the document root and port number are parameters passed into your program; do not hard-code file paths or ports, as we will be testing your code against our own document root. Also, do not assume that the files to serve are in the same directory as the web server. We will call your program with either an absolute or relative path to the document root that may or may not end in a final forward slash: e.g., “/var/home/htdocs” and/or “/var/home/htdocs/”, or “../../htdocs/”.
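Since the document root may arrive in any of these forms, it can help to normalize it once at startup. The following is a small sketch under that assumption; normalize_doc_root is a hypothetical helper name, not part of the starter code.

```python
import os

def normalize_doc_root(doc_root):
    """Return an absolute doc root with no trailing slash.

    os.path.abspath resolves a relative path against the current working
    directory and collapses redundant separators and ".." components.
    (Hypothetical helper; not part of the starter code.)
    """
    return os.path.abspath(doc_root)
```

With this, “/var/home/htdocs” and “/var/home/htdocs/” normalize to the same string, and “../../htdocs/” becomes an absolute path relative to the directory the server was started from.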
Setup server socket and threading
Create a TCP server socket, and arrange so that a thread is spawned (or thread in a thread pool is retrieved) when a new connection comes in. The use of multiprogramming via “fork” is OK too.
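A thread-per-connection setup along these lines might look like the sketch below. The function names create_server_socket and accept_loop are hypothetical, and the handler is whatever function you write to parse the request and send the response; a thread pool or fork-based design would be structured differently.

```python
import socket
import threading

def create_server_socket(port, host=""):
    """Create a listening TCP socket bound to (host, port)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts without waiting for old sockets to time out.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen()
    return server

def accept_loop(server, handler):
    """Accept connections forever, handing each one to a new thread.

    handler is called as handler(conn, addr) and is responsible for
    closing conn when it is done.
    """
    while True:
        conn, addr = server.accept()
        thread = threading.Thread(target=handler, args=(conn, addr), daemon=True)
        thread.start()
```

Because each connection runs in its own thread, a slow client does not block the accept loop from serving others.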
Your server program should be called httpd.py and should take two arguments.
The first should be the port number, and the second should be the doc-root
(given as either an absolute or relative path, with or without the trailing
slash):
$ python3 httpd.py [port] [doc_root]
$ python3 httpd.py 8080 /var/www/html
You should use Python 3 to build your web server.
You must program the network directly with socket calls. You cannot use third-party web server or HTTP libraries.
Basic functionality for 200 error code responses (50 pts)
- This category represents error-free, valid requests that result in a
200 status code. Your server should correctly handle valid
GET requests for HTML, JPEG, and PNG files.
- The response headers should be set correctly
- The response body should match the content
- You should support directories and subdirectories
- “http://server:port/” should be mapped to “http://server:port/index.html”
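As a rough illustration of the items above, the sketch below assembles a 200 response for a file. It rests on assumptions this page does not confirm: the header names (Content-Type, Content-Length), the extension-to-type table, and the helper names are illustrative only, so check the exact required headers against the TritonHTTP spec.

```python
import os

# Assumed extension-to-type table; verify against the TritonHTTP spec.
CONTENT_TYPES = {".html": "text/html", ".jpg": "image/jpeg", ".png": "image/png"}

def default_document(url_path):
    """Map the root URL "/" to "/index.html"; leave other paths unchanged."""
    return "/index.html" if url_path == "/" else url_path

def build_ok_response(file_path):
    """Return the raw bytes of a 200 response carrying the given file.

    (Hypothetical helper; the header set shown here is an assumption.)
    """
    with open(file_path, "rb") as f:
        body = f.read()
    ext = os.path.splitext(file_path)[1].lower()
    content_type = CONTENT_TYPES.get(ext, "application/octet-stream")
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    # Headers are ASCII text; the body is sent as raw bytes, unmodified.
    return head.encode("ascii") + body
```

Reading the file in binary mode matters here: JPEG and PNG bodies must reach the client byte for byte, and Content-Length must count bytes, not characters.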
Basic functionality for non-200 error code responses (40 pts):
- Handles 404 for files that aren’t found
- Handles 404 for URLs that escape the doc root
- Correctly handles malformed HTTP requests by issuing a 400 error
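One way to separate malformed requests from valid ones is to validate the request line before touching the filesystem. The sketch below assumes a request line of the form "GET <path> HTTP/1.1" and treats anything else as malformed; the TritonHTTP spec may impose further rules (for example on headers) that are not shown, and both helper names are hypothetical.

```python
def parse_request_line(line):
    """Return the URL path from a request line, or None if it is malformed.

    Assumes the form "GET <path> HTTP/1.1" with single spaces; this is a
    simplification, not the full TritonHTTP grammar.
    """
    parts = line.split(" ")
    if len(parts) != 3:
        return None
    method, path, version = parts
    if method != "GET" or version != "HTTP/1.1" or not path.startswith("/"):
        return None
    return path

def build_error_response(code):
    """Return a minimal bodyless response for 400 or 404 (headers omitted)."""
    reasons = {400: "Bad Request", 404: "Not Found"}
    return f"HTTP/1.1 {code} {reasons[code]}\r\n\r\n".encode("ascii")
```

A handler could then send build_error_response(400) when parse_request_line returns None, and build_error_response(404) when the path is missing or escapes the document root.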
Concurrency (10 pts):
- Your server should be able to handle concurrent clients using threads
Gradescope will run an autograder with its own htdocs directory filled with HTML, JPG, and PNG files (and subdirectories). Gradescope will only provide you with a very basic sanity check that your code runs against a simple test. It is your responsibility to ensure that your code precisely follows the TritonHTTP spec. The final autograder will include test cases not included in the version provided to you before the deadline.
Starter code (New)
To get a copy of the starter code, please use this invitation.
Submitting your work
Log into gradescope.com and upload your code. This assignment is to be done individually, or in a group of 2. If you choose to be in a group of 2, your group must be the same as in HW 3.
Friday Oct 26, 5pm
This assignment is worth 10 points
Bhargav Heeraguppe Sridharan