Errata

  • 2/19: Updated the arguments of the metadata server script to include an ‘id’ field

  • 2/22: Java starter code is available at this link
  • 2/22: Python starter code is available at this link

  • 2/24: More specific information on the semantics of delete and update.

  • 2/28: Updated the description of Test 5

  • 3/1: Added a subsection on version numbers. For project 2 part 1, you can assume the version number is always 1. For HW7/8, the version number might be greater than 1.

  • 3/1: You can log whatever you want to stderr, but keep stdout limited to the text indicated in the write-up

  • 3/1: If you get a No handlers could be found for logger... message, try importing the logging module and then calling logging.basicConfig().

  • 3/1: If you choose to compile in C++, you may find this compile command helpful:
    • g++ -D_GLIBCXX_USE_CXX11_ABI=0 -DHAVE_INTTYPES_H -DHAVE_NETINET_IN_H -Wall -o (output name here) -I./usr/include -I/usr/include/thrift (YOUR CLIENT).cpp gen-cpp/*.cpp -lthrift
  • 3/2: Clarified that we’re using your client with your server

Overview

In this project you are going to create a distributed file storage system similar to Dropbox called TritonTransfer. TritonTransfer consists of two services: a Metadata service and a BlockStore service, as well as a Client command-line program that is used to upload and download files. Files are broken into blocks, which are stored in the BlockStore service. Finally, all information about the mapping of file names to blocks (and thus to the block server) is kept in the metadata service.

The project is structured into two parts:

  • Part 1: The metadata service is implemented as a single server that keeps all data in memory (no fault tolerance)
  • Part 2: The metadata service is implemented across a set of distributed servers that keep the data consistent despite network or server failures (fault tolerance).

Learning objectives

The goal of this project is to create a file transfer system using Apache Thrift that enables a user to upload and download files. During this project you will:

  • Implement a protocol for block-based file transfer, modeled on Dropbox
  • Develop a distributed service using an RPC framework
  • Implement fault tolerance via read and write quorums (Part 2)
  • Develop a testing strategy for implementing a fail-safe cluster with multiple servers (Part 2)
  • Use Git and GitHub for source code management

Logistics

  • This project should be done individually.
  • The project is due on the last day of class, as shown on the course schedule.
  • There is no longer a checkpoint for part 1; however, you should try to complete it by March 8th so you’ll have time to complete the homework that extends the metadata service to support replication and fault tolerance.
  • You will need to submit the project through the same GitHub account you used for HW1 (URL to be provided soon).
  • The GitHub invitation and starter code is available at https://classroom.github.com/assignment-invitations/3a73065ef08d83063364e83e4ad6960c

TritonTransfer Specification

In this project you are going to implement a subset of the Dropbox protocol. For part 1, you will create a single in-memory metadata server, a single block server, and a client for the file storage system.

Basic concepts

Blocks

A file in TritonTransfer is broken into multiple blocks. Each block is of uniform size (4MB) except for the last block in the file, which may be smaller. As an example, consider:

File Block Example

The file ‘video.avi’ is 14 MB, and the block size is 4MB. The file is broken into blocks b1, b2, b3, and b4 (which is only 2MB). For each block, a hash value is generated using the SHA-256 hash. So for video.avi, those hashes will be [h1, h2, h3, h4] in the same order as the blocks. This set of hash values in order represents the file, and is referred to as the blocklist. Note that if you are given a block, you can compute the hash by applying the SHA-256 hash function to the block. This also means that if you change data in a block the hash value will change as a result.
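
As a concrete illustration, a Python client (Python is one of the languages with starter code) might compute a blocklist as sketched below. The 4MB block size comes from the spec; representing each hash as a hex string is just one choice, and your Thrift definitions may use raw bytes instead.

    import hashlib

    BLOCK_SIZE = 4 * 1024 * 1024  # 4MB, per the spec

    def compute_blocklist(path):
        """Split a file into 4MB blocks; return the ordered hash list and a hash -> block map."""
        hashes, blocks = [], {}
        with open(path, 'rb') as f:
            while True:
                block = f.read(BLOCK_SIZE)
                if not block:
                    break
                h = hashlib.sha256(block).hexdigest()  # hex string; raw digest bytes also work
                hashes.append(h)
                blocks[h] = block
        return hashes, blocks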

Client

The client is a command-line tool that the user uses to upload and download files from TritonTransfer. Further, the client can delete files from the data store, and it can also update files by overwriting a newer version of the file. The version number of a file is simply the timestamp of that file. You can assume that timestamps are valid, and that all servers in the system are using synchronized clocks (you do NOT need to handle clock skew).

Block server

The block server is an in-memory data store that stores blocks of data, indexed by the hash value. Thus it is a key-value store. It supports a basic get(), put(), and delete() interface to get a block, add a new block, or remove a block. The block server only knows about blocks; it doesn’t know anything about how blocks relate to files.
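
Since the block server is just an in-memory key-value store, its core state can be as simple as a dictionary keyed by hash. The Python sketch below is illustrative only; the method names are placeholders, and the real signatures come from blockserver.thrift.

    class BlockStoreHandler(object):
        """Illustrative in-memory block store; method names are placeholders."""
        def __init__(self):
            self.blocks = {}               # hash -> block bytes

        def putBlock(self, h, block):
            self.blocks[h] = block

        def getBlock(self, h):
            return self.blocks.get(h)      # the real API may need to signal a missing block

        def deleteBlock(self, h):
            self.blocks.pop(h, None)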

Metadata server

The metadata server maintains the mapping of filenames to blocklists. All metadata is stored in memory, and no database systems or files will be used to maintain the data. Data stored in the metadata server need not persist between invocations of the server, meaning that when you start up an instance of TritonTransfer, the store is “empty”.

Basic operating theory

When the client wants to upload a file from the user’s computer to the TritonTransfer service, it first reads that file and splits it into blocks, as described above. It then computes the hash value of each block to form a blocklist. It then contacts the Metadata server and invokes the storeFile() API, passing it the filename and the blocklist (i.e., the list of hash values).

The metadata server is responsible for storing the mapping of filenames to blocklists, and the BlockServer is responsible for storing the actual blocks themselves. So there is something of a race condition that might develop when a client wants to store a new file in TritonTransfer. In particular, to store a file in TritonTransfer, the client must call storeFile() in the metadata server, AND then it must store each of the data blocks in the BlockServer before the file is actually in the service. What might happen if the metadata server is updated (after a call to storeFile()), but then another client tries to download the file before all the blocks are stored in the BlockServer?

To prevent this race condition, the protocol we’re using works as follows. When the client does a storeFile() operation, the metadata server is going to query the BlockServer for each of the hash values in the blocklist, to see what blocks the BlockServer is storing (and which blocks it is not). If any blocks are missing from the BlockServer, the metadata server will reply back to the client with a list of missing blocks. The metadata server will not create the filename to blocklist mapping if any blocks are not present in the BlockServer. Only when all the blocks are in the BlockServer will the metadata server signal a success return value to the client’s storeFile() operation, and from then on the file is available to any clients that want to download it.

As an example:

File Upload Example
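
In code, the upload loop might look like the Python sketch below. The storeFile() return shape and the block-server method name are assumptions here; the authoritative signatures are in metadataServer.thrift and blockserver.thrift.

    def upload(metadata_client, block_client, filename, path):
        hashes, blocks = compute_blocklist(path)                  # from the Blocks sketch above
        while True:
            resp = metadata_client.storeFile(filename, hashes)    # assumed signature
            missing = resp.missingBlocks                          # assumed field name
            if not missing:
                return "OK"                                       # mapping created; file is stored
            for h in missing:
                block_client.putBlock(h, blocks[h])               # illustrative RPC name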

To download a file, the client invokes the getFile() API call on the metadata server, passing in the filename. The metadata server simply returns the blocklist to the client. The client then downloads the blocks from the BlockServer to form the complete file. As an example:

File Download Example

Note that an optimization the client must perform is to avoid copying unneeded blocks from the BlockServer. Many files in Dropbox have regions that are exactly the same across different files. For example, two video files may have overlapping regions that are exactly the same, or a user might have two copies of the same mp3 song file with different names. The client must make sure to only transfer blocks from the BlockServer that it doesn’t already have. To implement this functionality, the client must scan all the files in the base directory when it starts up, building the list of blocks it already has locally, before uploading or downloading files from the service.

In particular, if you download two files with identical contents (but different names), the client should download each block only once, not twice.
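
A download routine that honors this requirement might look like the Python sketch below: scan the base directory once at startup to learn which blocks are already local, then fetch only the missing ones. getFile() is assumed here to return just the list of hashes; check metadataServer.thrift for the real return type.

    import os

    def scan_local_blocks(base_dir):
        """Hash every file in the base directory; return a hash -> block map."""
        local = {}
        for name in os.listdir(base_dir):
            path = os.path.join(base_dir, name)
            if os.path.isfile(path):
                _, blocks = compute_blocklist(path)
                local.update(blocks)
        return local

    def download(metadata_client, block_client, filename, base_dir):
        local = scan_local_blocks(base_dir)
        blocklist = metadata_client.getFile(filename)     # assumed to return the hash list
        data = []
        for h in blocklist:
            if h not in local:
                local[h] = block_client.getBlock(h)        # illustrative RPC name
            data.append(local[h])
        with open(os.path.join(base_dir, filename), 'wb') as f:
            f.write(b''.join(data))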

Client <–> Metadata server protocol

In the skeleton code that we’ve provided you, we have defined the Client to metadata server protocol. We will be testing your server with your client. You must follow the API specified in the metadataServer.thrift file (and relevant portions of shared.thrift).

Client <–> BlockServer operations

In the skeleton code that we’ve provided you, we have defined the Client to BlockServer protocol. We did this so that we can test your system with our client. You must follow the API specified in the blockserver.thrift file (and relevant portions of shared.thrift).

Metadata server <–> BlockServer operations

You will need to define your own protocol between the metadata server and the blockserver. At a high level, your protocol will need to permit the metadata server to query the existence of a block in the BlockServer’s data store. It is up to you how to implement this.
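
For example, one hypothetical shape for this protocol is a single hasBlock() RPC exposed by the BlockServer, which the metadata server calls for each hash it receives in a storeFile() request:

    def find_missing_blocks(block_client, blocklist):
        """Return the subset of hashes the BlockServer does not yet hold (hasBlock is your own RPC)."""
        return [h for h in blocklist if not block_client.hasBlock(h)]

A batched variant (pass the whole blocklist and get back the missing hashes in one call) would also work and saves round trips; either design is acceptable since this protocol is yours to define.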

Version number support

In “part 2” of this project (HW 7/8), files will be associated with version numbers. Behavior of the system with version numbers is provided in that write-up. For project 2 (part 1), you can assume that the version number will always be 1.

Implementation details

Configuration file

For this project, you will create a Configuration file describing the cluster details, as follows:

config.txt

    M: 1
    metadata1: <port>
    block: <port>
  • The initial line M defines the number of Metadata servers. For part 1 this value will always be 1.

  • The ‘metadata1’ line specifies the port number of your metadata server. Note the ‘1’ after the word metadata in the key.

  • The ‘block’ line denotes the port number of your BlockServer.

This config file will be available to the client and both servers when they are started. It tells the server or client the cluster information, including how many metadata servers are present in the service (for part 2). Note that you’re going to run the client, the BlockServer, and the Metadata server all on the same machine, just on different ports.
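
Parsing this format is straightforward; a minimal Python sketch (the helper name is illustrative):

    def read_config(path):
        """Parse config.txt into a dict, e.g. {'M': '1', 'metadata1': '<port>', 'block': '<port>'}."""
        config = {}
        with open(path) as f:
            for line in f:
                key, _, value = line.partition(':')
                if key.strip():
                    config[key.strip()] = value.strip()
        return config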

Client

When a client boots up, it will be provided a base directory.

  • The base directory may be empty or have some files in it. The client should scan the directory for files when it starts and process them into blocks, so that it has a list of the blocks already present at the client.

  • Keeping a list of locally present blocks is important, since when downloading a new file from the server, only the blocks that are not locally present need to be downloaded.

  • The client will print “ERROR” on the console if any error happens during the process, for example if it cannot connect to the metadata server or the block server.

  • Your client can be written in any language supported by Thrift (and available on the ieng6 machines). To provide a uniform interface, please wrap your client in a shell script:

    ./runClient.sh <config_file> <base_dir> <command> <filename>

where

  • runClient.sh: A shell script that will invoke your client executable
  • config_file: The full filename of the config file
  • base_dir: Path to keep the downloaded files in (and from where you’ll upload files). Note that you don’t have to handle local or remote subdirectories; everything goes in a single directory.
  • command: Either “download”, “upload”, or “delete”
  • filename: The name of the file to be downloaded, uploaded, or deleted

runClient.sh should be present in your repo. It is just a wrapper that invokes your client, for ease of grading. Since you may code in a different language than your peers, the shell script allows the grading tool to run your client in a language-agnostic way.

Metadata server

The metadata server should be started by using a shell script called runMetaServer.sh

    ./runMetaServer.sh <config_file> <id>

where

  • config_file: The configuration file
  • id: The ID of the metadata server to start. For part 1, this will be ‘1’

By reading the configuration file, the metadata server will know what port it should listen on, and also where the BlockServer is located.

Block server

The block server should be started by using a shell script called runBlockServer.sh

    ./runBlockServer.sh <config_file>

where

  • config_file: The configuration file

By reading the configuration file, the BlockServer will know what port it should listen on.
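
Wiring a handler to a Thrift server is mostly boilerplate. A Python sketch follows; the generated module and service names (shown here as "BlockServer") depend on what blockserver.thrift declares, and BlockStoreHandler/read_config refer to the earlier sketches.

    import sys
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from thrift.server import TServer
    # from blockserver import BlockServer   # generated by the Thrift compiler; actual module name varies

    def main(config_path):
        cfg = read_config(config_path)                  # see the configuration-file sketch
        processor = BlockServer.Processor(BlockStoreHandler())
        transport = TSocket.TServerSocket(port=int(cfg['block']))
        server = TServer.TThreadedServer(processor, transport,
                                         TTransport.TBufferedTransportFactory(),
                                         TBinaryProtocol.TBinaryProtocolFactory())
        server.serve()                                  # handles each connection in its own thread

    if __name__ == '__main__':
        main(sys.argv[1])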

Build script

There will be a file called build.sh that, when called, will build your code on ieng6. It is critical that your code work and be tested on ieng6, as we’ll be using that server to test your code. If you use Python, it is OK if build.sh simply does nothing. If you use Java, build.sh should call javac or ant, etc. You can have build.sh call ‘make’ if you’re using C or C++.

Update and delete (Updated 2/24)

If a client wants to update a file, it simply calls uploadFile() with a newer version number (which is simply the modification time of the file in seconds). As with the upload of a new file, the client should only transfer blocks that aren’t already on the BlockServer.

To delete a file, the client invokes a delete operation on the metadata server. The metadata server will remove the mapping of the filename to the blocklist to carry out the deletion. It is NOT necessary for the client to “garbage collect” unused blocks off the BlockServer. In other words, only the metadata server’s state need be updated.
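
A sketch of the corresponding metadata server state, assuming Python; the names here are illustrative, and the client-facing signatures come from metadataServer.thrift.

    class MetadataHandler(object):
        def __init__(self):
            self.files = {}    # filename -> (version, blocklist)

        def commit_mapping(self, filename, version, blocklist):
            # Only called once every block is confirmed present on the BlockServer;
            # an upload with a newer version simply overwrites the old mapping.
            self.files[filename] = (version, blocklist)

        def deleteFile(self, filename):
            # Deletion only removes the mapping; unused blocks on the
            # BlockServer are not garbage collected.
            self.files.pop(filename, None)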

Runtime

You will be using Apache Thrift for the project. The baseline functions that your server and client need to implement are given via Thrift files. You will write your client, metadata server, and block server programs using the bindings created by the Thrift compiler.

You must follow the client to metadata server API and client to blockserver APIs given in the skeleton code. You are free to implement your own metadata to BlockServer API.

A good place to start would be to write some sample Thrift code and see how the Thrift compiler generates the bindings, then write your own client and server code using them.

Here are a few good places to start:

  • https://diwakergupta.github.io/thrift-missing-guide/
  • http://thrift.apache.org/tutorial/
  • Download the thrift source code from http://www.apache.org/dyn/closer.cgi?path=/thrift/0.10.0/thrift-0.10.0.tar.gz, unzip it, and look at the tutorials folder. You will find many examples inside.

For the curious, if you would like to know the internals, look at the paper published by Facebook at http://thrift.apache.org/static/files/thrift-20070401.pdf

Grading and testing

We are going to run your metadata server and BlockServer, and use our own client to test your service. We will start with a clean state where the base directory is empty, and we will:

  • Upload a few files into your data store
  • Download those files and make sure that the downloaded files match the originals
  • Upload/download files with blocks in common, and make sure that your client doesn’t download blocks it already has.
  • Delete files and verify that the client can no longer download them after they’ve been deleted.

The points are going to be assigned as follows:

  • 100% correctness of uploading, downloading, and deleting files from your store.

We are not providing an autograder since you will be able to run very similar tests as what we’ll do, by simply uploading and downloading various files and checking that they match the originals and don’t get corrupted.

Testing logistics

Testing document

Test 1

  • This first test will ask your client to upload a file. We check that the upload process is successful, that the client prints “OK”, and that no ERROR occurs during the course of the test.

Test 2

  • We then instruct your client to download the same file. We verify that the process was OK and that the client printed “OK” to the console.
  • We “diff” the original file and the downloaded file and verify that they are the same.

Test 3

  • Your client will be instructed to upload a file. We check that the upload completes without any errors.
  • Your client will then be asked to upload another file with slightly modified contents. Only a few blocks will be different from the first file.
  • We check whether the metadata server instructs the client to store only the blocks that differ, since the common blocks were already uploaded to the block server in the previous operation.
  • Last but not least, we check the output of the client.

Test 4

  • We start from scratch. Your download directory will have a file in it. When your client starts, it is expected to scan the directory and store the file and hash information in your in-memory data structure, as per the project specification.
  • We start your client and ask it to upload a different file with almost identical contents, differing by just a few blocks. The client is expected to upload successfully, print its status, and exit.
  • We then issue a download request for the recently uploaded file. When the client starts up, it hashes the existing file in the download directory, which has most of the blocks of the requested file. The client is expected to fetch the file information from the metadata server and check whether any of the blocks are present locally. If so, it should only ask the block server for the blocks that it doesn’t have locally.
  • Finally, we diff the contents of the downloaded file with the original.

Test 5: For Part 2 (HW)

  • We ask your client to upload a file.
  • We kill one of your metadata servers.
  • Since a write quorum of nodes is still available, uploads should continue to receive successful responses. We can also continue to update files that have already been written.
  • We then ask the client to download any uploaded files, and compare them via diff to the originals to make sure the files match.

NOTE: We diff the original and downloaded files in all tests to ensure that they are the same. We will also be checking the amount of data transferred between the servers and the client to ensure that only the necessary blocks are communicated.