CSE 124
2017 October 11: Homework 1 faq

Homework 1 FAQ

Authorized resources

Just a reminder regarding resources that you can use in your assignments. Anything from the book is fine, as is any code I provide to you in demos, course materials, etc. Standard libraries installed on the ieng6 servers are also fine (e.g., std::string, strncmp).

Code snippets from StackOverflow or similar sites aren’t allowed.

If in doubt, please ask; I’m happy to let you know whether a resource is allowed.

Updated due date

The homework 1 due date has been pushed back to Wednesday Oct 11th at 5pm.

Printing standard int types

If you want to print 64-bit ints (e.g., uint64_t), add the following to the top of your .c file:

#define __STDC_FORMAT_MACROS
#include <inttypes.h>

An example of printing:

printf("%" PRIu64 "\n", (uint64_t)42);

Examining non-printable characters like CR and LF

If you want to examine individual characters from a file, I wrote a simple tool called print-eof-style.sh in the tools/ directory that will tell you how many lines end with CR-LF.

As I’ll mention in class on Oct 4, you can also run ‘hexdump -C’ on the file, and it will give you output that shows each byte in hex.

In that output, the first line of the file appears as “SET 5” followed by “0d” (CR) and then “0a” (LF), followed by the next line.
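If you would rather inspect the bytes from inside your own program, here is a minimal sketch of a helper that makes CR and LF visible. The name escape_bytes and its signature are hypothetical (they are not part of the starter code); it copies printable characters as they are and rewrites every other byte as \xHH.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes a readable version of data[0..len) into out,
 * replacing each non-printable byte with the four characters \xHH.
 * Returns the number of characters written, not counting the final NUL.
 * The output is truncated if out_size is too small. */
size_t escape_bytes(const uint8_t *data, size_t len, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) {
        return 0;
    }
    for (size_t i = 0; i < len; i++) {
        if (isprint(data[i])) {
            if (used + 1 >= out_size) {
                break;                      /* keep room for the NUL */
            }
            out[used++] = (char)data[i];
        } else {
            if (used + 4 >= out_size) {
                break;                      /* "\xHH" is 4 chars, plus the NUL */
            }
            snprintf(out + used, out_size - used, "\\x%02x", (unsigned)data[i]);
            used += 4;
        }
    }
    out[used] = '\0';
    return used;
}
```

Calling it on the seven bytes of “SET 5” followed by CR and LF produces the text SET 5\x0d\x0a, which you can then print with printf("%s\n", out).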

How to read from the file

The purpose of the homework is to give you practice in finding and processing two-byte delimiters in streams of data (which is a key part of project 1). For that reason, I’d recommend reading the file into a fixed-size buffer, which is going to be most similar to your web server code. Once you open the file, you can use the ‘fread()’ library function to read into, e.g., a 128-byte buffer:

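uint8_t buf[128];
size_t readAmount;

while (true) {  /* in C, 'true' needs #include <stdbool.h> */
  /* fread() returns the number of items read; 0 means end of file or an error */
  readAmount = fread(buf, sizeof(uint8_t), sizeof(buf), input_stream);
  printf("Got %zu bytes\n", readAmount);  /* %zu is the format for size_t */

  if (readAmount == 0) {
    printf("Reached the end of the file\n");
    break;
  }
}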

Now, you won’t want to print out the “Got … bytes” line or the “reached the end of the file” line. But that basic loop will get you started in terms of looking for the delimiters.
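One thing to watch for with a small buffer: a CR-LF pair can be split across two reads, with the CR as the last byte of one chunk and the LF as the first byte of the next. Here is a minimal sketch of one way to handle that, by remembering whether the previous byte was a CR. The function name count_crlf is hypothetical, and counting delimiters is only a stand-in for whatever processing you actually need to do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: counts CR-LF pairs in a stream, reading it in
 * 128-byte chunks like the loop above. */
size_t count_crlf(FILE *input_stream)
{
    uint8_t buf[128];
    size_t readAmount;
    size_t count = 0;
    bool prev_was_cr = false;   /* carries state across chunk boundaries */

    while ((readAmount = fread(buf, sizeof(uint8_t), sizeof(buf), input_stream)) > 0) {
        for (size_t i = 0; i < readAmount; i++) {
            if (prev_was_cr && buf[i] == '\n') {
                count++;        /* found a complete CR-LF delimiter */
            }
            prev_was_cr = (buf[i] == '\r');
        }
    }
    return count;
}
```

Because prev_was_cr lives outside the inner loop, a delimiter that straddles two chunks is still counted exactly once.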

How to read in large files

For this homework, the examples we will provide will not be bigger than 8,192 bytes (8KB). So you can use an 8KB buffer and be sure you’ll get the entire file.

But what if the file that you’re reading is very big, maybe gigabytes or terabytes in size? This can easily occur in networked systems since we don’t always know how much data we’re going to receive over the network. For example, a large high-resolution video might be tens of gigabytes in size, and we don’t want to create a 10+ GB buffer in memory to hold the entire video. We will study two approaches to addressing this problem in the context of project 1. The first is to read data from the network in fixed-size chunks, appending those chunks to a dynamically growing array or vector. The second is to use the sendfile() system call to instruct the operating system to handle the transfer on your behalf.
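To give a feel for the first approach, here is a minimal sketch in C that reads a stream in 128-byte chunks and appends each chunk to a heap buffer that doubles in size when it fills up. The name read_all, the starting capacity of 256, and the doubling policy are all assumptions made for illustration; this is not the project 1 code.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads everything from 'in' into a growing buffer.
 * On success, stores the byte count in *out_len and returns the buffer,
 * which the caller must free(). Returns NULL with *out_len == 0 if the
 * stream is empty or if memory runs out. */
uint8_t *read_all(FILE *in, size_t *out_len)
{
    uint8_t chunk[128];
    uint8_t *data = NULL;
    size_t len = 0, cap = 0, n;

    *out_len = 0;
    while ((n = fread(chunk, 1, sizeof(chunk), in)) > 0) {
        if (len + n > cap) {
            /* Grow geometrically so appends stay cheap on average. */
            size_t new_cap = (cap == 0) ? 256 : cap * 2;
            while (new_cap < len + n) {
                new_cap *= 2;
            }
            uint8_t *grown = realloc(data, new_cap);
            if (grown == NULL) {
                free(data);
                return NULL;
            }
            data = grown;
            cap = new_cap;
        }
        memcpy(data + len, chunk, n);   /* append this chunk */
        len += n;
    }
    *out_len = len;
    return data;
}
```

Note that this still holds the whole input in memory, so it helps when you don’t know the size in advance, not when the data is too big to fit.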

No negative values

I realized that example1.txt causes the number to go negative. We won’t test with any negative numbers in our tests! That means that at no time will your accumulator need to hold a negative number. If you just implement the add/sub/mul instructions with an unsigned 64-bit accumulator, you’ll very likely get the right answer anyway (-1 will be represented by UINT64_MAX). But rest assured, our tests won’t cover that.
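To see why the right answer tends to come out anyway, here is a small sketch of the three arithmetic instructions on a uint64_t accumulator. The function names are hypothetical. Unsigned arithmetic in C wraps around modulo 2^64, so a value that dips “below zero” comes back to the expected result once later operations bring it positive again.

```c
#include <stdint.h>

/* Hypothetical accumulator helpers; the names are illustrative only.
 * Unsigned overflow and underflow are well defined in C: results wrap
 * around modulo 2^64. */
uint64_t acc_add(uint64_t acc, uint64_t arg) { return acc + arg; }
uint64_t acc_sub(uint64_t acc, uint64_t arg) { return acc - arg; }
uint64_t acc_mul(uint64_t acc, uint64_t arg) { return acc * arg; }
```

For example, acc_sub(5, 6) yields UINT64_MAX (the representation of -1), and acc_add(UINT64_MAX, 2) then yields 1, just as 5 - 6 + 2 would with signed numbers.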

How to submit

The homework spec has a link that takes you to GitHub.com. Once you click on that link, the website will create a repository of starter code for you, and give you permission to access it. Clone that repository to your computer, and work on the assignment.

To submit your work, ‘git add’ all the files that are part of your assignment, and run ‘git commit’ to commit them to your repository. It is a good idea to commit code as you go, so that you can restore your code to a previous point if necessary.

Finally, run ‘git push’ to “push” your code to the repository. This is very important, as “commit” doesn’t actually transfer the data to the GitHub servers. To verify that you’ve successfully pushed your code to the repository, you can visit your repo on github.com using your web browser, and you should see your code/solution there. If you have any questions, please let the TAs or me know!

Guidance

Earlier in this FAQ, I have some code for reading data from a file into a buffer. For HW1, you can assume that the example files are not more than 8KB in size. So if you ‘up’ the buffer size to 8192, then the code I’ve provided will read the file into a buffer.

We can then think of that buffer as a big string containing the data from the input file.

Our first task is to ‘frame’ the data into separate operations (e.g., “ADD 32”, “MUL 1”). These operations are separated by the two-character CR-LF delimiter. You can find this using a for() loop, looking for those particular characters. If you’re using C, you can use the strstr() function (which expects a NUL-terminated string, so add a terminating '\0' after the data you read). If you’re using C++ you can use the string::find() method.
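Here is a minimal sketch of the strstr() approach. The helper name next_frame is hypothetical. It assumes the buffer is NUL-terminated, which fread() does not do for you: one option is to make the buffer one byte larger than the amount you read and store '\0' after the last byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 'start' points into a NUL-terminated buffer.
 * If a CR-LF is found, stores the length of the operation in front of it
 * (not counting the CR-LF) in *frame_len and returns a pointer just past
 * the delimiter, i.e. the start of the next operation.
 * Returns NULL when no CR-LF remains. */
const char *next_frame(const char *start, size_t *frame_len)
{
    const char *delim = strstr(start, "\r\n");

    if (delim == NULL) {
        return NULL;
    }
    *frame_len = (size_t)(delim - start);
    return delim + 2;   /* skip the two delimiter bytes */
}
```

A caller would loop, passing the returned pointer back in as the new start, until the function returns NULL. Any bytes after the last CR-LF are an incomplete operation.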

Now, for each of these operations, you need to parse it, which means (1) figuring out what the command is, and (2) figuring out what the argument is. To find the command, you need to match the first three characters to either ADD, SUB, SET, or MUL. In C, you can do this with the strncmp() function, and in C++ you can use the string::substr() method.
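As a rough sketch of the strncmp() approach, the function below maps the first three characters of an operation to an enum value. The Op type and the parse_op name are hypothetical, chosen only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type naming the four commands, plus a catch-all. */
typedef enum { OP_ADD, OP_SUB, OP_SET, OP_MUL, OP_UNKNOWN } Op;

/* Hypothetical helper: 'frame' points at one operation of 'frame_len'
 * bytes (without the CR-LF). Only the first three bytes are compared. */
Op parse_op(const char *frame, size_t frame_len)
{
    if (frame_len < 3) {
        return OP_UNKNOWN;      /* too short to hold a command */
    }
    if (strncmp(frame, "ADD", 3) == 0) return OP_ADD;
    if (strncmp(frame, "SUB", 3) == 0) return OP_SUB;
    if (strncmp(frame, "SET", 3) == 0) return OP_SET;
    if (strncmp(frame, "MUL", 3) == 0) return OP_MUL;
    return OP_UNKNOWN;
}
```

Passing the length explicitly means the function never reads past the end of a short operation, even if the buffer is not NUL-terminated at that point.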

Once you know the command, you need to convert the string representation of the argument (e.g., “42”) into an integer (i.e., 42). Section 5.1 of the book has code to do this, or you can rely on the strtol() function in C, or the std::strtoll() function in C++.
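Here is a minimal sketch of the conversion step. It uses strtoull(), the unsigned sibling of strtol(), on the assumption that your accumulator is a uint64_t; the helper name parse_arg is hypothetical. Conversion stops at the first character that is not a digit, so it is fine if the text is followed by CR-LF.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: converts decimal text such as "42" into a uint64_t.
 * Returns false if the text does not start with a number or if the value
 * is too large to represent. Leading whitespace is skipped by strtoull(). */
bool parse_arg(const char *text, uint64_t *out)
{
    char *end = NULL;

    errno = 0;
    unsigned long long value = strtoull(text, &end, 10);
    if (end == text || errno == ERANGE) {
        return false;           /* no digits consumed, or out of range */
    }
    *out = (uint64_t)value;
    return true;
}
```

For an operation like “ADD 32”, you would call it on the text that follows the three-letter command, e.g. parse_arg(frame + 3, &arg).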