CSE 222A Term Project
The centerpiece of CSE222A is the term project, which culminates in
a workshop-quality paper and a public presentation. You are
encouraged to think carefully about the project topic and scope, as I
expect that a number of the projects, with some amount of extra
polishing and follow-up work, will be submittable to a high-quality
venue. Indeed, students with successful projects from the two most recent
CSE222A offerings earned all-expense-paid trips to present their work
at international conferences in places like Austin, Texas; Hong Kong; and
Bern, Switzerland.
Of course, time is limited. The key to a successful project is to
carefully lay out a plan of action so that some significant
portion---but frequently far from all---of it can be completed in time
to write up and present at the end of the term. Projects will be
graded in the same manner that conference papers are evaluated: we're
looking for interesting insights, clarity of presentation, and
appropriate positioning of your work within the framework of existing
research. From the point of view of the class, the goal of the
project is to give you first-hand experience conducting research in
networking and exploring a topic that interests you in more detail
than we will in class.
Each project will be presented orally during the official final
exam period for the course (and possibly in additional sessions as
logistics require). Groups will also submit an 8-10 page research
report describing their efforts. All class members are required to
attend and evaluate their classmates' presentations and papers.
Projects should be done in groups of two or three. Exceptions may occasionally be granted; please talk to the instructor.
Timeline
To assist in the timely completion of the project, we have established the following checkpoints.
1/21: Each student is required to submit two or three brief (a
paragraph or two, maximum) project ideas by email to the TA. We will
then post the submitted ideas to the class through Piazza. Students
who have already formed (partial) project groups should submit a
single email naming each of the group members.
1/24: After consulting the project idea list and communicating
privately with other students, students are encouraged to form their
own project group. Each group should send email to the TA with the
members of the group and a brief description of the project idea(s).
2/04: Each group must submit a page-long project proposal. The proposal should contain four sections:
- An introduction, providing a basic overview of the goal.
- Related work, which discusses the current state of the art that
you intend to build upon.
- A schedule, specifying concretely what you intend to have
accomplished by each of the two milestones below as well as for the
final report/presentation. It is perfectly acceptable if the final
deliverable is not a completion of the project---which, if successful,
some members of the group may wish to continue after the term
ends---but it does need to be something that can be clearly
demonstrated/evaluated/graded.
- A description of any resources you believe you will need to
complete the assignment, such as access to testbeds, software,
hardware resources, etc. We expect most of you will be able to
complete your projects on resources already available to you, but if
you have some particular needs please call them out and we'll see what
we can do. Note that if the successful completion of your project
depends on these (as opposed to "it would be nice if") please make
sure you discuss this with Danny and/or the instructor before
submitting.
2/20: Each group will submit a 1-2 page summary of their progress. We will schedule meetings with each project group to discuss the status updates.
3/06: Submit a 1-2 page summary of your progress since the last
checkpoint. In particular, concisely describe the deliverables you
have completed, and provide a brief preview of what you expect to
present during the term project presentations.
3/20: All groups will give a 20-minute presentation, with an
additional 3-4 minutes afterwards for questions. All group members
are expected to participate in the presentation and answer questions.
Students are also expected to attend five presentations besides
their own and actively participate by asking questions
of the presenters. This will count toward your class participation
grade.
A URL for the presentation sign-up sheet (in Google Docs) is available
on Piazza.
3/21: Final project reports are due by midnight. All groups
are expected to submit an 8-10 page project report in the format of
the papers we've read in class (i.e., double column, single spaced).
You are free to submit in another format, but it is likely the report
will be much longer in, e.g., single column double spaced. The report
should include references and citations to related work, as well as
graphs, figures, etc., documenting the performance of your software
prototype to the extent possible.
Please email your final report, along with either a tarball of or
pointers to (e.g., in github or similar) your code, to Danny and
me.
Project Suggestions
Below is a list of potential project ideas. You are, however,
encouraged to come up with your own: the most successful projects will
be those that pursue topics of interest to the group members. You are
welcome to propose projects that are related to your research
area---even if that is not networking. Please feel free to ask us if
you're not sure whether something you're considering is appropriate.
Also, keep in mind the ideas below are just starting points: they each
need to be refined and focused. And project topics are not
exclusive--it is perfectly OK for multiple groups to work on similar
projects.
Datacenter demand characterization
We will read several papers that explore architectures for datacenter
network interconnects. Several suggest special purpose handling of
large flows (e.g., provisioning circuits) or dividing traffic across
multiple network fabrics (i.e., a hybrid packet/circuit network). The
effectiveness of these approaches depends greatly on knowing the upcoming
traffic demands. Unfortunately, there is limited information available
regarding the actual traffic demands in real datacenters.
There are at least three interesting project directions here. One could
measure the demands of interesting applications (e.g., Hadoop) and
propose ways to optimize their performance (there is a large literature
here you should explore first). Second, one could instrument end hosts
to determine their upcoming demand in the short term (e.g., by measuring
socket buffers, NIC queues, and the like). Third, one could build
models (perhaps application specific) of traffic demand that could allow
prediction further into the future based upon recent history of a node's
traffic demand.
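As one illustration of the third direction, a history-based predictor can be as simple as an exponentially weighted moving average over per-interval byte counts. The sketch below is only a starting point under stated assumptions: the function name `ewma_forecast`, the smoothing factor `alpha`, and the idea of fixed-length measurement intervals are all hypothetical choices, not something prescribed by the papers we will read.

```python
def ewma_forecast(samples, alpha=0.5):
    """Forecast the next interval's demand from past per-interval byte counts.

    samples: byte counts observed in consecutive, equal-length intervals
             (oldest first). How these are collected is left open here.
    alpha:   weight given to the newest sample; 1.0 means "repeat the last
             interval", values near 0 mean "change very slowly".
    """
    if not samples:
        raise ValueError("need at least one sample")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    estimate = float(samples[0])
    for sample in samples[1:]:
        # Blend the new observation with the running estimate.
        estimate = alpha * sample + (1.0 - alpha) * estimate
    return estimate


if __name__ == "__main__":
    history = [100, 200, 400, 400]  # made-up byte counts for illustration
    print(ewma_forecast(history, alpha=0.5))
```

A project would need to compare a baseline like this against application-specific models and report how the prediction error grows with the forecast horizon.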
Evaluating OpenFlow controllers
Many recent research projects have implemented prototypes in OpenFlow,
in the form of increasingly sophisticated OpenFlow controllers. Due
to the centralized nature of OpenFlow, it is not immediately obvious
how such systems will scale as the size and complexity of the networks
being controlled increase. In order to evaluate the performance of
OpenFlow-based networks, researchers at Stanford developed MiniNet.
One of the shortcomings of MiniNet,
however, is its inability to run on more than one machine, which
limits the size of the network one can emulate. To address this
shortcoming, my research group is developing an emulation environment
that can emulate larger networks and, eventually, run across multiple
machines.
This project would demonstrate the utility of a multi-machine
emulation environment by benchmarking application configurations that
cannot be handled on MiniNet. In particular, a successful project
would show that a MiniNet-based network is either unable to run an
experiment at scale or, more likely, produces erroneous results when
compared to the same experiment run across a real network.
Conversely, the project would investigate whether the UCSD emulator
is able to provide more accurate results.
Study the performance of 802.11ac devices
The 802.11 WiFi standard was recently extended to include ac, which
dramatically increases the link rate. However, little has yet been
published about the performance of ac devices or their energy
efficiency. One potential project would be to measure the behavior of
available 802.11ac equipment (not currently supported by the CSE APs to
my knowledge) and compare it to the behavior of 802.11n. Are there
interesting differences? Optimizations that can be made?
A TCP bakeoff
There are a zillion (yes, that is a precise number) different TCP
variants currently supported by modern operating systems. In fact, most
of today's OSes use different variants. It would be interesting to
study how well each performs in various circumstances. A project might
compare various flavors of TCP under different simulated network
environments, in terms of performance and perhaps energy usage (of
particular interest for mobile devices). Is TCP fundamentally energy efficient (e.g., in datacenter networks vs. on mobile networks)?
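For groups running the comparison on Linux, the congestion-control variant can be chosen per socket with the `TCP_CONGESTION` socket option, which Python exposes as `socket.TCP_CONGESTION` on platforms that support it. The following is a minimal sketch, assuming a Linux host; the helper names are hypothetical, and which algorithms can actually be selected depends on the kernel modules loaded and on what the system allows unprivileged processes to use.

```python
import socket


def get_congestion_control(sock):
    """Return the socket's congestion-control algorithm name, or None if
    the platform does not expose TCP_CONGESTION."""
    opt = getattr(socket, "TCP_CONGESTION", None)
    if opt is None:
        return None
    try:
        raw = sock.getsockopt(socket.IPPROTO_TCP, opt, 16)
    except OSError:
        return None
    # The kernel returns a NUL-padded name such as b"cubic\x00...".
    return raw.split(b"\x00", 1)[0].decode("ascii")


def try_set_congestion_control(sock, name):
    """Try to select an algorithm (e.g. "reno"); return True on success."""
    opt = getattr(socket, "TCP_CONGESTION", None)
    if opt is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, name.encode("ascii"))
    except OSError:
        # Algorithm not loaded, not permitted, or option unsupported.
        return False
    return True


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("default:", get_congestion_control(s))
    print("switched to reno:", try_set_congestion_control(s, "reno"))
    s.close()
```

Setting the variant per socket lets a single test harness run otherwise identical transfers back to back, which keeps the comparison fair.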
Internet censorship
The Arab Spring has recently brought Internet censorship back to light.
Other famous examples include the Great Firewall of China. It might be
interesting to study the prevalence of various forms of censorship. A
project might measure Internet censorship between countries and within countries
(e.g., different parts of China censor different parts of the web) using
PlanetLab.
DNS hijacking
A very popular form of redirection (e.g., by captive portals that want
you to explicitly acknowledge Terms of Service) is DNS hijacking. A
project might study how often this occurs, where (e.g., home networks vs. Starbucks
networks vs. PlanetLab nodes), and when. Are there security or
performance concerns that arise because of the observed behavior?
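One simple way to probe for this behavior is to look up names that should not exist and see whether the resolver returns an address anyway. The sketch below is only illustrative: the function name `probe_nonexistent_names`, the use of random labels under `com`, and the number of trials are assumptions, and a random name is merely presumed to be unregistered. The resolver is passed in as a parameter so the logic can be exercised without network access.

```python
import socket
import uuid


def probe_nonexistent_names(resolver=socket.gethostbyname, trials=3,
                            suffix="com"):
    """Resolve random names that are presumed not to exist.

    Returns a list of (name, address) pairs for every lookup that
    unexpectedly succeeded. An empty list means no rewriting was seen;
    a non-empty list suggests the resolver (or something on the path)
    is substituting its own answers.
    """
    hits = []
    for _ in range(trials):
        name = f"{uuid.uuid4().hex}.{suffix}"
        try:
            address = resolver(name)
        except OSError:
            # socket.gaierror is a subclass of OSError: the lookup failed,
            # which is the expected outcome for a nonexistent name.
            continue
        hits.append((name, address))
    return hits


if __name__ == "__main__":
    for name, address in probe_nonexistent_names():
        print(f"{name} unexpectedly resolved to {address}")
```

Running the same probe from different vantage points and at different times would give the "where" and "when" dimensions; comparing the returned addresses across probes may also hint at who is doing the redirection.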
Measuring the Bitcoin P2P network
The Bitcoin virtual currency has been in the news a lot recently. One
of its underpinnings is a peer-to-peer network that broadcasts all
transactions worldwide. A project might study the communications on
this network. Are transactions from the same
cluster broadcast from the same nodes? Where are these nodes? What other
transactions do these nodes broadcast first? Can we put them in the same
cluster? Can we de-anonymize transactions based on the same-node
property?
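As a starting point for the "same node" questions, suppose a measurement client has logged which peer announced each transaction and when. Grouping transactions by the peer that announced them first is then a few lines of code. This is a sketch under assumptions: the `(txid, peer, timestamp)` record format and the function names are hypothetical, and the first peer to announce a transaction to one observer is not necessarily its true origin.

```python
from collections import defaultdict


def first_relayers(observations):
    """Map each transaction id to the peer that announced it earliest.

    observations: iterable of (txid, peer, timestamp) tuples, in any order.
    """
    earliest = {}
    for txid, peer, timestamp in observations:
        if txid not in earliest or timestamp < earliest[txid][1]:
            earliest[txid] = (peer, timestamp)
    return {txid: peer for txid, (peer, _) in earliest.items()}


def group_by_first_relayer(observations):
    """Group transaction ids by the peer that was first seen relaying them."""
    groups = defaultdict(set)
    for txid, peer in first_relayers(observations).items():
        groups[peer].add(txid)
    return dict(groups)


if __name__ == "__main__":
    log = [  # made-up observations for illustration
        ("tx1", "peerA", 1.0),
        ("tx1", "peerB", 0.5),
        ("tx2", "peerB", 2.0),
        ("tx3", "peerA", 3.0),
    ]
    print(group_by_first_relayer(log))
```

The resulting groups could then be compared against address clusters obtained by other means to see how often the two agree.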
Use Packetdrill to evaluate interesting applications/features
Researchers at Google have recently developed Packetdrill, a tool
for scripting precise tests for network stacks. They used it to find
and fix several bugs in the Linux networking stack. You could use
packetdrill to test the network stack of other OSes (which would require
porting it) or other network devices (think NATs, firewalls, and various
middleboxes). Alternatively, you could use it to design benchmarks for
applications or services (perhaps see the suggestion below regarding
reproducing prior research results).
Replicating prior results
Professor Nick McKeown at Stanford has had students in his class use
the MiniNet network emulator to replicate
published research results. You are welcome to do the same, by
selecting an interesting research paper (either one we've read, one
suggested by the Stanford
course, or any other one that caught your eye) and attempting to
repeat its results. Because you have all term to complete this
project (as opposed to three weeks in the Stanford case) we expect that
you would replicate several experiments from your chosen paper--or
even produce interesting results that were not included in the original paper!
Other suggestions
The previous offering of CSE222A has a long
list of project suggestions (only available from UCSD machines), some of which are no longer timely
(i.e., recent work may have rendered some of the suggestions less
interesting), but most would still make excellent starting points.
Last updated: Tue Mar 04 10:32:54 -0800 2014