Current Work
I am currently developing the world's fastest sorting system, TritonSort.
TritonSort aims to achieve record speeds by focusing on per-disk and per-node
efficiency. We measure ourselves in terms of MB/s/disk rather than raw sorting
throughput. The goal of TritonSort is to sort data at the speed of the disks,
by keeping all disks constantly reading or writing. TritonSort competed in the
2010 Sortbenchmark.org competition, entering GraySort and MinuteSort in the
Indy category.
We are expanding TritonSort to handle general-purpose
computation. We have an implementation of MapReduce that shares a
significant portion of TritonSort's code base. Although we do not make the same
fault-tolerance guarantees that traditional MapReduce systems provide, we
believe that our implementation demonstrates that
TritonSort's efficiency improvements go beyond sorting and are applicable to real
workloads.
Past Work
I was a Software Engineering Intern at Google during the summer of 2010,
working in the search infrastructure group with my mentor,
Alexander Yip.
During the summer of 2009, I investigated balanced systems
within Hadoop's MapReduce framework.
The goals were to analyze cluster resource usage during a MapReduce job,
identify bottlenecks, and classify jobs according to those bottlenecks,
in the hope of using cluster resources more
efficiently.