Phone: +1 (858) 633-7261
Office: EBU3B 3144
I am an n-th year Ph.D. candidate in the Systems and Networking Group at UCSD. My advisor is Prof. Yuanyuan Zhou. I received my B.S. in computer science and B.A. in economics from Peking University (2006-2010). My research interests lie broadly in systems, with a focus on system performance and reliability. While I spend much of my time fighting bugs, that is only part of the picture: taming the wild, hardening the fragile, fixing the broken, and ultimately making systems run as reliably and efficiently as possible are what really fascinate me. When I don't hack around, I like hiking around.
Nowadays there's a (mobile) app for almost anything. Unfortunately, mobile apps are in general weaker in quality than traditional software because of developer inexperience and limited resources. Many apps, despite having useful features, exhibit immature behaviors, e.g., fast battery drain, aggressive updates, excessive cellular data usage, and intrusive notifications.
In the eDoctor project, we address the Abnormal Battery Drain (ABD) problem. eDoctor leverages the concept of execution phases to capture an app's time-varying behavior, which can then be used to identify abnormal apps. Based on the diagnosis result, eDoctor suggests the most appropriate repair solution to the user and, with the user's permission, automatically applies some of the fixes.
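To give a flavor of the phase idea, here is a minimal Python sketch (the feature names and thresholds are hypothetical; eDoctor's actual phase model and diagnosis algorithm are described in the paper). Each sampling interval is labeled with a coarse phase based on which resource dominates, and a run is flagged when a phase's share of time grows far beyond its historical baseline:

```python
from collections import Counter

def phase_of(sample, thresh=0.5):
    """Assign a coarse phase label based on which resource dominates.
    `sample` is a hypothetical (cpu, net, gps) usage tuple for one interval."""
    cpu, net, gps = sample
    total = (cpu + net + gps) or 1
    for name, v in (("cpu", cpu), ("net", net), ("gps", gps)):
        if v / total >= thresh:
            return name
    return "mixed"

def abnormal_phases(baseline, current, ratio=3.0):
    """Flag phases whose share of intervals grew by more than `ratio`x
    relative to the baseline, or that never appeared in the baseline."""
    base = Counter(phase_of(s) for s in baseline)
    cur = Counter(phase_of(s) for s in current)
    flagged = []
    for phase, n in cur.items():
        b = base.get(phase, 0)
        if b == 0 or n / len(current) > ratio * b / len(baseline):
            flagged.append(phase)
    return flagged

# An app that was mostly CPU-bound but now spends most of its time on GPS
# would be flagged as exhibiting an abnormal (battery-draining) phase.
```

The real system of course works on richer features (wakelocks, screen state, etc.) and a proper statistical model; the sketch only shows why splitting behavior into phases makes the anomaly visible.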
In the DefDroid project, we target a more general issue, the Disruptive App Behavior (DAB) problem, and explore a solution at the OS level. DefDroid is a mobile OS designed to defend against these issues without hurting your user experience. We are actively looking for more users to try DefDroid: if you have an Android device and don't mind flashing a new ROM, please drop me an email.
Misconfigurations remain a major cause of unavailability in large systems, despite the large body of work on detecting, diagnosing, and repairing them. In part, this is because many existing solutions are either post-mortem or too expensive to use in production cloud-scale systems.
The ConfValley project aims to improve an often overlooked step in configuration quality control: configuration validation, which proactively checks configurations against specifications to prevent misconfigurations from entering production. ConfValley consists of a declarative language for practitioners to express configuration specifications, an inference engine that automatically generates specifications, and a checker that determines whether a given configuration obeys its specifications.
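As a rough illustration of what validation buys you (this is not ConfValley's actual language; it is a hypothetical Python sketch where each specification is just a predicate over one configuration entry):

```python
# Hypothetical specifications for a made-up service configuration.
# ConfValley's real declarative language and inference engine are far
# more expressive; this only shows the check-before-deploy idea.
specs = {
    "port":     lambda v: isinstance(v, int) and 1 <= v <= 65535,
    "log_dir":  lambda v: isinstance(v, str) and v.startswith("/"),
    "replicas": lambda v: isinstance(v, int) and v >= 3,
}

def validate(config):
    """Return the keys that are missing or violate their specification.
    An empty result means the configuration may enter production."""
    return [k for k, check in specs.items()
            if k not in config or not check(config[k])]
```

Running such checks at deployment time catches a bad value (say, a replica count of 1) before it ever reaches production, which is exactly the proactive step that post-mortem tools miss.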
In this project, we examine failures in cloud-scale systems along two independent dimensions. First, we consider failures from the point of view of fault-tolerance mechanisms and present a novel taxonomy that categorizes why those mechanisms can be ineffective. Second, we zoom in on the faults that underlie failures and the root causes that give rise to those faults. Applying this methodology to a one-year snapshot of failures in Microsoft Azure, our study reveals many interesting findings.
The paper summarizing our results was accepted to OSDI 2014, but it was unfortunately withdrawn from publication at the request of Microsoft Azure's upper management.
Performance testing is a standard practice in evolving systems for detecting performance issues proactively. However, two main issues limit the efficiency of performance testing: (1) the high testing overhead prevents testing from being conducted on every commit, and (2) testing can produce a large volume of results that takes a long time to analyze manually.
In the PerfScope project, we propose a new white-box approach, performance risk analysis (PRA), that statically evaluates a given source code commit's risk of introducing a performance regression. Performance regression testing can leverage the analysis result to test high-risk commits first while delaying or skipping testing on low-risk commits.
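A toy sketch of the scheduling idea, in Python (the risk heuristic below is invented for illustration; PerfScope's actual PRA is a static analysis with much more precise cost and frequency information):

```python
# Hypothetical heuristic: a commit is riskier when it touches functions
# that are known to be hot, or edits code inside loops.
def risk_score(changed_funcs, hot_funcs, loop_changes):
    score = 0
    for f in changed_funcs:
        score += 10 if f in hot_funcs else 1   # hot code weighs more
    score += 5 * loop_changes                  # loop edits weigh more
    return score

def schedule(commits, budget):
    """Given commits with precomputed risk scores, spend the limited
    testing budget on the highest-risk commits first."""
    ranked = sorted(commits, key=lambda c: -c["risk"])
    return [c["id"] for c in ranked[:budget]]
```

The point is simply that once each commit carries a risk score, the expensive performance tests no longer need to run on every commit, addressing the overhead problem above.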
In the CPAoracle project, we leverage machine learning to build an automatic engine that compares a given set of performance results against a stable baseline and judges whether the measurement data is abnormal. Such automated comparative performance analysis (CPA) significantly speeds up the analysis of performance results while achieving high accuracy.
*: accepted but withdrawn from publication at the request of Microsoft Azure
PhD intern, 06/2015 - 09/2015
Configuration management team
Research intern, 06/2014 - 09/2014
Mentor: Bill Bolosky
Research intern, 06/2013 - 09/2013
Mentor: Bill Bolosky
Part-time research intern, 11/2011 - 12/2012
Performance testing team
Software engineer intern, 03/2010 - 06/2010
Web features team