
Ryan (Peng) Huang


Email: csmail
Phone: +1 (858) 633-7261
Office: EBU3B 3144

I am a final-year Ph.D. candidate [1] in the Systems and Networking Group at UCSD. My advisor is Prof. Yuanyuan Zhou. I received my B.S. in computer science and B.A. in economics from Peking University, where I studied from 2006 to 2010. I enjoy building systems. My research focuses on improving the quality of systems on new computing platforms. While fighting bugs is what I frequently do, that's only part of the picture: taming the wild, hardening the fragile, fixing the broken, and ultimately making systems operate reliably and efficiently are what I strive for. When I don't hack around, I like hiking around.



June 2015
Intern in configuration management team at Facebook
Jan 2015
Our paper on configuration validation in cloud systems is accepted by EuroSys'15
Sept 2014
TA for Fall '14 CSE 120
July 2014
Our paper on failures and fault tolerance in cloud services is accepted by OSDI'14
but it is unfortunately withdrawn per request by Microsoft Azure upper management
June 2014
Intern in MSR Redmond and collaborate with Microsoft Azure
Jan 2014
Our paper on improving performance testing efficiency is accepted by ICSE'14
Jan 2014
TA for Winter '14 CSE 221
June 2013
Intern in MSR Redmond and collaborate with Windows Azure
June 2013
Our paper on configuration constraints inference and testing is accepted by SOSP'13
April 2013
Presenting a poster about our eDoctor paper at NSDI'13
March 2013
TA for Spring '13 CSE 120

Recent Projects

Detecting and defending against immature apps


Nowadays there's a (mobile) app for almost anything. Unfortunately, mobile apps are generally of lower quality than traditional software because of developer inexperience and limited resources. Many apps, despite having useful features, exhibit immature behaviors, e.g., fast battery drain, aggressive updates, and excessive cellular data usage and notifications.


In the DefDroid project, we target a more generic issue, the Disruptive App Behavior (DAB) problem, and explore a solution at the OS level. DefDroid is a mobile OS designed to be more defensive, handling these issues without disrupting the user experience. We are actively looking for more users to try DefDroid. If you have an Android device and don't mind flashing a new ROM, please drop me an email.


In the eDoctor project, we address the Abnormal Battery Drain (ABD) problem. eDoctor uses the concept of execution phases to capture an app’s time-varying behavior, which can then be used to identify abnormal apps. Based on the diagnosis result, eDoctor suggests the most appropriate repair solution to users and, with user permission, automatically fixes some of the problems.

Validating configurations in cloud-scale services


Misconfigurations remain a major cause of unavailability in large systems despite the large amount of work put into detecting, diagnosing and repairing them. In part, this is because many of the solutions are either post-mortem or too expensive to use in production cloud-scale systems.

The ConfValley project aims to improve an often overlooked process in configuration quality control: configuration validation, which proactively checks configurations against their specifications to prevent misconfigurations from entering production. ConfValley consists of a declarative language for practitioners to express configuration specifications, an inference engine that automatically generates specifications, and a checker that determines whether a given configuration obeys its specifications.
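
To make the validation idea concrete, here is a minimal Python sketch of a checker that tests a configuration against declarative specifications. It is an illustration only, not ConfValley's language or implementation: the `Spec` class, the `check` function, and the configuration keys and bounds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Spec:
    """A hypothetical specification: a named predicate over one config key."""
    key: str
    description: str
    predicate: Callable[[Any], bool]


def check(config: Dict[str, Any], specs: List[Spec]) -> List[str]:
    """Return a human-readable violation for every spec the config breaks."""
    violations = []
    for spec in specs:
        if spec.key not in config:
            violations.append(f"{spec.key}: missing ({spec.description})")
        elif not spec.predicate(config[spec.key]):
            violations.append(
                f"{spec.key}: {config[spec.key]!r} violates '{spec.description}'"
            )
    return violations


# Illustrative specs; the keys and bounds are made up for this example.
SPECS = [
    Spec("replica_count", "integer in [1, 10]",
         lambda v: isinstance(v, int) and 1 <= v <= 10),
    Spec("timeout_ms", "positive integer",
         lambda v: isinstance(v, int) and v > 0),
]

if __name__ == "__main__":
    candidate = {"replica_count": 0, "timeout_ms": 500}
    for problem in check(candidate, SPECS):
        print(problem)
```

Running the check before deployment, rather than diagnosing after a failure, is the point of the sketch: an empty result means the configuration may proceed, and any violation blocks it.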


Understanding faults and fault tolerance in cloud-scale services


Operating a cloud service is a tough job. The sheer scale and complexity of the cloud dictate that faults are inevitable, a fact of life that has to be dealt with. Failing to handle faults properly can result in severe service failures and millions of dollars in losses.

In this project, we look into failures in cloud-scale systems from the perspective of fault tolerance and present a novel framework for understanding failures and their impact in cloud services. We apply this methodology to a one-year snapshot of failures in Microsoft Azure. Our study reveals many interesting findings about failure patterns that are unique to the cloud environment.

The paper that summarizes our research results was accepted by OSDI 2014 with high review scores. Unfortunately, it was withdrawn from publication per request by the upper management of Microsoft Azure.

Improving performance testing efficiency


Performance testing is a standard practice for evolving systems to detect performance issues proactively. However, two main issues limit its efficiency: 1) the high testing overhead prevents testing from being conducted on every commit; 2) the testing can produce a large volume of results that take a long time to analyze manually.


In the PerfScope project, we propose a new white-box approach, performance risk analysis (PRA), that statically evaluates a given source code commit's risk of introducing a performance regression. Performance regression testing can leverage the analysis result to test high-risk commits first while delaying or skipping testing on low-risk commits.
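
The prioritization step can be sketched as follows. This is a hypothetical illustration: the risk scores are taken as given (PerfScope derives them by static analysis, which is not modeled here), and the `Commit` type, the `plan_testing` function, and the 0.3 threshold are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Commit:
    sha: str
    risk: float  # assumed risk score in [0, 1]; how it is computed is out of scope


def plan_testing(commits: List[Commit],
                 skip_below: float = 0.3) -> Tuple[List[Commit], List[Commit]]:
    """Split commits into (test_now, deferred).

    test_now is ordered from highest to lowest risk; commits whose risk is
    below the (made-up) threshold are deferred rather than tested right away.
    """
    test_now = sorted((c for c in commits if c.risk >= skip_below),
                      key=lambda c: c.risk, reverse=True)
    deferred = [c for c in commits if c.risk < skip_below]
    return test_now, deferred


if __name__ == "__main__":
    history = [Commit("a1", 0.10), Commit("b2", 0.85), Commit("c3", 0.40)]
    test_now, deferred = plan_testing(history)
    print([c.sha for c in test_now])   # ['b2', 'c3']
    print([c.sha for c in deferred])   # ['a1']
```

Deferred commits are not dropped; a real pipeline would presumably batch them into a later, less frequent test run.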


In the CPAoracle project, we leverage machine learning to build an automatic engine that compares a given set of performance results with a stable baseline and judges whether the measurement data is abnormal. Such automated comparative performance analysis (CPA) can significantly speed up the performance result analysis process while achieving high accuracy.
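
As a rough illustration of comparing a measurement against a stable baseline, the sketch below flags a result as abnormal when it lies several standard deviations from the baseline mean. This is a simple statistical stand-in, not the machine-learning model CPAoracle uses; the function name `is_abnormal` and the default threshold of 3 are assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def is_abnormal(baseline: Sequence[float], measurement: float,
                threshold: float = 3.0) -> bool:
    """Flag a measurement that deviates from the baseline mean by more than
    `threshold` sample standard deviations (a plain z-score test)."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is treated as abnormal.
        return measurement != mu
    return abs(measurement - mu) / sigma > threshold


if __name__ == "__main__":
    baseline_latency_ms = [100.0, 102.0, 98.0, 101.0, 99.0]
    print(is_abnormal(baseline_latency_ms, 100.5))  # within normal variation
    print(is_abnormal(baseline_latency_ms, 130.0))  # far outside the baseline
```

A fixed threshold like this is easy to reason about but brittle across workloads, which is one motivation for learning the decision from labeled results instead.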


EuroSys 2015
ConfValley: A Systematic Configuration Validation Framework for Cloud Services
Peng Huang, Bill Bolosky, Abhishek Singh, and Yuanyuan Zhou
PDF   BibTeX   Slides   Poster
Tech Report
Experience in Building a Comparative Performance Analysis Engine for a Commercial System
Peng Huang, Craig Schechter, Vincent Chen, Steven Hill, Dongcai Shen, Yuanyuan Zhou, and Lawrence K. Saul
UC San Diego Technical Report CS2015-1014, September 2015
PDF   BibTeX
OSDI 2014*
Why Does a Cloud-Scale Service Fail Despite Fault-Tolerance?
Peng Huang, Xinxin Jin, Bill Bolosky, and Yuanyuan Zhou

*: accepted with high review scores but withdrawn from publication per request by Microsoft Azure

ICSE 2014
Performance Regression Testing Target Prioritization via Performance Risk Analysis
Peng Huang, Xiao Ma, Dongcai Shen, and Yuanyuan Zhou
PDF   BibTeX   Slides   Software
SOSP 2013
Do Not Blame Users for Misconfigurations
Tianyin Xu, Jiaqi Zhang, Peng Huang, Jing Zheng, Tianwei Sheng, Ding Yuan, Yuanyuan Zhou, and Shankar Pasupathy
PDF   BibTeX
NSDI 2013
eDoctor: Automatically Diagnosing Abnormal Battery Drain Issues on Smartphones
Xiao Ma, Peng Huang, Xinxin Jin, Pei Wang, Soyeon Park, Dongcai Shen, Yuanyuan Zhou, Lawrence K. Saul, and Geoffrey M. Voelker
PDF   BibTeX   Poster
OSDI 2012
Be Conservative: Enhancing Failure Diagnosis with Proactive Logging
Ding Yuan, Soyeon Park, Peng Huang, Yang Liu, Michael M. Lee, Xiaoming Tang, Yuanyuan Zhou, and Stefan Savage
PDF   BibTeX   Dataset
IMC 2010
Understanding Latent Interactions in Online Social Networks
Jing Jiang, Christo Wilson, Xiao Wang, Peng Huang, Wenpeng Sha, Yafei Dai, and Ben Y. Zhao
PDF   BibTeX
A multiple user sharing behaviors based approach for fake file detection in P2P environments
Jing Jiang, Yongjun Li, Qinyuan Feng, Peng Huang, and Yafei Dai
Science China Information Sciences, November 2010, Vol. 53, Issue 11, pp 2169-2184
PDF   BibTeX


Fall 2014, UCSD
Teaching Assistant for CSE 120 - Principles of Operating Systems
Winter 2014, UCSD
Teaching Assistant for CSE 221 - Advanced Operating Systems
Spring 2013, UCSD
Teaching Assistant for CSE 120 - Principles of Operating Systems
Fall 2009, PKU
Teaching Assistant for Introduction to Computation

Work Experience


[1] In Python:
from datetime import date
n = 1 + [date.today() > date(2011 + i, 9, day) for i, day in enumerate([19, 24, 23, 29, 21, 19])].index(False)
# Based on UCSD academic calendars; assumes I will graduate before 2016/09/19 (after that date, index(False) raises ValueError).