Office: (310) 206-6069
Department of Computer Science and Engineering
Fax: (310) 825-2273
University of California, San Diego
Email: h3kang AT cs DOT ucsd DOT edu
Recent Update: Currently at Ann Arbor
I have recently graduated from UCSD and am now a Research Assistant Professor in the Department of Biostatistics at the University of Michigan, Ann Arbor. This page is no longer maintained; my recent information can be found HERE
Welcome to Hyun Min Kang's Home!
Hello, I am a graduate student in the Department of Computer Science and Engineering at the University of California, San Diego. Currently, I am working at UCLA with my research advisor Eleazar Eskin, who has been jointly affiliated with the Computer Science and Human Genetics departments at UCLA since 2007. Previously, I was a research fellow at the Genome Research Center for Diabetes and Endocrine Disease, Seoul National University Hospital. Prior to that, I completed my Bachelor's and Master's degrees at the School of Electrical Engineering, Seoul National University, specializing in databases under the guidance of Sang K. Cha.
Currently, my primary research topic is "Effective design and analysis of systems genetics studies", which is also the title of my dissertation. One of the major challenges in systems genetics research is that many unmodeled (and typically undocumented) factors in these data sets confound the discovery of true biological signals, resulting in excessive false positives that do not replicate between independent studies. It is statistically important both to detect and to correct for such unmodeled confounding factors in order to infer causal relationships among the elements of the biological system more accurately. I have observed that the vast majority of currently available high-throughput biological data sets for systems genetics research are susceptible to such confounding effects. For example, population structure and expression heterogeneity are widely known confounding factors in association mapping and expression data analysis, respectively, yet no comprehensive solution for them has been proposed. I have contributed to this area by proposing novel methods that robustly resolve such confounding factors through "confounding-aware" statistical analysis in various contexts of systems genetics studies, and I still enjoy working on them, hoping to make a breakthrough in systems genetics.
I am also interested in the many computational and statistical challenges posed by sequence-based high-throughput biological data. In particular, I am interested in a comprehensive understanding of the structure of genetic variation in humans and in many model organisms. I have analyzed the haplotype structure among inbred mouse strains from both the NIEHS/Perlegen resequencing project and the Mouse HapMap project. Through these projects, I was able to unveil many interesting patterns of genomic variation among laboratory mouse strains. I have also been working on methods for imputing unobserved genotypes and haplotypes by leveraging the structure of genetic variation. I believe that a comprehensive understanding of genetic variation will facilitate dissecting the genetic basis of complex traits, which is the ultimate goal of my research.
- HYUN MIN KANG, Noah Zaitlen, Buhm Han and Eleazar Eskin, “An adaptive and memory efficient algorithm for genotype imputation”, In Proceedings of the Thirteenth Annual Conference on Research in Computational Biology (RECOMB-2009) Tucson, Arizona: May 18th-21st, 2009 (To appear)
- Eun Yong Kang, HYUN MIN KANG, Chun Ye, Ilya Shpitser, and Eleazar Eskin, “Detecting the presence and absence of causal relationships between expression of yeast genes with very few samples”, In Proceedings of the Thirteenth Annual Conference on Research in Computational Biology (RECOMB-2009) Tucson, Arizona: May 18th-21st, 2009 (To appear)
- Noah A. Zaitlen, HYUN MIN KANG, and Eleazar Eskin, “Linkage effects and analysis of finite sample errors in the HapMap”, Human Heredity (in press)
- HYUN MIN KANG, Chun Ye, and Eleazar Eskin, “Accurate discovery of expression quantitative trait loci under confounding from spurious and genuine regulatory hotspots”, Genetics 180:1909-25, 2008, [Pubmed]
- Buhm Han, HYUN MIN KANG, Myeong Seong Seo, Noah A. Zaitlen, and Eleazar Eskin, “Efficient association study design via power-optimized tag SNP selection”, Annals of Human Genetics 72:834-47, 2008, [Pubmed]
- Anatole Ghazalpour, Sudheer Doss, HYUN MIN KANG, Charles Farber, Ping-Zi Wen, Alec Brozell, Ruth Castellanos, Eleazar Eskin, Desmond J. Smith, Thomas A. Drake, and Aldons J. Lusis, “High-resolution mapping of gene expression using association in an outbred mouse stock”, PLoS Genetics 4:e1000149, 2008, [Pubmed]
- HYUN MIN KANG, Noah A. Zaitlen, Claire M. Wade, Andrew Kirby, David Heckerman, Mark J. Daly, and Eleazar Eskin, “Efficient control of population structure in model organism association mapping”, Genetics, 178:1709-23, 2008, [Pubmed]
- Kelly A. Frazer, Eleazar Eskin, HYUN MIN KANG, Molly A. Bogue, David A. Hinds, Erica J. Beilharz, Robert V. Gupta, Julie Montgomery, Matt M. Morenzoni, Geoffrey B. Nilsen, Charit L. Pethiyagoda, Laura L. Stuve, Frank M. Johnson, Mark J. Daly, Claire M. Wade, and David R. Cox, “A sequence-based variation map of 8.27 million SNPs in inbred mouse strains”, Nature 448:1050-3, 2007, [Pubmed]
- Noah A. Zaitlen, HYUN MIN KANG, Eleazar Eskin, and Eran Halperin, “Leveraging the HapMap correlation structure in association studies”, American Journal of Human Genetics, 80:683-91, 2007, [Pubmed]
- Chun Ye, Matthew Zapala, HYUN MIN KANG, Jennifer Wessel, Eleazar Eskin, and Nicholas Schork, “High-density QTL mapping to identify phenotypes and loci influencing gene expression patterns in entire biochemical pathways”, In Proceedings of the Second RECOMB Satellite Workshop of Systems Biology San Diego, California: December 1st-2nd, 2006
- Noah A. Zaitlen, HYUN MIN KANG, Michael L. Feolo, Stephen T. Sherry, Eran Halperin, and Eleazar Eskin, “Inference and analysis of haplotypes from combined genotyping studies deposited in dbSNP”, Genome Research, 15:1594-600, 2005, [Pubmed]
Resources and Software
- EMMA (Efficient Mixed Model Association)
Correcting for complex population structure and genetic relatedness in association mapping.
- ICE (Inter-sample Correlation Emended) association mapping
Resolving expression heterogeneity in the analysis of high-throughput expression data sets.
- EMINIM (Expectation-Maximized INtegrative IMputation)
An adaptive and memory-efficient imputation of unobserved genotypes
- Mouse HapMap Resource
A high-density haplotype resource of 94 inbred mouse strains
- NIEHS/Perlegen mouse resequencing resource
Genotype and haplotype resource of 8.27 million sequence-based SNPs
- Mouse Phenome Association Database (MPAD)
Association mapping results between Mouse HapMap SNPs and the Mouse Phenome Database (MPD)