Charles Elkan
I am a professor in the computer science and engineering
department at the University of California, San Diego. My
doctorate is in computer science from Cornell University, with
a graduate minor in economics. As a graduate student I also
spent time at Stanford University, and before joining UCSD I
was a postdoc at the University of Toronto. My undergraduate
degree is in mathematics from Cambridge University, with a
focus on statistics and optimization. In 1998/99 I was a
visiting associate professor at Harvard University. For more
information, see this curriculum vitae.
My main research interests are in machine learning, data
mining, and analytics. I am interested especially in
applications to business and to biomedicine. Research in my
group is funded by an R01 grant from the National Institutes
of Health, by a grant from the University of California
National Lab cooperation program, and by gifts from multiple
companies.
In the winter quarter of 2012, I am teaching CSE 250B, a
graduate course on machine learning, and organizing the UCSD
AI seminar. In the spring quarter I will teach CSE 255, a
graduate course on analytics and data mining.
For a complete list of publications, with links to full
papers, see DBLP and my Google Scholar profile.
Link Prediction via Matrix Factorization
A. K. Menon and C. Elkan
In Proceedings of the European Conference on Machine Learning
(ECML), September 2011
pdf
We show how to learn to predict
which edges exist in a social network (or other graph)
using a matrix factorization approach. The new method
learns latent features that capture the structure of a
network, and combines these with explicit
side-information. The algorithm directly optimizes a
ranking loss, and scales to very large networks. Results
on many social and other networks show the
superior accuracy of the new method.
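
The idea of learning latent node features by optimizing a ranking loss can be illustrated with a minimal sketch. This is not the algorithm from the paper: it omits side-information and anything needed to scale to large networks. The toy graph, the function names (`train`, `ranking_triples`, `mean_ranking_loss`), and the hyperparameters are all assumptions chosen for illustration. Each node i gets a latent vector, an edge (i, j) is scored by a dot product, and a pairwise logistic loss pushes every neighbor of i to score higher than every non-neighbor.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def score(U, i, j):
    # Predicted affinity between nodes i and j: dot product of latent vectors.
    return sum(a * b for a, b in zip(U[i], U[j]))


def ranking_triples(n, edges):
    # All (i, j, k) where j is a neighbor of i and k is a non-neighbor of i.
    nbrs = {i: set() for i in range(n)}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    return [(i, j, k)
            for i in range(n)
            for j in sorted(nbrs[i])
            for k in range(n)
            if k != i and k not in nbrs[i]]


def mean_ranking_loss(U, triples):
    # Mean pairwise logistic loss: log(1 + exp(-(s_ij - s_ik))).
    total = 0.0
    for i, j, k in triples:
        d = score(U, i, j) - score(U, i, k)
        total += max(-d, 0.0) + math.log1p(math.exp(-abs(d)))
    return total / len(triples)


def train(n, edges, dim=2, lr=0.05, epochs=100, seed=0):
    # Stochastic gradient descent on the pairwise ranking loss.
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n)]
    triples = ranking_triples(n, edges)
    for _ in range(epochs):
        rng.shuffle(triples)
        for i, j, k in triples:
            d = score(U, i, j) - score(U, i, k)
            g = sigmoid(-d)  # equals -dLoss/dd, so we step along +g * dd/du
            ui, uj, uk = U[i][:], U[j][:], U[k][:]
            for f in range(dim):
                U[i][f] += lr * g * (uj[f] - uk[f])
                U[j][f] += lr * g * ui[f]
                U[k][f] -= lr * g * ui[f]
    return U


# Hypothetical toy graph: two disjoint triangles, {0, 1, 2} and {3, 4, 5}.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
U = train(6, edges)
print("within-triangle score:", score(U, 0, 1))
print("across-triangle score:", score(U, 0, 3))
```

On this toy graph the learned features should score pairs inside a triangle above pairs across triangles, which is the ranking behavior the loss asks for.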
Nonlinear Support Vector Machines Can Systematically
Identify Stocks with High and Low Future Returns
R. Huerta, C. Elkan, and F. Corbacho
Published as SSRN paper number 1930709, September 19, 2011
pdf
This paper rigorously develops a
reliable model to identify stocks with high and low
future returns. Technical and fundamental features are
computed using CRSP and Compustat data. From 1981 to
2010, taking into account realistic trading costs and
constraints, the model leads to an annual Jensen alpha of over
10% with a standard deviation of 8%.
Preserving Privacy in Data Mining via Importance Weighting
C. Elkan
In Proceedings of the Workshop on
Privacy and Security Issues in Data Mining and Machine
Learning (PSDML), September 2010
pdf
This paper presents a fundamentally new approach to
protecting privacy. Let D be a confidential database, and
let E be a public database with a similar schema. We
compute a weight w(x) for each record x in E that measures
how representative it is of the records in D. Learning on E
using these weights is then essentially equivalent to
learning directly on D, but D is kept private.
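
A small sketch can make the role of the weights concrete. It assumes, purely for illustration, that records are single categorical values and that w(x) is the ratio of the relative frequency of x in D to its relative frequency in E. The paper may compute the weights differently, and the sketch does not address how to compute them without exposing D. The function names and the toy data are hypothetical.

```python
from collections import Counter


def importance_weights(D, E):
    # w(x) = P_D(x) / P_E(x), estimated from relative frequencies.
    # Only values that occur in E need a weight, so P_E(x) is never zero.
    freq_d, freq_e = Counter(D), Counter(E)
    return {x: (freq_d[x] / len(D)) / (freq_e[x] / len(E)) for x in freq_e}


def weighted_fraction(E, weights, value):
    # Weighted estimate, computed on E alone, of how often `value` occurs.
    total = sum(weights[x] for x in E)
    hits = sum(weights[x] for x in E if x == value)
    return hits / total


# Hypothetical toy data: D is confidential, E is public.
D = ["a", "a", "a", "b"]
E = ["a", "a", "b", "b"]

w = importance_weights(D, E)
print("weights:", w)
print("weighted fraction of 'a' in E:", weighted_fraction(E, w, "a"))
print("true fraction of 'a' in D:", D.count("a") / len(D))
```

Once the weights exist, the estimate uses only E and w, and in this toy case it matches the corresponding statistic of D.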
Accounting for Word Burstiness in Topic Models
G. Doyle and C. Elkan
In Proceedings of the 26th International Conference on
Machine Learning (ICML), July 2009
pdf
A fundamental property of
language is that if a word is used once, then it is more
likely to be used again. Previous topic models fail to
capture this burstiness phenomenon. This paper presents
a topic model that uses Dirichlet compound multinomial
distributions to model burstiness. The new model
achieves better goodness of fit in text mining with far
fewer topics than standard latent Dirichlet allocation
(LDA).
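
The burstiness captured by a Dirichlet compound multinomial (DCM) distribution can be seen in its sequential, Polya urn form: the probability of the next word grows with the number of times that word has already appeared. The sketch below shows only this single-distribution behavior, not the topic model from the paper. The three-word vocabulary, the parameter values, and the function names are assumptions for illustration.

```python
import math


def predictive_prob(alpha, counts, w):
    # Probability that the next word is w, given the counts seen so far
    # (Polya urn form of the DCM).
    return (alpha[w] + counts[w]) / (sum(alpha) + sum(counts))


def dcm_log_prob(alpha, counts):
    # Log probability of one particular word sequence with these counts,
    # so the multinomial coefficient is omitted.
    a_sum, n = sum(alpha), sum(counts)
    lp = math.lgamma(a_sum) - math.lgamma(a_sum + n)
    for a, c in zip(alpha, counts):
        lp += math.lgamma(a + c) - math.lgamma(a)
    return lp


# Hypothetical parameters: small values make the distribution bursty.
alpha = [0.1, 0.1, 0.1]

p_first = predictive_prob(alpha, [0, 0, 0], 0)  # word 0 not yet seen
p_again = predictive_prob(alpha, [1, 0, 0], 0)  # word 0 seen once
print("P(word 0 first):", p_first)
print("P(word 0 again):", p_again)
print("log P(word 0 twice):", dcm_log_prob(alpha, [2, 0, 0]))
```

A plain multinomial would give the same probability on both draws; under the DCM the second probability is larger, which is the "used once, more likely to be used again" property.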
Learning to make predictions in networks. Department of
Computer Science and Automation, Indian Institute of Science,
Bangalore, December 21, 2011.
The analytics landscape: A personal view. Indo-US Workshop on
Large Scale Data Analytics and Intelligent Services,
Bangalore, December 20, 2011.
pdf
A vision for reinforcement learning and predictive
maintenance. Keynote talk, Workshop on Data Mining for
Service and Maintenance, ACM International Conference on
Knowledge Discovery in Databases (KDD), August 21, 2011.
pdf
Some sources that have interviewed me, or that have mentioned
research from my group.