The room for the AI seminar is 4140 in the new CSE building on the UCSD campus. Each talk is on a Monday at noon. Free pizza is served; many thanks to Gary Cottrell for providing the funds to continue this tradition. (Note to speakers: If the pizza arrives late, the organizer will ask you to pause at an appropriate moment to allow attendees to get pizza.)
Everyone interested in the AI seminar should join the UCSD AI mailing list. See here for the Winter 2007 schedule. The AI seminar will continue in the summer; see the schedule below.
| date | speaker (* means not confirmed) |
|
title or
topic |
| March 26 | Peter Sunehag | I received my Ph.D. in mathematics from Uppsala University in 2003. The research field of the thesis was theoretical functional analysis. During 2004 and the first half of 2005 I was a mathematics lecturer at Uppsala University. After that I went to Canberra, Australia for a one year visit at the Statistical Machine Learning program of National ICT Australia where I am now employed as a researcher. I am working on NICTA's Document Analysis and Understanding project together with S.V.N. Vishwanathan and Alex Smola. | Using
two-stage conditional word frequency models that capture word
burstiness to motivate TF-IDF Abstract: Several authors have recently studied the problem of creating exchangeable models for natural languages that exhibit word burstiness. Word burstiness means that a word that has appeared once in a text should be more likely to appear again than it was to appear in the first place. In this talk the different existing methods are compared theoretically through a unifying framework. New models that do not satisfy the exchangeability assumption but whose probability revisions only depend on the word counts of what has previously appeared, will be introduced within this framework. I will refer to these models as two-stage conditional presence/abundance models since they, just like some recently introduced models for the abundance of rare species in ecology, separate the issue of presence from the issue of abundance when present. We will see that the widely used TF-IDF heuristic for information retrieval is closely related to these models. |
| April 2 | Chuong
Do Room change today only: CSE 1202 |
Chuong (Tom) Do is a Ph.D. student at Stanford working with Serafim Batzoglou and Andrew Ng. | RNA
secondary structure prediction without physics-based models Abstract: RNA secondary structure refers to the pattern of hydrogen bonds that form between complementary nucleotides within a single RNA sequence. For several decades, the most accurate RNA secondary structure prediction techniques (e.g. Mfold) have relied on "free energy minimization" -- building models of short-range physical interactions in RNA molecules, and computing structures with estimated minimum free energy. Over the past five years, a number of statistical approaches based on stochastic context free grammars (SCFGs) have been proposed for the secondary structure prediction problem; however, none have managed to match the accuracies of the best traditional methods. In this talk, we describe CONTRAfold, a technique for RNA secondary structure prediction based on conditional log-linear models (CLLMs). Unlike previous attempts at applying machine learning to RNA secondary structure prediction, CONTRAfold achieves the best performance on this problem to date. Our result thus closes the gap between probabilistic and physics-based models, demonstrating that statistical learning procedures provide an effective alternative to empirical measurement of thermodynamic parameters for RNA secondary structure prediction. |
| April 9 |
Gert Cauwenberghs | Gert Cauwenberghs is Professor of Biology at UCSD. | Silicon
Learning Machines Abstract: Learning and adaptation are key to biological and artificial intelligence in complex and variable environments. Our research has investigated adaptive microsystems in silicon for real-time signal and information processing that embed learning mechanisms at the cellular and sensory level and attain high levels of efficiency in large-scale parallel computation. The talk will focus on kernel learning machines for pattern recognition from sparse training examples, which include support vector machines and extend to sparse probability estimation and maximum a posteriori forward sequence decoding. Implemented in subthreshold CMOS analog and adiabatic charge-mode mixed-signal VLSI, these learning systems-on-chips offer throughput and energy efficiency in the TeraMACS per milliwatt range. Applications will be illustrated with examples in vision and speech processing. |
| April 16 |
Christian Shelton |
Christian
Shelton is an Assistant Professor of Computer Science at UC
Riverside. He earned his PhD from MIT in 2001.
Since then
he has been a post-doc at Stanford and a visiting faculty member at
Intel Research. He has been the managing editor of the
Journal of
Machine Learning Research since 2003. His research is in the
areas of machine learning and dynamic systems. |
Continuous
Time Bayesian Networks and Network Traffic Monitoring Abstract: Most real-world dynamic systems evolve in continuous time. Modeling them in discrete time (by time slicing) introduces sampling artifacts and instability. In this talk, I will describe continuous time Bayesian networks (CTBNs) as a structured model for continuous time systems. The talk will focus on the representation and some of the learning and inference algorithms for CTBNs. I will conclude with applications of CTBNs to network traffic monitoring and intrusion detection in particular. |
| April 23 | Rik Belew | Rik Belew is now Professor of Cognitive Science at UCSD after transferring from CSE at UCSD a few years ago. | Using hidden Markov models (HMMs) to predict HIV evolution from clinical records Abstract: Even if accurate testing of each patient were possible, if doctors were able to perfectly apply the best and most timely information concerning combinations of modern retrovirusal agents, if the cost of complete patient testing and subsequent drug therapy was of no concern, ... the evolutionary dynamism of the HIV virus would almost guarantee that strains resistant to all available drug therapies will some day threaten some patients. With 50 million currently infected patients, generations of future patients force us to consider a broader "game" in which our species develops new therapeutic agents and combinations of agents, which will inevitably be followed by world-wide evolutionary "moves" by the virus. There is now sufficient data concerning varied evolutionary responses to drug treatment regimes across large patient populations to consider longitudinal trends. Clinical records are used to model patients' progressions as trajectories through a state-space characterized by genotypic assays and other available CD4 and viral load test results. We use Hidden Markov Model (HMM) techniques with this training set to build statistical characterizations of these trajectories that allow, for example, prediction of underlying viral genotypes in the patient at times other than that of the performed assay. |
| April 30 | Jonathan Pearce | Jonathan Pearce is a fourth-year Ph.D. student in computer science at the University of Southern California in the TEAMCORE research group, led by Prof. Milind Tambe. He received his bachelor's and master's degrees in computer science from MIT. He is interested in multiagent systems, distributed constraint reasoning, and game theory. | Local Optimization in Cooperative Agent Networks Abstract: In a large class of multi-agent scenarios, the effect of local interactions between agents can be compactly represented as a network structure such as a distributed constraint optimization problem (DCOP). Collaboration in large networks of agents is often difficult to achieve; often agents can only manage to collaborate in smaller subgroups, in order to find a workable solution in a timely manner. The goal of my research is first, to provide algorithms to enable agents that are bounded in this way to quickly find high-quality solutions, and second, to provide theoretical results in order to understand key properties of these solutions. In particular, my research considers the case in which agents repeatedly optimize in small groups until a local optimum has been reached, for which no group of a certain size can possibly improve the solution. In this talk, I will present several key results about locally optimal DCOP solutions and algorithms, including new guarantees on their solution quality respect to the globally optimal solution. Relevant domains for my work include personal assistant agents, sensor networks, and teams of autonomous robots. |
| May 7 | Mike
Houston |
Mike Houston is a Ph.D. student at Stanford. He earned his B.S. in computer science at UCSD in 2001. | Hidden Markov model implementation for manycore processors Abstract: I'll give a quick intro into GPGPU programming and talk about the implementation of ClawHMMer, a streaming hidden Markov model search implementation. The concentration will be on the conversion of the traditional algorithm into a data parallel, streaming expression suitable for parallel processors. I'll talk about how this maps to GPUs, Cell, clusters of GPUs, clusters of PS3s, and why hmmpfam is tricky to implement because of state tracking requirements. |
| May 14 | Emanuel Todorov | Emo Todorov is Assistant Professor of Cognitive Science at UCSD. | Linearly-solvable Markov decision problems We introduce a class of MDPs which greatly simplify Reinforcement Learning. They have discrete state spaces and continuous control spaces. The controls have the effect of rescaling the transition probabilities of an underlying Markov chain. A control cost penalizing KL divergence between controlled and uncontrolled transition probabilities makes the minimization problem convex, and allows analytical computation of the optimal controls given the optimal value function. An exponential transformation of the optimal value function makes the minimized Bellman equation linear. Apart from their theoretical significance, the new MDPs enable efficient approximations to traditional MDPs. Shortest path problems are approximated to arbitrary precision with largest eigenvalue problems, yielding an O(n) algorithm. Accurate approximations to generic MDPs are obtained via continuous embedding reminiscent of LP relaxation in integer programming. Off-policy learning of the optimal value function is possible without need for state-action values; the new algorithm (Z-learning) outperforms Q-learning. |
| May 21 | Rob Malouf | Rob Malouf is Assistant Professor of Linguistics and a faculty member in the Computational Linguistics MA Program at San Diego State University. His Ph.D. is from Stanford. | User classification for informal on-line political discourse Weblogs, message boards, chat rooms logs, and other informal web texts are an incredibly rich knowledge base. However, the nature of these texts often makes them unsuitable for analysis using many standard natural language processing methods. This talk will describe on-going work in classification of participants in informal on-line political discourse. Lexically-oriented methods, such as sentiment analysis and text classification, have worked well in other domains but perform poorly on this task. We have achieved better results by augmenting these methods with social network analysis, exploiting information about the structure of the on-line community and the ways that users interact with each other. The same interactive quality of the texts which creates challenges for more traditional NLP techniques also creates opportunities for new types of analysis. |
| May 28 | Memorial Day |
no talk |
|
| June 4 | Filippo Menczer | Filippo Menczer is Associate Professor in the School of Informatics at Indiana University, Bloomington. He earned his Ph.D. in CSE at UCSD. | Googlearchy
or democracy? How search affects Web traffic and evolution Abstract: Search engines have become key media for our scientific, economic, and social activities by enabling people to access information on the Web in spite of its size and complexity. On the down side, search engines bias the traffic of users according to their page-ranking strategies, and some have argued that they create a vicious cycle that amplifies the dominance of established and already popular sites. We show that, contrary to these prior claims and our own intuition, the use of search engines actually has an egalitarian effect. We reconcile theoretical arguments with empirical evidence showing that the combination of retrieval by search engines and search behavior by users mitigates the attraction of popular pages, directing more traffic toward less popular sites, even in comparison to what would be expected from users randomly surfing the Web. We then extend the analysis of traffic to a general model of search-driven network growth that predicts the topological properties of the Web graph. Joint work with Santo Fortunato, Alessandro Flammini, and Alessandro Vespignani. |
| July 2 | Geoff Webb | Geoff Webb is Professor of Computer Science at Monash University in Melbourne, Australia, and Editor-in Chief of the Data Mining and Knowledge Discovery Journal. | On the efficacy of Occam's Razor as a model selection criterion for classification learning Abstract: The principle of Occam's (alternatively Ockham's) Razor is deeply entrenched in Western thought. In the machine learning context, it has often appeared as a belief that expected error is minimized by selecting the least complex of two classifiers, all other things being equal. In this talk I argue that this is false, presenting both theoretical and experimental support. I also provide an alternative model selection criterion, based on generality, that can effectively select between models when all other things are equal. |
| July 16 | Dashan Gao | Dashan Gao is a Ph.D. candidate in ECE at UCSD. His adviser is Nuno Vasconcelos. | A
Discriminant Framework For Saliency Detection Abstract: It has long been known that saliency-based attention mechanisms play an important role for biological visual systems to perform complex tasks. In this work, we propose a new hypothesis that saliency is a discriminant process: salient visual attributes are those that most help visual systems decide between different hypotheses regarding the nature of the visual stimuli which they are subject to. We have shown that this formulation leads to a top-down saliency detector that outperforms the classical detectors for visual recognition problems. In this work, we seek the connections between this hypothesis and biological vision, for which we implement a bottom-up saliency detector with resort to "center-surround" mechanism and natural image statistics. This saliency model can not only be implemented in a biologically plausible network, but also it explains some important properties in both neurophysiology and psychophysics aspects of biological vision, such as the divisive normalization structure of simple cells in the primary visual cortex (V1) and the phenomenon of asymmetries in visual search. Finally, it provides a holistic justification for the functionality of V1. |
Most recently updated on May 22, 2007 by Charles Elkan, elkan@cs.ucsd.edu