DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
UNIVERSITY OF CALIFORNIA, SAN DIEGO


CSE 254: Conditional random fields and related topics

Tutorials and papers


Many thanks to Doug Turnbull and Eric Wiewiora for contributing to the information below.

Each link below should lead to a web page where the full text of the paper can be found.  In many cases, other interesting papers are also on these web pages.  Participants in the seminar should feel free to propose papers not on the list here, if these other papers describe high-quality research and are worthwhile to present.  The papers listed here are definitely interesting and worthwhile. 

FOUR TUTORIALS

Hanna M. Wallach.  Conditional Random Fields: An Introduction.  Technical Report MS-CIS-04-21. Department of Computer and Information Science, University of Pennsylvania, 2004.

Charles Sutton and Andrew McCallum.  An Introduction to Conditional Random Fields for Relational Learning.  In Introduction to Statistical Relational Learning. Edited by Lise Getoor and Ben Taskar. MIT Press, 2006. 

Rahul Gupta.  Conditional Random Fields.  Unpublished report, IIT Bombay, 2006.

Roland Memisevic.  An introduction to structured discriminative learning.  Unpublished report, University of Toronto, 2006.

All four surveys above are very good.  The excellent report by Memisevic places CRFs in the context of other methods for learning to predict complex outputs, especially SVM-inspired large-margin methods.

Comments from a student:  "The Wallach tutorial was easy-to-comprehend and provided some high level intuition, but was not comprehensive.  I preferred Sutton's tutorial which provides a long discussion containing many useful and interesting insights.  One conceptual difference between the two tutorials is that Wallach represents CRFs as undirected graphical models, whereas Sutton uses undirected factor graphs.  I prefer factor graphs since they are a very natural and intuitive representation.  Sutton also sets up the comparison between naive Bayes and logistic regression graphical models, and HMMs and Linear-Chain graphical models.  This gives the reader a nice point of comparison if they have experience with NB classifiers and/or HMMs.  I found Section 1.4.2 "Application of CRF's" in Sutton's tutorial to be particularly useful since it provides a relatively current review of current work on CRF broken down by research topic (text-document modeling/NLP, bioinformatics, and Computer Vision).  They also briefly touch on some extension of CRF (dynamic CRFs, multi-label classification)."

For a related course with many links to papers and other resources, see Topics in Machine Learning: Learning to Predict Structured Objects taught by Thorsten Joachims at Cornell.

RESEARCH PAPERS 

Bibliographies on CRFs have been compiled by Rahul Gupta and Hanna Wallach.  The following papers may be particularly interesting or useful (in approximate chronological order).  Note that several are on topics related to CRFs, not on CRFs directly.

Michael Collins.  Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms.  Proceedings of the ACL-02 conference on Empirical methods in natural language processing, pp. 1-8, 2002.  

Sham Kakade, Yee Whye Teh, Sam T. Roweis.  An Alternate Objective Function for Markovian Fields.  ICML 2002.

Andrew McCallum.  Efficiently Inducing Features of Conditional Random Fields.  In Proceedings of the 19th Conference in Uncertainty in Artificial Intelligence (UAI-2003), 2003.

Yasemin Altun and Thomas Hofmann.  Large Margin Methods for Label Sequence Learning.  In Proceedings of 8th European Conference on Speech Communication and Technology (EuroSpeech), 2003.

Sanjiv Kumar and Martial Hebert.  Discriminative random fields: A discriminative framework for contextual interaction in classification.  In Proceedings of the Ninth IEEE International Conference on Computer Vision, 2003.

Ben Taskar, Carlos Guestrin and Daphne Koller.  Max-Margin Markov Networks.  In Advances in Neural Information Processing Systems 16 (NIPS 2003), 2004.

Thomas G. Dietterich, Adam Ashenfelter and Yaroslav Bulatov.  Training Conditional Random Fields via Gradient Tree Boosting.  In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004.

Xuming He, Richard Zemel, and Miguel Á. Carreira-Perpiñán.  Multiscale conditional random fields for image labelling.  In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), 2004.

Vladimir Kolmogorov and Ramin Zabih.  What Energy Functions can be Minimized via Graph Cuts?  In IEEE Transactions on Pattern Analysis and Machine Intelligence, February 2004.

C. Sutton, A. McCallum.  Collective segmentation and labeling of distant entities in information extraction.  ICML Workshop on Statistical Relational Learning, 2004.

Ioannis Tsochantaridis, Thorsten Joachims, Thomas Hofmann, Yasemin Altun.  Large Margin Methods for Structured and Interdependent Output Variables.  JMLR, December 2005.
 
Hal Daumé III, John Langford, and Daniel Marcu.  Search-Based Structured Prediction.  Submitted to Machine Learning, 2006.

Samuel Gross, Olga Russakovsky, Chuong Do, and Serafim Batzoglou.  Training conditional random fields for maximum labelwise accuracy.  In Advances in Neural Information Processing Systems 19 (NIPS), December 2006.