Hidden-Unit Conditional Random Fields
Introduction
The hidden-unit conditional random field (CRF) is a model for structured prediction that is more powerful than standard linear CRFs. The additional modeling power of hidden-unit CRFs stems from their binary stochastic hidden units, which model latent data structure that is relevant to classification. Because the hidden units are conditionally independent given the data and the labels, they can be marginalized out efficiently during inference. The difference between hidden-unit CRFs and linear CRFs is illustrated in these factor graphs:
Figure 1: Linear CRF.
Figure 2: Hidden-unit CRF.
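The efficiency of the marginalization can be illustrated with a small sketch. It is not taken from the provided Matlab code; it assumes only that each binary hidden unit h_k contributes a factor exp(h_k * a_k) to the potential, where a_k is that unit's total input given the data and the label. The function names and the example activations are hypothetical. Under that assumption, the sum over all 2^K hidden configurations factorizes into a product of K terms:

```python
import itertools
import math


def marginal_brute_force(activations):
    """Sum exp(sum_k h_k * a_k) over every binary vector h (2^K terms)."""
    total = 0.0
    for h in itertools.product((0, 1), repeat=len(activations)):
        total += math.exp(sum(hk * ak for hk, ak in zip(h, activations)))
    return total


def marginal_closed_form(activations):
    """Same quantity as a product of K per-unit terms: prod_k (1 + exp(a_k))."""
    result = 1.0
    for ak in activations:
        result *= 1.0 + math.exp(ak)
    return result


# Hypothetical activations a_k for K = 4 hidden units.
activations = [0.5, -1.2, 2.0, 0.0]
brute = marginal_brute_force(activations)
closed = marginal_closed_form(activations)
```

The two values agree up to floating-point error, but the closed form costs K operations instead of 2^K, which is what makes inference tractable when many hidden units are used.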
Hidden-unit conditional random fields are described in detail in the following paper:
• L.J.P. van der Maaten, M. Welling, and L.K. Saul. Hidden-Unit Conditional Random Fields. In Proceedings of the International Conference on Artificial Intelligence & Statistics (AISTATS), JMLR W&CP 15:479-488, 2011. [ PDF ]
NOTE: Please cite this paper if you use this code!
Related work
The online training algorithms for hidden-unit CRFs are closely related to conditional herding:
• A. Gelfand, L.J.P. van der Maaten, Y. Chen, and M. Welling. On Herding and the Perceptron Cycling Theorem. In Advances in Neural Information Processing Systems (NIPS), volume 23, pages 694-702, 2010. [ PDF ]
The individual predictors in hidden-unit CRFs are so-called discriminative RBMs. Discriminative RBMs can be shown to be universal approximators of p(y|x) for discrete data:
• L.J.P. van der Maaten. Discriminative Restricted Boltzmann Machines are Universal Approximators for Discrete Data. Technical Report EWI-PRB 2011-001, Delft University of Technology, The Netherlands, 2011. [ PDF ]
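As a rough illustration of what a single discriminative RBM predictor computes, the sketch below evaluates p(y|x) with the hidden units summed out. It is a minimal sketch and not the provided Matlab implementation: it assumes the usual discriminative RBM energy with input-to-hidden weights W, label-to-hidden weights V, hidden biases b, and label biases c, and all names and shapes here are chosen for illustration only.

```python
import math


def softplus(a):
    """Numerically stable log(1 + exp(a))."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))


def class_posterior(x, W, V, b, c):
    """Return p(y | x) for every label y, with binary hidden units summed out.

    x: input features (length D)
    W: input-to-hidden weights (K rows of length D)
    V: label-to-hidden weights (K rows of length C)
    b: hidden biases (length K)
    c: label biases (length C)
    """
    num_hidden = len(b)
    # The input's contribution to each hidden unit is shared by all labels.
    input_act = [b[k] + sum(w * xi for w, xi in zip(W[k], x))
                 for k in range(num_hidden)]
    # Unnormalized log-probability of each label:
    # c_y + sum_k log(1 + exp(input_act_k + V_ky)).
    log_scores = [
        c[y] + sum(softplus(input_act[k] + V[k][y]) for k in range(num_hidden))
        for y in range(len(c))
    ]
    # Normalize with the log-sum-exp trick to avoid overflow.
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    z = sum(exps)
    return [e / z for e in exps]


# Hypothetical toy model: D = 2 inputs, K = 2 hidden units, C = 2 labels.
probs = class_posterior(
    x=[1.0, 0.0],
    W=[[0.5, -0.3], [0.1, 0.2]],
    V=[[1.0, -1.0], [0.0, 0.0]],
    b=[0.0, 0.0],
    c=[0.0, 0.0],
)
```

In this toy model the first hidden unit favors label 0, so `probs[0]` comes out larger than `probs[1]`. In a hidden-unit CRF, per-position scores of this kind would be combined with label transition scores along the chain.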
Legal
Code provided by Laurens van der Maaten, 2011. The author of this code does not take any responsibility for damage that results from bugs in the provided code. This code can be used for non-commercial purposes only. Please contact the author if you would like to use this code commercially.
Software
We provide Matlab code that implements the training and evaluation of hidden-unit CRFs, as well as code to reproduce the results of our experiments. The code implements four different training algorithms: (1) a batch learner that uses L-BFGS, (2) a stochastic gradient descent learner, (3) an online perceptron training algorithm, and (4) an online large-margin perceptron algorithm. The code can also be used to perform (conditional) herding in hidden-unit CRFs.
The following files are available for download:
• Matlab code (.zip; 168 KB)
• OCR data set (.zip; 30.8 MB; data set courtesy of Ben Taskar)
• FAQ data set (.zip; 127 KB; data set courtesy of Andrew McCallum)
• CB513 data set (.zip; 2.2 MB)
• Penn Treebank corpus (.zip; 36.0 MB)
Problems / Bugs / Questions?
Feel free to drop me a line.