DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
UNIVERSITY OF CALIFORNIA, SAN DIEGO
CSE 254: Conditional random fields (CRFs) and related topics
Software for log-linear models and CRFs
LOG-LINEAR MODELS
The Tsujii lab at the University of Tokyo has well-documented software named
Amis and SS Maxent for training log-linear classifiers. However, I recommend
writing your own code on top of an off-the-shelf general nonlinear
optimization package, for example an implementation of L-BFGS.
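As a minimal sketch of this approach, the Python code below trains a multiclass log-linear classifier by handing the L2-regularized negative conditional log-likelihood and its gradient to SciPy's general-purpose optimizer with method="L-BFGS-B". The choice of Python and SciPy, the function names, the toy data, and the penalty strength are all assumptions made for illustration; none of them come from the packages named above.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp


def neg_log_likelihood(w_flat, X, y, n_classes, l2):
    """Return the L2-regularized negative conditional log-likelihood and its gradient."""
    n, d = X.shape
    W = w_flat.reshape(n_classes, d)
    scores = X @ W.T                         # shape (n, n_classes)
    log_z = logsumexp(scores, axis=1)        # log partition function per example
    log_p = scores - log_z[:, None]          # log p(class | x)
    nll = -log_p[np.arange(n), y].sum() + 0.5 * l2 * np.sum(W ** 2)
    # Gradient: expected feature counts minus observed feature counts.
    P = np.exp(log_p)
    P[np.arange(n), y] -= 1.0
    grad = P.T @ X + l2 * W
    return nll, grad.ravel()


def train(X, y, n_classes, l2=1.0):
    """Minimize the objective with L-BFGS, starting from all-zero weights."""
    w0 = np.zeros(n_classes * X.shape[1])
    result = minimize(neg_log_likelihood, w0, args=(X, y, n_classes, l2),
                      jac=True, method="L-BFGS-B")
    return result.x.reshape(n_classes, X.shape[1]), result


def predict(W, X):
    """Return the highest-scoring class for each row of X."""
    return np.argmax(X @ W.T, axis=1)


# Toy data (hypothetical): three well-separated clusters plus a constant bias feature.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
y = rng.integers(0, 3, size=300)
X = centers[y] + rng.normal(scale=0.5, size=(300, 2))
X = np.hstack([X, np.ones((300, 1))])

W, result = train(X, y, n_classes=3, l2=0.1)
accuracy = np.mean(predict(W, X) == y)
```

The only model-specific code is the function that returns the objective value and its gradient; the optimizer is treated as a black box. This is the main reason for preferring a general-purpose package: changing the features or the regularizer only requires changing that one function.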
NONLINEAR OPTIMIZATION
A common opinion among machine learning researchers is that the
limited-memory BFGS implementation by Jorge Nocedal is the best-performing. Liam Stewart has published a Matlab wrapper
for this code. You may also use the nonlinear optimization
routines in the Matlab Optimization Toolbox. An interesting
method that, to my knowledge, has not been applied in machine learning
but might be useful is ve08. It is designed for objective functions that
are sums of components, where each component depends on only a small
subset of all the variables.
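To make the structure concrete, here is a small Python sketch of an objective of this kind: a sum of components, each of which reads only two of the variables. The chained Rosenbrock function and the names element and objective are illustrative assumptions. The sketch shows only the shape of such a function and how its gradient is assembled from small local gradients; it does not reproduce the algorithm or the interface of ve08.

```python
def element(a, b):
    """One component of the chained Rosenbrock function; it depends on two variables only."""
    value = 100.0 * (b - a * a) ** 2 + (1.0 - a) ** 2
    d_a = -400.0 * a * (b - a * a) - 2.0 * (1.0 - a)
    d_b = 200.0 * (b - a * a)
    return value, (d_a, d_b)


def objective(x):
    """Sum the components and scatter each small local gradient into the full gradient."""
    total = 0.0
    grad = [0.0] * len(x)
    for i in range(len(x) - 1):
        value, (d_a, d_b) = element(x[i], x[i + 1])
        total += value
        grad[i] += d_a        # component i touches only x[i] and x[i + 1]
        grad[i + 1] += d_b
    return total, grad


# Evaluate at the known minimizer (all ones) and at the origin, with 1000 variables.
value_at_min, grad_at_min = objective([1.0] * 1000)
value_at_zero, grad_at_zero = objective([0.0] * 1000)
```

The training objective of a log-linear model has a similar shape when features are sparse: it is a sum over training examples, and each example's term involves only the weights of the features that are active for that example.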
Ian Fasel provided the following information: "I spent a few
moments figuring out how to build lbfgs on Mac OS X (since the
developer tools don't come with a built-in fortran compiler.) You
can buy the Absoft compilers, but it seems that gfortran works fine
with Matlab 7. The following instructions seem to work -- though
I've only tried testing it with tst_lbfgs.m so far. I'm using a
G5 PowerMac, Tiger, things could be different for an Intel based mac.
1) Grab MATLAB LBFGS wrapper v1.1 http://www.cs.toronto.edu/~liam/software.shtml
2) Get routines.f from the site linked there.
3) Follow the instructions on http://hpc.sourceforge.net/ to install gfortran on your computer.
4) cd to the directory lbfgs-1.1/lbfgs, then copy a mexopts.sh file to
here, e.g., cp /Applications/MATLAB73/bin/mexopts.sh .
5) Modify the local mexopts.sh file. Find the section labeled "mac)" and change as follows:
#FC='f77'
#FFLAGS='-f -N15 -N11 -s -Q51 -W'
#ABSOFTLIBDIR=`which $FC | sed -n -e '1s|bin/'$FC'|lib|p'`
#FLIBS="-L$ABSOFTLIBDIR -lfio -lf77math"
#FOPTIMFLAGS='-O -cpu:g4'
FC='/usr/local/bin/gfortran'
FFLAGS=
FLIBS="-L/usr/local/lib -lgfortran"
FOPTIMFLAGS='-O5 -funroll-loops -ftree-vectorize'
In other words, comment out the Absoft stuff, and replace with gfortran
stuff. You may want to experiment with optimization flags."
CONDITIONAL RANDOM FIELDS
Hanna Wallach has compiled a list of CRF-related software packages. Andrew McCallum's Mallet, written in Java, is recommended. CRF++
by Taku Kudo, written in C++, is also good.
Also try crfsmd, an extended version of CRF++ that adds stochastic
gradient optimization.
Other packages that are likely to be good: CRFs in Java by Sunita Sarawagi, and large-scale parallel flexible CRFs by Xuan-Hieu Phan and
Le-Minh Nguyen.
OTHER SOFTWARE
For a non-CRF approach to structured prediction, see Searn by Hal Daumé. For finding the optimal labels for nodes in special cases of very large random fields, see the Middlebury MRF Energy Minimization Page.