Introduction

We investigate a Gaussian latent variable model for semi-supervised learning of linear large margin classifiers. The goal of semi-supervised learning is to build predictive models from small collections of labeled examples and large collections of unlabeled ones. For details, please read our paper.

Publication

Do-kyum Kim, Matthew Der and Lawrence K. Saul.
A Gaussian Latent Variable Model for Large Margin Classification of Labeled and Unlabeled Data.
In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014). Reykjavik, Iceland.
[paper] [supplement]

People

Source code

You can find our implementation on GitHub.

Data sets

These are the data sets we used in the paper. Each tar archive contains a term-document matrix and twelve random splits for each of several numbers of labeled examples. Each split file is named '*_splits12_L#.mat', where '*' is the name of the data set and '#' is the number of labeled examples. In each 'mat' file, the variable 'idxLabs' contains the indices of the labeled examples in the split; all other examples are used as unlabeled examples.
[20-Newsgroups] [ccat] [gcat] [aut-avn] [real-sim] [Freelancer]
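
The split-file convention described above can be sketched in Python as follows. This is a minimal illustration, not part of the released implementation. The helper names (split_file_name, partition_examples, load_idx_labs) are hypothetical. The sketch also assumes that 'idxLabs' holds MATLAB-style 1-based indices and that the files are in a MAT format that SciPy's loadmat can read (version 7 or earlier). How 'idxLabs' arranges the twelve splits is not stated here, so inspect its shape before selecting a single split.

```python
def split_file_name(dataset, n_labeled):
    """Build a file name following the '*_splits12_L#.mat' pattern."""
    return f"{dataset}_splits12_L{n_labeled}.mat"


def partition_examples(labeled_idx, n_examples, one_based=True):
    """Return (labeled, unlabeled) zero-based index lists for ONE split.

    one_based=True assumes MATLAB-style 1-based indices; this is an
    assumption, so pass one_based=False if the indices start at 0.
    """
    offset = 1 if one_based else 0
    labeled = sorted({int(i) - offset for i in labeled_idx})
    if labeled and (labeled[0] < 0 or labeled[-1] >= n_examples):
        raise ValueError("labeled index out of range")
    labeled_set = set(labeled)
    # Every example that is not labeled is treated as unlabeled.
    unlabeled = [i for i in range(n_examples) if i not in labeled_set]
    return labeled, unlabeled


def load_idx_labs(path):
    """Read the 'idxLabs' variable from a split file (requires SciPy)."""
    # Imported lazily so the helpers above work without SciPy installed.
    from scipy.io import loadmat
    return loadmat(path)["idxLabs"]


if __name__ == "__main__":
    # The 'ccat' prefix is assumed to match the archive's actual file names.
    print(split_file_name("ccat", 100))
    labeled, unlabeled = partition_examples([1, 3, 5], n_examples=6)
    print(labeled, unlabeled)
```

Here n_examples would be the number of documents in the term-document matrix; whether documents are stored as rows or columns should be checked against the archive itself.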

Experimental results

These are the experimental results we reported in our paper:
[20-Newsgroups] [ccat] [gcat] [aut-avn] [real-sim] [Freelancer]