
Other uses of vector space

The same inter-document similarity information captured in the $X = J J^{T}$ matrix can be used for other purposes, too. For example, §7.5.1 will discuss nearest neighbor, one approach to the problem of CLASSIFYING documents.
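As a rough illustration of how $X$ could feed such a classifier, the sketch below labels one document with the label of its most similar labeled neighbor. It assumes $J$ is stored as a list of rows, one per document and one column per keyword (the orientation implied by $J J^{T}$ being document-by-document). The toy counts, the labels, and the helper names `doc_similarities` and `nearest_neighbor_label` are hypothetical, and this is not necessarily the method developed in §7.5.1.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def doc_similarities(J):
    """Return X = J J^T for a documents-by-keywords matrix J (list of rows)."""
    return [[dot(di, dj) for dj in J] for di in J]


def nearest_neighbor_label(X, labels, query):
    """Return the label of the labeled document most similar to `query` under X."""
    candidates = [d for d in range(len(X))
                  if d != query and labels[d] is not None]
    best = max(candidates, key=lambda d: X[query][d])
    return labels[best]


# Toy data (assumed): 4 documents over 4 keywords; document 3 is unlabeled.
J = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 2],
    [0, 1, 2, 1],
]
labels = ["A", "A", "B", None]

X = doc_similarities(J)
predicted = nearest_neighbor_label(X, labels, query=3)
print(predicted)
```

Raw inner products favor long documents, so a practical version would probably normalize the rows of $J$ first; that choice is left open here.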

The $J$ matrix captures patterns of keyword usage across a corpus of documents. The preceding sections have held the corpus constant and used this data to analyze transformations of the keyword dimensions, but the converse is also possible. For example, §6.3 will discuss the representation of inter-keyword relationships known as THESAURI. One simple baseline for keywords is their pairwise similarities, as captured by $J$:

$$Y = J^{T} J$$

This produces a square, symmetric $V \times V$ matrix capturing all $\binom{V}{2}$ inter-keyword similarities, exactly analogous to the inter-document similarities of $X = J J^{T}$.
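The keyword-by-keyword product can be checked on a small example. The sketch below again assumes $J$ has one row per document and one column per keyword, uses made-up binary occurrence counts, and sticks to plain Python lists; the helper names `transpose` and `matmul` are illustrative only.

```python
def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Plain-Python matrix product A B."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


# Toy data (assumed): 4 documents (rows) over V = 3 keywords (columns).
J = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]

Y = matmul(transpose(J), J)  # V x V inter-keyword similarities
V = len(Y)

# The distinct keyword pairs sit above the diagonal of Y.
pairs = [(i, j) for i in range(V) for j in range(i + 1, V)]

for row in Y:
    print(row)
print(len(pairs), "distinct keyword pairs")
```

With these binary counts, each off-diagonal entry of $Y$ is the number of documents in which the two keywords co-occur, and each diagonal entry is the number of documents containing that keyword.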

Littman has also considered an interesting application of LSI to the problem of searching across multilingual corpora [Littman98].


FOA © R. K. Belew - 00-09-21