FOA Home | UP: Weighting and matching against Indices


Matching queries against documents

In Chapter 2 we first identified documents, and then lexical features to be associated with each. Then we built an inverted keyword list to make going from keywords to documents about those keywords as easy as possible. Now we become specific about how we measure the similarity between a document and a query.

The discussion of matching queries and documents is simplified if we adopt the vector space perspective of Section §3.4 and imagine both the query $\mathbf{q}$ and all documents $\mathbf{d}$ to be vectors in a space of dimensionality equal to the $\mathname{NKw}$, the keyword vocabulary size.

In this space, the answer to the question of which documents are the best match for a query seems straightforward - those documents which are most similar, relative to some particular METRIC $\mathname{Sim}$ measuring distance between points in the space. Students of algebra and abstract vector spaces know that a wide range of choices are possible; see Section §5.2.2 .

Subsections


Top of Page | UP: Weighting and matching against Indices | ,FOA Home


FOA © R. K. Belew - 00-09-21