

Odds calculation

In addition to the evidence the features provide that a document is relevant, $\Pr(\mathname{Rel} | {\bf x})$, they may also provide evidence that it is not: $\Pr(\overline{\mathname{Rel}} | {\bf x})$. One way to know that we are talking about a ``rational'' world is to say that these two probabilities are complementary:
\[
\Pr(\mathname{Rel} | {\bf x}) = 1 - \Pr(\overline{\mathname{Rel}} | {\bf x})
\]

An odds calculation balances both probabilities as a ratio:
\[
\mathname{Odds}(\mathname{Rel} | {\bf x}) \equiv {\Pr(\mathname{Rel} | {\bf x})\over \Pr(\overline{\mathname{Rel}} | {\bf x})}
\]

Bayes' Rule can also be applied to this ratio. Applying it to the numerator and the denominator separately, the common factor $\Pr({\bf x})$ cancels:
\begin{eqnarray*}
\mathname{Odds}(\mathname{Rel} | {\bf x}) &=& {\Pr(\mathname{Rel} | {\bf x})\over \Pr(\overline{\mathname{Rel}} | {\bf x})} \\
&=& {\Pr(\mathname{Rel})\over \Pr(\overline{\mathname{Rel}})}\cdot {\Pr({\bf x} | \mathname{Rel})\over \Pr({\bf x} | \overline{\mathname{Rel}})} \\
&=& \mathname{Odds}(\mathname{Rel}) \cdot {\Pr({\bf x} | \mathname{Rel})\over \Pr({\bf x} | \overline{\mathname{Rel}})}
\end{eqnarray*}

The first term, the prior odds $\mathname{Odds}(\mathname{Rel})$, will be small: the odds of picking a relevant rather than an irrelevant document, independent of any features of the document, are not good. Still, these odds can be expected to be a characteristic of the entire corpus and insensitive to any analysis we might perform on particular documents.
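To make the Bayes-rule form of the odds concrete, here is a minimal Python sketch. The function names and the three input probabilities are hypothetical, chosen only for illustration; nothing here is taken from an actual corpus. It multiplies the prior odds by the ratio $\Pr({\bf x} | \mathname{Rel}) / \Pr({\bf x} | \overline{\mathname{Rel}})$ and then converts the resulting odds back into a probability.

```python
def odds(p):
    """Convert a probability p (0 <= p < 1) into odds p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def posterior_odds(prior_rel, lik_rel, lik_nonrel):
    """Odds(Rel | x) = Odds(Rel) * Pr(x | Rel) / Pr(x | not Rel)."""
    if lik_nonrel <= 0.0:
        raise ValueError("Pr(x | not Rel) must be positive")
    return odds(prior_rel) * (lik_rel / lik_nonrel)


def odds_to_probability(o):
    """Invert the odds transform: p = o / (1 + o)."""
    return o / (1.0 + o)


# Hypothetical numbers, assumed purely for illustration.
prior_rel = 0.01    # Pr(Rel): relevant documents are rare in the corpus
lik_rel = 0.30      # Pr(x | Rel): features x are common among relevant documents
lik_nonrel = 0.02   # Pr(x | not Rel): and rare among irrelevant ones

post = posterior_odds(prior_rel, lik_rel, lik_nonrel)
print("Odds(Rel | x) =", post)
print("Pr(Rel | x)   =", odds_to_probability(post))
```

With these assumed inputs the prior odds are about 0.01, the second term is 15, and the posterior odds come out near 0.15, which corresponds to a probability of relevance of about 0.13. The small prior keeps the result modest even when the features favor relevance strongly.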

In order to calculate the second term, we need a more refined model of how documents are ``constructed'' from their features.




FOA © R. K. Belew - 00-09-21