

Cost analysis

The discussion of §4.3.4 suggests that, as with most real-world decisions, there will be no perfect way to select relevant documents. Even if we can satisfy the Probability Ranking Principle (PRP) and order all the documents, there remains a retrieval threshold to set. If we set the threshold too high, we will not show users documents they might wish to see; if we set it too low, we will show them too many.

But we can capture this tension by associating two COSTS (a.k.a. losses) with each of two possible sources of error. The first cost $C_{RN}$ is incurred when we retrieve an irrelevant document; the second $C_{NR}$ when we don't retrieve a relevant document. To make these costs concrete, we might imagine that there is a limited resource (hitlist screen real estate, user search time), and the first cost is proportional to using up this precious resource. Similarly, the second cost might be proportional to the cost of losing a malpractice suit, when a legal case on point wasn't found but should have been!
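As a minimal sketch of how the two costs interact with a threshold, the following assumes that each ranked document comes with an estimated probability of relevance and that correct decisions cost nothing. Neither assumption is stated in the text, and the function names and the sample probabilities are hypothetical, chosen only for illustration.

```python
def expected_cost(probs, k, c_rn, c_nr):
    """Expected cost of retrieving the first k documents of a ranked list.

    probs: estimated relevance probabilities, sorted in decreasing order
           (an assumption of this sketch).
    c_rn:  cost of retrieving an irrelevant document.
    c_nr:  cost of failing to retrieve a relevant document.
    """
    retrieved, withheld = probs[:k], probs[k:]
    false_alarms = sum(1.0 - p for p in retrieved)  # expected irrelevant documents shown
    misses = sum(withheld)                          # expected relevant documents withheld
    return c_rn * false_alarms + c_nr * misses


def best_cutoff(probs, c_rn, c_nr):
    """Return the cutoff k in 0..len(probs) with the smallest expected cost."""
    return min(range(len(probs) + 1),
               key=lambda k: expected_cost(probs, k, c_rn, c_nr))


ranked = [0.9, 0.7, 0.4, 0.2, 0.1]  # hypothetical relevance probabilities

print(best_cutoff(ranked, c_rn=1.0, c_nr=1.0))  # 2
print(best_cutoff(ranked, c_rn=1.0, c_nr=5.0))  # 4
```

With equal costs the cheapest cutoff keeps only the documents more likely relevant than not; making a miss five times as expensive pushes the cutoff further down the list.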

In terms of the ranking function of Equation (FOAref), the trade-off between these two costs can be realized by another term added to the constant $\kappa$ of Equation (FOAref). If correct decisions are assumed to cost nothing, this term is the log of the ratio of the two costs, $\log\,\frac{C_{RN}}{C_{NR}}$. We can easily imagine adding a knob to a browser reflecting the trade-off between these two costs [Russel93].
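One way to see why a cost-ratio term arises is the usual minimum-expected-cost argument. The sketch below assumes, beyond what the text states, that correct decisions cost nothing; the symbol $p$, the estimated probability that a given document is relevant, is introduced here only for illustration.

```latex
\begin{align*}
\mathbb{E}[\text{cost} \mid \text{retrieve}] &= (1-p)\,C_{RN}, \\
\mathbb{E}[\text{cost} \mid \text{withhold}] &= p\,C_{NR}, \\
\text{retrieve} \iff (1-p)\,C_{RN} < p\,C_{NR}
  &\iff \log\frac{p}{1-p} > \log\frac{C_{RN}}{C_{NR}}.
\end{align*}
```

Under these assumptions the two costs enter the decision only through their ratio, which is the single quantity such a knob would need to control: raising $C_{NR}$ relative to $C_{RN}$ lowers the bar and retrieves more documents.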




FOA © R. K. Belew - 00-09-21