Full Text Search - The Precision Vs. Recall Tradeoff

Recall measures the quantity of relevant results returned by a search, and precision measures the quality of the results returned. Recall is the number of relevant results returned divided by the total number of relevant results that exist. Precision is the number of relevant results returned divided by the total number of results returned.

The diagram at right represents a low-precision, low-recall search. In the diagram the red and green dots represent the total population of potential search results for a given search. Red dots represent irrelevant results, and green dots represent relevant results. Relevancy is indicated by the proximity of search results to the center of the inner circle. Of all possible results shown, those that were actually returned by the search are shown on a light-blue background. In the example only one relevant result of three possible relevant results was returned, so the recall is a very low ratio of 1/3 or 33%. The precision for the example is a very low 1/4 or 25%, since only one of the four results returned was relevant.
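
The arithmetic in the example can be checked with a minimal Python sketch. The function name `precision_recall` and the document IDs are hypothetical, chosen only to mirror the diagram: three relevant documents exist, four documents are returned, and one of the returned four is relevant.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for two collections of result IDs."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant  # relevant results that were actually returned
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical IDs: "g" for green (relevant) dots, "r" for red (irrelevant) dots.
relevant = {"g1", "g2", "g3"}
returned = {"g1", "r1", "r2", "r3"}

precision, recall = precision_recall(returned, relevant)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
# prints: precision = 0.25, recall = 0.33
```

Both measures use the same numerator, the relevant results that were returned. Only the denominator differs, which is why a search can score well on one measure and poorly on the other.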

Because natural language is ambiguous, full text search systems typically include options such as stop words to increase precision and stemming to increase recall. Controlled-vocabulary searching also helps alleviate low-precision issues by tagging documents in such a way that ambiguities are eliminated. The trade-off between precision and recall is simple: an increase in precision can lower overall recall, while an increase in recall can lower precision.
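
The effect of stemming on the trade-off can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the five-document corpus, the hand-assigned relevance labels, and `crude_stem`, which is a deliberately naive suffix stripper rather than a real stemming algorithm.

```python
def crude_stem(word):
    """Toy suffix stripper; an illustrative stand-in, not a real stemmer."""
    for suffix in ("ning", "ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search(query, docs, normalize=lambda w: w):
    """Return IDs of documents containing the (normalized) query term."""
    term = normalize(query.lower())
    return {
        doc_id
        for doc_id, text in docs.items()
        if term in {normalize(w) for w in text.lower().split()}
    }


def precision_recall(returned, relevant):
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical corpus; the searcher wants documents about running as exercise.
docs = {
    1: "running shoes for marathon training",
    2: "she runs five miles daily",
    3: "tips for a morning run",
    4: "the server runs a nightly backup",
    5: "cooking pasta at home",
}
relevant = {1, 2, 3}  # hand-labeled for this example

exact = search("running", docs)                          # literal match only
stemmed = search("running", docs, normalize=crude_stem)  # match on stems

exact_scores = precision_recall(exact, relevant)
stemmed_scores = precision_recall(stemmed, relevant)
print("exact:  ", sorted(exact), exact_scores)
print("stemmed:", sorted(stemmed), stemmed_scores)
```

In this sketch the literal search returns only document 1, so precision is perfect but recall is one in three. Matching on stems finds all three relevant documents, but it also pulls in document 4, where "runs" has an unrelated meaning, so recall rises to 1.0 while precision falls to 0.75.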

See also: Precision and recall
