
ireval::RetrievalEvaluator Class Reference


Public Member Functions

 RetrievalEvaluator (String queryName, List< Document > retrieved, Collection< Judgment > judgments)
String queryName ()
double[] precisionAtFixedPoints ()
double[] interpolatedPrecision ()
double precision (int documentsRetrieved)
double recall (int documentsRetrieved)
double rPrecision ()
double reciprocalRank ()
double averagePrecision ()
double binaryPreference ()
double normalizedDiscountedCumulativeGain ()
double normalizedDiscountedCumulativeGain (int documentsRetrieved)
int relevantRetrieved (int documentsRetrieved)
ArrayList< Document > retrievedDocuments ()
ArrayList< Document > judgedIrrelevantRetrievedDocuments ()
ArrayList< Document > irrelevantRetrievedDocuments ()
ArrayList< Document > relevantRetrievedDocuments ()
ArrayList< Document > relevantDocuments ()
ArrayList< Document > relevantMissedDocuments ()

Static Public Member Functions

int[] getFixedPoints ()

Protected Member Functions

double normalizationTermNDCG (int documentsRetrieved)

Private Member Functions

void _buildJudgments (Collection< Judgment > judgments)
void _judgeRetrievedDocuments ()
void _findMissedDocuments ()
void _findRelevantDocuments ()

Private Attributes

String _queryName
ArrayList< Document > _retrieved
ArrayList< Document > _judgedMissed
ArrayList< Document > _relevant
ArrayList< Document > _relevantRetrieved
ArrayList< Document > _judgedIrrelevantRetrieved
ArrayList< Document > _irrelevantRetrieved
ArrayList< Document > _relevantMissed
HashMap< String, Judgment > _judgments
int _numIrrelevant
double[] _pFP = null
double[] _ip = null

Static Private Attributes

int[] fixedPoints = { 5, 10, 15, 20, 30, 100, 200, 500, 1000 }

Detailed Description

A retrieval evaluator object computes a variety of standard information retrieval metrics commonly used in TREC, including binary preference (BPREF), geometric mean average precision (GMAP), mean average precision (MAP), and standard precision and recall. In addition, the object gives access to the relevant documents that were found, and the relevant documents that were missed.

BPREF is defined in Buckley and Voorhees, "Retrieval Evaluation with Incomplete Information", SIGIR 2004.

Author:
Trevor Strohman


Constructor & Destructor Documentation

ireval::RetrievalEvaluator::RetrievalEvaluator (String queryName, List< Document > retrieved, Collection< Judgment > judgments) [inline]
 

Creates a new instance of RetrievalEvaluator.

Parameters:
queryName The name of the query represented by this evaluator.
retrieved A ranked list of retrieved documents.
judgments A collection of relevance judgments.


Member Function Documentation

void ireval::RetrievalEvaluator::_buildJudgments (Collection< Judgment > judgments) [inline, private]


void ireval::RetrievalEvaluator::_findMissedDocuments () [inline, private]


void ireval::RetrievalEvaluator::_findRelevantDocuments () [inline, private]


void ireval::RetrievalEvaluator::_judgeRetrievedDocuments () [inline, private]
 

double ireval::RetrievalEvaluator::averagePrecision () [inline]
 

Returns the average precision of the query.

Suppose the precision is evaluated once at the rank of each relevant document in the retrieval. If a document is not retrieved, we assume that it was retrieved at rank infinity. The mean of all these precision values is the average precision.
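The definition above can be sketched as follows. This is a minimal illustration, not the ireval implementation: the class AveragePrecisionSketch and its parameters are hypothetical, and a boolean array stands in for the Document and Judgment objects. Relevant documents that were never retrieved contribute a precision of zero, which is what "rank infinity" amounts to.

```java
// Hypothetical helper, not part of ireval.
class AveragePrecisionSketch {
    /**
     * @param relevantAtRank relevantAtRank[i] is true if the document at rank i + 1 is relevant
     * @param totalRelevant  number of relevant documents for the query, retrieved or not
     */
    static double averagePrecision(boolean[] relevantAtRank, int totalRelevant) {
        if (totalRelevant == 0) {
            return 0.0; // assumption: no relevant documents means a score of zero
        }
        double sumOfPrecisions = 0.0;
        int relevantSoFar = 0;
        for (int i = 0; i < relevantAtRank.length; i++) {
            if (relevantAtRank[i]) {
                relevantSoFar++;
                // precision at the rank of this relevant document
                sumOfPrecisions += (double) relevantSoFar / (i + 1);
            }
        }
        // missed relevant documents add zero to the sum but still count in the mean
        return sumOfPrecisions / totalRelevant;
    }
}
```

For a ranking with relevant documents at ranks 1 and 3 and three relevant documents in total, the sum is 1/1 + 2/3 and the mean is taken over three documents.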

double ireval::RetrievalEvaluator::binaryPreference () [inline]
 

The binary preference measure, as presented in Buckley and Voorhees, "Retrieval Evaluation with Incomplete Information", SIGIR 2004. This implementation is the 'pure' version, which is the one used in Buckley's trec_eval (v 8 with bpref bugfix).

The formula is:

bpref = \frac{1}{R} \sum_{r} \left( 1 - \frac{|n \text{ ranked higher than } r|}{\min(R, N)} \right)

where R is the number of relevant documents for this topic, N is the number of irrelevant documents judged for this topic, r is a relevant retrieved document, and n is a member of the set of first R judged irrelevant documents retrieved.
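A minimal sketch of this formula, under stated assumptions: BprefSketch, its label constants, and its parameters are hypothetical and not part of ireval, and integer labels stand in for judged documents. Unjudged documents are skipped, and a topic with no judged irrelevant documents is assumed to incur no penalty; the actual implementation may treat these edge cases differently.

```java
// Hypothetical helper, not part of ireval.
class BprefSketch {
    static final int RELEVANT = 1;
    static final int JUDGED_IRRELEVANT = 0;
    static final int UNJUDGED = -1;

    /**
     * @param labels one label per retrieved document, in rank order
     * @param R      number of relevant documents for the topic
     * @param N      number of judged irrelevant documents for the topic
     */
    static double bpref(int[] labels, int R, int N) {
        if (R == 0) {
            return 0.0; // assumption: no relevant documents means a score of zero
        }
        int denominator = Math.min(R, N);
        int irrelevantSeen = 0; // judged irrelevant documents ranked above the current one
        double sum = 0.0;
        for (int label : labels) {
            if (label == JUDGED_IRRELEVANT) {
                if (irrelevantSeen < R) {
                    irrelevantSeen++; // only the first R judged irrelevant documents count
                }
            } else if (label == RELEVANT) {
                // assumption: with no judged irrelevant documents there is no penalty
                double penalty = denominator == 0 ? 0.0 : (double) irrelevantSeen / denominator;
                sum += 1.0 - penalty;
            }
            // UNJUDGED documents are ignored entirely
        }
        return sum / R;
    }
}
```

Relevant documents that were not retrieved never enter the sum, but they still count in R, so missing them lowers the score.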

int[] ireval::RetrievalEvaluator::getFixedPoints () [inline, static]


double[] ireval::RetrievalEvaluator::interpolatedPrecision () [inline]
 

ArrayList<Document> ireval::RetrievalEvaluator::irrelevantRetrievedDocuments () [inline]
 

This method returns a list of all documents that were retrieved but assumed to be irrelevant. This includes both documents that were judged to be irrelevant and those that were not judged at all. The list is returned in retrieval order.

ArrayList<Document> ireval::RetrievalEvaluator::judgedIrrelevantRetrievedDocuments () [inline]
 

Returns:
The list of all documents retrieved that were explicitly judged irrelevant.

double ireval::RetrievalEvaluator::normalizationTermNDCG (int documentsRetrieved) [inline, protected]


double ireval::RetrievalEvaluator::normalizedDiscountedCumulativeGain (int documentsRetrieved) [inline]
 

Normalized Discounted Cumulative Gain

This measure was introduced in Jarvelin, Kekalainen, "IR Evaluation Methods for Retrieving Highly Relevant Documents" SIGIR 2001. I copied the formula from Vassilvitskii, "Using Web-Graph Distance for Relevance Feedback in Web Search", SIGIR 2006.

Score = N \sum_{i} (2^{r(i)} - 1) / \log(1 + i)

where r(i) is the relevance value of the document at rank i, and N is chosen so that the score cannot be greater than 1. N is obtained by computing the unnormalized DCG of a perfect ranking.
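The computation can be sketched as below. This is an illustration under assumptions, not the ireval code: NdcgSketch and its parameters are hypothetical, integer relevance values stand in for Judgment objects, and the natural logarithm is used (the base cancels in the normalized score). The perfect ranking is taken to be all judged documents sorted by decreasing relevance.

```java
import java.util.Arrays;

// Hypothetical helper, not part of ireval.
class NdcgSketch {
    /** Unnormalized DCG over the first k entries of gains, where gains[i] is r(i + 1). */
    static double dcg(int[] gains, int k) {
        double sum = 0.0;
        int limit = Math.min(k, gains.length);
        for (int i = 0; i < limit; i++) {
            int rank = i + 1;
            sum += (Math.pow(2, gains[i]) - 1) / Math.log(1 + rank);
        }
        return sum;
    }

    /**
     * @param retrievedGains relevance value of each retrieved document, in rank order
     * @param judgedGains    relevance values of all judged documents for the query
     * @param k              evaluation rank
     */
    static double ndcg(int[] retrievedGains, int[] judgedGains, int k) {
        // Build the perfect ranking: judged relevance values in decreasing order.
        int[] ideal = judgedGains.clone();
        Arrays.sort(ideal);
        for (int i = 0, j = ideal.length - 1; i < j; i++, j--) {
            int tmp = ideal[i];
            ideal[i] = ideal[j];
            ideal[j] = tmp;
        }
        double idealDcg = dcg(ideal, k);
        // assumption: a query with no positive judgments scores zero
        return idealDcg == 0.0 ? 0.0 : dcg(retrievedGains, k) / idealDcg;
    }
}
```

Here the normalization term N corresponds to 1 / idealDcg, so a ranking that matches the perfect ordering scores exactly 1.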

double ireval::RetrievalEvaluator::normalizedDiscountedCumulativeGain () [inline]
 

Normalized Discounted Cumulative Gain

This measure was introduced in Jarvelin, Kekalainen, "IR Evaluation Methods for Retrieving Highly Relevant Documents" SIGIR 2001. I copied the formula from Vassilvitskii, "Using Web-Graph Distance for Relevance Feedback in Web Search", SIGIR 2006.

Score = N \sum_{i} (2^{r(i)} - 1) / \log(1 + i)

where r(i) is the relevance value of the document at rank i, and N is chosen so that the score cannot be greater than 1. N is obtained by computing the unnormalized DCG of a perfect ranking.

double ireval::RetrievalEvaluator::precision (int documentsRetrieved) [inline]
 

Returns the precision of the retrieval at a given number of documents retrieved. The precision is the number of relevant documents retrieved divided by the total number of documents retrieved.

Parameters:
documentsRetrieved The evaluation rank.

double[] ireval::RetrievalEvaluator::precisionAtFixedPoints () [inline]


String ireval::RetrievalEvaluator::queryName () [inline]
 

Returns the name of the query represented by this evaluator.

double ireval::RetrievalEvaluator::recall (int documentsRetrieved) [inline]
 

Returns the recall of the retrieval at a given number of documents retrieved. The recall is the number of relevant documents retrieved divided by the total number of relevant documents for the query.

Parameters:
documentsRetrieved The evaluation rank.
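Precision and recall at a rank differ only in their denominator, which the following sketch makes explicit. It is an illustration under assumptions: PrecisionRecallSketch and its parameters are hypothetical and not part of ireval, a boolean array stands in for the judged ranking, and ranks beyond the end of the ranking are assumed to hold non-relevant documents.

```java
// Hypothetical helper, not part of ireval.
class PrecisionRecallSketch {
    /** Counts the relevant documents among the first documentsRetrieved ranks. */
    static int relevantRetrieved(boolean[] relevantAtRank, int documentsRetrieved) {
        int limit = Math.min(documentsRetrieved, relevantAtRank.length);
        int count = 0;
        for (int i = 0; i < limit; i++) {
            if (relevantAtRank[i]) {
                count++;
            }
        }
        return count;
    }

    /** Relevant documents retrieved divided by the evaluation rank. */
    static double precision(boolean[] relevantAtRank, int documentsRetrieved) {
        if (documentsRetrieved == 0) {
            return 0.0; // assumption: precision at rank zero is zero
        }
        return (double) relevantRetrieved(relevantAtRank, documentsRetrieved) / documentsRetrieved;
    }

    /** Relevant documents retrieved divided by all relevant documents for the query. */
    static double recall(boolean[] relevantAtRank, int documentsRetrieved, int totalRelevant) {
        if (totalRelevant == 0) {
            return 0.0; // assumption: no relevant documents means zero recall
        }
        return (double) relevantRetrieved(relevantAtRank, documentsRetrieved) / totalRelevant;
    }
}
```

With this formulation the count of relevant documents at rank n equals n * precision(n), up to floating-point rounding.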

double ireval::RetrievalEvaluator::reciprocalRank () [inline]
 

Returns the reciprocal of the rank of the first relevant document retrieved, or zero if no relevant documents were retrieved.
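As a small illustration of this definition (ReciprocalRankSketch is a hypothetical helper, not part of ireval, and a boolean array stands in for the judged ranking):

```java
// Hypothetical helper, not part of ireval.
class ReciprocalRankSketch {
    /** Returns 1 / rank of the first relevant document, or zero if none was retrieved. */
    static double reciprocalRank(boolean[] relevantAtRank) {
        for (int i = 0; i < relevantAtRank.length; i++) {
            if (relevantAtRank[i]) {
                return 1.0 / (i + 1); // ranks are 1-based
            }
        }
        return 0.0;
    }
}
```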

ArrayList<Document> ireval::RetrievalEvaluator::relevantDocuments () [inline]
 

Returns a list of all documents judged relevant, whether they were retrieved or not. Documents are listed in the order they were retrieved, with those not retrieved coming last.

ArrayList<Document> ireval::RetrievalEvaluator::relevantMissedDocuments () [inline]
 

Returns a list of documents that were judged relevant that were not retrieved.

int ireval::RetrievalEvaluator::relevantRetrieved (int documentsRetrieved) [inline]
 

The number of relevant documents retrieved at a particular rank. This is equivalent to n * precision(n).

ArrayList<Document> ireval::RetrievalEvaluator::relevantRetrievedDocuments () [inline]
 

Returns a list of retrieved documents that were judged relevant, in the order that they were retrieved.

ArrayList<Document> ireval::RetrievalEvaluator::retrievedDocuments () [inline]
 

Returns:
The list of retrieved documents.

double ireval::RetrievalEvaluator::rPrecision () [inline]
 

Returns the precision at rank R, where R is the total number of relevant documents for the query, whether they were retrieved or not. This method is equivalent to precision( relevantDocuments().size() ). If R is greater than the number of documents retrieved, the non-retrieved documents are assumed to be non-relevant (cf. trec_eval 8).
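R-precision can be sketched as precision evaluated at rank R, where R is the number of relevant documents for the query. RPrecisionSketch and its parameters are hypothetical and not part of ireval; ranks past the end of the ranking are counted as non-relevant, following the trec_eval convention mentioned above.

```java
// Hypothetical helper, not part of ireval.
class RPrecisionSketch {
    /**
     * @param relevantAtRank relevantAtRank[i] is true if the document at rank i + 1 is relevant
     * @param totalRelevant  R, the number of relevant documents for the query
     */
    static double rPrecision(boolean[] relevantAtRank, int totalRelevant) {
        if (totalRelevant == 0) {
            return 0.0; // assumption: no relevant documents means a score of zero
        }
        int limit = Math.min(totalRelevant, relevantAtRank.length);
        int relevantFound = 0;
        for (int i = 0; i < limit; i++) {
            if (relevantAtRank[i]) {
                relevantFound++;
            }
        }
        // Dividing by R rather than by the ranking length treats missing ranks as non-relevant.
        return (double) relevantFound / totalRelevant;
    }
}
```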


Member Data Documentation

double [] ireval::RetrievalEvaluator::_ip = null [private]
 

ArrayList<Document> ireval::RetrievalEvaluator::_irrelevantRetrieved [private]
 

ArrayList<Document> ireval::RetrievalEvaluator::_judgedIrrelevantRetrieved [private]
 

ArrayList<Document> ireval::RetrievalEvaluator::_judgedMissed [private]
 

HashMap<String, Judgment> ireval::RetrievalEvaluator::_judgments [private]
 

int ireval::RetrievalEvaluator::_numIrrelevant [private]
 

double [] ireval::RetrievalEvaluator::_pFP = null [private]
 

String ireval::RetrievalEvaluator::_queryName [private]
 

ArrayList<Document> ireval::RetrievalEvaluator::_relevant [private]
 

ArrayList<Document> ireval::RetrievalEvaluator::_relevantMissed [private]
 

ArrayList<Document> ireval::RetrievalEvaluator::_relevantRetrieved [private]
 

ArrayList<Document> ireval::RetrievalEvaluator::_retrieved [private]
 

int [] ireval::RetrievalEvaluator::fixedPoints = { 5, 10, 15, 20, 30, 100, 200, 500, 1000 } [static, private]
 


The documentation for this class was generated from the following file:
Generated on Tue Jun 15 11:03:04 2010 for Lemur by doxygen 1.3.4