
lemur::api::TextQueryRetMethod Class Reference

Abstract interface for a retrieval method/model for text queries.

#include <TextQueryRetMethod.hpp>

Inheritance diagram for lemur::api::TextQueryRetMethod:

Inherits lemur::api::RetrievalMethod.
Inherited by lemur::retrieval::CORIRetMethod, lemur::retrieval::CosSimRetMethod, lemur::retrieval::OkapiRetMethod, lemur::retrieval::SimpleKLRetMethod, and lemur::retrieval::TFIDFRetMethod.

Public Member Functions

 TextQueryRetMethod (const Index &ind, ScoreAccumulator &accumulator)
virtual ~TextQueryRetMethod ()
virtual TextQueryRep * computeTextQueryRep (const TermQuery &qry)=0
 compute the query representation for a text query (caller responsible for deleting the memory of the generated new instance)

virtual TextQueryRep * computeTextQueryRep (DOCID_T docid)
 compute a query rep for an existing doc

virtual QueryRep * computeQueryRep (const Query &qry)
 overriding abstract class method

virtual double scoreDoc (const QueryRep &qry, DOCID_T docID)
 overriding abstract class method

virtual void scoreCollection (const QueryRep &qry, IndexedRealVector &results)
 overriding abstract class method with a general efficient inverted index scoring procedure

virtual void scoreCollection (DOCID_T docid, IndexedRealVector &results)
 add support for scoring an existing document against the collection

virtual DocumentRep * computeDocRep (DOCID_T docID)=0
 compute the doc representation (caller responsible for deleting the memory of the generated new instance)

virtual ScoreFunction * scoreFunc ()=0
 return the scoring function pointer

virtual void updateQuery (QueryRep &qryRep, const DocIDSet &relDocs)
 update the query

virtual void updateTextQuery (TextQueryRep &qryRep, const DocIDSet &relDocs)=0
 Modify/update the query representation based on a set of (presumably) relevant documents.

virtual void scoreInvertedIndex (const QueryRep &qryRep, IndexedRealVector &scores, bool scoreAll=false)
 Efficient scoring with the inverted index.

virtual double scoreDocVector (const TextQueryRep &qry, DOCID_T docID, lemur::utility::FreqVector &docVector)
virtual double scoreDocPassages (const TermQuery &qRep, DOCID_T docID, lemur::retrieval::PassageScoreVector &scores, int psgSize, int overlap)
 Score a query for each passage of a document.


Protected Attributes

ScoreAccumulator & scAcc
DocumentRep ** docReps
 cache document reps.

bool cacheDocReps
 whether or not to cache document representations

int docRepsSize
 number of documents plus 1, the size of the docReps array.


Detailed Description

Abstract interface for a retrieval method/model for text queries.

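A text query retrieval method is determined by specifying the following elements, which correspond to the pure virtual members of this class:

A method to compute the query representation (computeTextQueryRep)
A method to compute the document representation (computeDocRep)
The scoring function (scoreFunc)
A method to update the query representation based on a set of (presumably) relevant documents (updateTextQuery)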

Given a query q = (q1,q2,...,qN) and a document d = (d1,d2,...,dN), where q1,...,qN and d1,...,dN are terms, TextQueryRetMethod assumes the following general scoring function:

s(q,d) = g(w(q1,d1,q,d) + ... + w(qN,dN,q,d),q,d)

That is, the score of a document d against a query q is a function g of the accumulated weight w for each matched term.

The score is thus determined by two functions g and w; both may depend on the whole query or document. The function w gives the weight of each matched term, while the function g makes it possible to perform any further transformation of the sum of the weight of all matched terms based on the "summary" information of a query or a document (e.g., document length).

TextQueryRep, DocumentRep, and ScoreFunction are designed to support this general scoring function in the following way:

A ScoreFunction is responsible for defining the two functions g and w. A TextQueryRep provides any information required for scoring from the query side (e.g., query term frequency). Similarly, a DocumentRep provides any information required for scoring from the document side. Furthermore, a TextQueryRep supports iteration over all query terms, allowing easy accumulation of weights over matched terms. The weighting function w and score adjustment function g typically assume and depend on some particular information and representation of the query and document, so a specific ScoreFunction (for a specific retrieval method) only works for some specific TextQueryRep and DocumentRep that are appropriate for the specific retrieval method.
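The division of labor described above can be sketched in a few lines. This is a minimal illustration only, not the Lemur implementation: `TermCounts`, `SketchScoreFunction`, `termWeight`, `adjust`, and `score` are hypothetical names, and both the weight (query count times document count) and the adjustment (division by document length) are arbitrary choices made to keep the example small.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Hypothetical stand-in (not a Lemur class): a bag of term counts.
using TermCounts = std::map<std::string, int>;

// Hypothetical score function that supplies both w and g.
struct SketchScoreFunction {
    // w: weight of one matched term. Assumed here: query count times document count.
    double termWeight(int queryCount, int docCount) const {
        return static_cast<double>(queryCount) * docCount;
    }
    // g: transform the accumulated sum using summary information.
    // Assumed here: normalize by document length.
    double adjust(double sum, int docLength) const {
        assert(docLength > 0);
        return sum / docLength;
    }
};

// s(q,d) = g(sum of w over the terms that occur in both q and d)
double score(const TermCounts& query, const TermCounts& doc,
             const SketchScoreFunction& f) {
    int docLength = 0;
    for (const auto& entry : doc) docLength += entry.second;
    if (docLength == 0) return 0.0;  // an empty document matches nothing

    double sum = 0.0;
    for (const auto& entry : query) {            // iterate over the query terms
        const auto it = doc.find(entry.first);
        if (it != doc.end())                     // only matched terms contribute
            sum += f.termWeight(entry.second, it->second);
    }
    return f.adjust(sum, docLength);
}
```

Swapping in a different retrieval model would, under this sketch, mean replacing only `termWeight` and `adjust`; the accumulation loop stays the same.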


Constructor & Destructor Documentation

lemur::api::TextQueryRetMethod::TextQueryRetMethod (const Index & ind, ScoreAccumulator & accumulator)
 

Create the retrieval method. If cacheDocReps is true, allocate DocumentRep cache array.

virtual lemur::api::TextQueryRetMethod::~TextQueryRetMethod () [inline, virtual]
 

Destroy the object. If cacheDocReps is true, delete the DocumentRep cache array.
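The cache lifecycle (allocate on construction, release on destruction) might look roughly like the following sketch. It is not the Lemur code: `SketchDocRep` and `SketchDocRepCache` are hypothetical names, and the sketch assumes that document IDs run from 1 to the number of documents, which would explain an array of size "number of documents plus 1" with slot 0 unused. It also uses `std::unique_ptr` rather than a raw `DocumentRep **` so that cleanup is automatic.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a document representation.
struct SketchDocRep {
    int docID;
};

class SketchDocRepCache {
public:
    // One slot per document plus slot 0, which stays unused
    // (assumption: document IDs run from 1 to docCount).
    explicit SketchDocRepCache(int docCount)
        : reps_(static_cast<std::size_t>(docCount) + 1) {}

    // Return the cached representation, building it on the first request.
    const SketchDocRep& get(int docID) {
        assert(docID >= 1 && static_cast<std::size_t>(docID) < reps_.size());
        std::unique_ptr<SketchDocRep>& slot = reps_[static_cast<std::size_t>(docID)];
        if (!slot) {
            slot.reset(new SketchDocRep{docID});
            ++builds_;
        }
        return *slot;
    }

    std::size_t size() const { return reps_.size(); }
    int builds() const { return builds_; }  // how many representations were built

private:
    // Every cached representation is released when the cache is destroyed.
    std::vector<std::unique_ptr<SketchDocRep>> reps_;
    int builds_ = 0;
};
```

Requesting the same document twice builds its representation only once, which is the point of caching when a collection is scored against many queries.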


Member Function Documentation

virtual DocumentRep* lemur::api::TextQueryRetMethod::computeDocRep (DOCID_T docID) [pure virtual]
 

compute the doc representation (caller responsible for deleting the memory of the generated new instance)
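Because the returned instance belongs to the caller, one way to honor that contract is to hand the raw pointer to a smart pointer immediately. The sketch below illustrates the idea with hypothetical stand-ins (`SketchRep`, `computeSketchRep`, `liveWhileScoring`); it does not call any Lemur function, and the instance counter exists only to make the lifetime visible.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a representation; counts live instances.
struct SketchRep {
    static int live;
    SketchRep() { ++live; }
    ~SketchRep() { --live; }
};
int SketchRep::live = 0;

// Hypothetical factory with the same ownership contract as described above:
// it returns a new instance that the caller must delete.
SketchRep* computeSketchRep() { return new SketchRep(); }

// Taking ownership right away means the instance is deleted on every exit
// path, including early returns and exceptions.
int liveWhileScoring() {
    std::unique_ptr<SketchRep> rep(computeSketchRep());
    assert(rep != nullptr);
    return SketchRep::live;  // the instance is still alive at this point
}                            // rep leaves scope here and deletes the instance
```

The same pattern would apply to any method on this page documented as "caller responsible for deleting the memory of the generated new instance".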

Implemented in lemur::retrieval::CORIRetMethod, lemur::retrieval::CosSimRetMethod, lemur::retrieval::OkapiRetMethod, lemur::retrieval::SimpleKLRetMethod, and lemur::retrieval::TFIDFRetMethod.

QueryRep * lemur::api::TextQueryRetMethod::computeQueryRep (const Query & qry) [inline, virtual]
 

overriding abstract class method

Implements lemur::api::RetrievalMethod.

virtual TextQueryRep* lemur::api::TextQueryRetMethod::computeTextQueryRep (DOCID_T docid) [inline, virtual]
 

compute a query rep for an existing doc

Reimplemented in lemur::retrieval::CosSimRetMethod.

virtual TextQueryRep* lemur::api::TextQueryRetMethod::computeTextQueryRep (const TermQuery & qry) [pure virtual]
 

compute the query representation for a text query (caller responsible for deleting the memory of the generated new instance)

void lemur::api::TextQueryRetMethod::scoreCollection (DOCID_T docid, IndexedRealVector & results) [virtual]
 

add support for scoring an existing document against the collection

void lemur::api::TextQueryRetMethod::scoreCollection (const QueryRep & qry, IndexedRealVector & results) [virtual]
 

overriding abstract class method with a general efficient inverted index scoring procedure

Reimplemented from lemur::api::RetrievalMethod.

double lemur::api::TextQueryRetMethod::scoreDoc (const QueryRep & qry, DOCID_T docID) [virtual]
 

overriding abstract class method

Implements lemur::api::RetrievalMethod.

double lemur::api::TextQueryRetMethod::scoreDocPassages (const TermQuery & qRep, DOCID_T docID, lemur::retrieval::PassageScoreVector & scores, int psgSize, int overlap) [virtual]
 

Score a query for each passage of a document.

Parameters:
qRep the TermQuery to score.
docID the document to score.
scores accumulator for the passage scores, in passage order.
psgSize the number of tokens for sliding window.
overlap the number of tokens to overlap in each passage.
Returns:
the maximum score over the passages.
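How `psgSize` and `overlap` interact can be illustrated with a small sliding-window sketch. It rests on assumptions not stated on this page: consecutive passages start `psgSize - overlap` tokens apart, the final passage may be shorter than `psgSize`, and a passage's score is simply its number of query-term occurrences. `countMatches` and `scorePassages` are hypothetical helpers, not Lemur functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical passage scorer: number of tokens in [begin, end) that are query terms.
double countMatches(const std::vector<std::string>& tokens, std::size_t begin,
                    std::size_t end, const std::set<std::string>& queryTerms) {
    double matches = 0.0;
    for (std::size_t i = begin; i < end; ++i)
        if (queryTerms.count(tokens[i]) > 0) matches += 1.0;
    return matches;
}

// Append one score per passage to `scores`, in passage order,
// and return the maximum score over the passages.
double scorePassages(const std::vector<std::string>& tokens,
                     const std::set<std::string>& queryTerms,
                     std::vector<double>& scores, int psgSize, int overlap) {
    assert(psgSize > 0 && overlap >= 0 && overlap < psgSize);
    const std::size_t size = static_cast<std::size_t>(psgSize);
    const std::size_t step = static_cast<std::size_t>(psgSize - overlap);

    double best = 0.0;
    for (std::size_t begin = 0; begin < tokens.size(); begin += step) {
        const std::size_t end = std::min(begin + size, tokens.size());
        const double s = countMatches(tokens, begin, end, queryTerms);
        scores.push_back(s);
        best = std::max(best, s);
        if (end == tokens.size()) break;  // this window reached the end of the document
    }
    return best;
}
```

With six tokens, `psgSize = 4`, and `overlap = 2`, this sketch produces two passages covering tokens 0 to 3 and 2 to 5.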

double lemur::api::TextQueryRetMethod::scoreDocVector (const TextQueryRep & qry, DOCID_T docID, lemur::utility::FreqVector & docVector) [virtual]
 

virtual ScoreFunction* lemur::api::TextQueryRetMethod::scoreFunc () [pure virtual]
 

return the scoring function pointer

Implemented in lemur::retrieval::CORIRetMethod, lemur::retrieval::CosSimRetMethod, lemur::retrieval::OkapiRetMethod, lemur::retrieval::SimpleKLRetMethod, and lemur::retrieval::TFIDFRetMethod.

void lemur::api::TextQueryRetMethod::scoreInvertedIndex (const QueryRep & qryRep, IndexedRealVector & scores, bool scoreAll = false) [virtual]
 

Efficient scoring with the inverted index.

A general scoring procedure shared by many different models (assuming "sortedScores" has memory allocated).
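A common way to score with an inverted index is term-at-a-time accumulation: walk the posting list of each query term and add that term's weight to a per-document accumulator. The sketch below shows that general technique under stated assumptions; it is not the Lemur procedure. `Posting`, `InvertedIndex`, and `scoreWithInvertedIndex` are hypothetical, document IDs are assumed to run from 1 to `docCount`, the weight is query count times document count, no final adjustment is applied, and `scoreAll` is assumed to mean "also report documents that matched no query term".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical posting: a document and the term's count in that document.
struct Posting {
    int docID;
    int count;
};
using InvertedIndex = std::map<std::string, std::vector<Posting>>;

// Term-at-a-time scoring: returns (docID, score) pairs in document ID order.
std::vector<std::pair<int, double>> scoreWithInvertedIndex(
        const InvertedIndex& index, const std::map<std::string, int>& query,
        int docCount, bool scoreAll = false) {
    assert(docCount >= 0);
    const std::size_t slots = static_cast<std::size_t>(docCount) + 1;
    std::vector<double> accumulator(slots, 0.0);  // slot 0 unused
    std::vector<bool> matched(slots, false);

    for (const auto& queryTerm : query) {
        const auto it = index.find(queryTerm.first);
        if (it == index.end()) continue;          // term does not occur in the collection
        for (const Posting& p : it->second) {
            assert(p.docID >= 1 && p.docID <= docCount);
            const std::size_t d = static_cast<std::size_t>(p.docID);
            accumulator[d] += static_cast<double>(queryTerm.second) * p.count;
            matched[d] = true;
        }
    }

    std::vector<std::pair<int, double>> results;
    for (int doc = 1; doc <= docCount; ++doc) {
        const std::size_t d = static_cast<std::size_t>(doc);
        if (scoreAll || matched[d]) results.emplace_back(doc, accumulator[d]);
    }
    return results;
}
```

Only documents that appear in some query term's posting list are touched during accumulation, which is why this style of scoring is typically cheaper than scoring every document individually.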

virtual void lemur::api::TextQueryRetMethod::updateQuery (QueryRep & qryRep, const DocIDSet & relDocs) [inline, virtual]
 

update the query

Implements lemur::api::RetrievalMethod.

virtual void lemur::api::TextQueryRetMethod::updateTextQuery (TextQueryRep & qryRep, const DocIDSet & relDocs) [pure virtual]
 

Modify/update the query representation based on a set of (presumably) relevant documents.


Member Data Documentation

bool lemur::api::TextQueryRetMethod::cacheDocReps [protected]
 

whether or not to cache document representations

DocumentRep** lemur::api::TextQueryRetMethod::docReps [protected]
 

cache document reps.

int lemur::api::TextQueryRetMethod::docRepsSize [protected]
 

number of documents plus 1, the size of the docReps array.

ScoreAccumulator& lemur::api::TextQueryRetMethod::scAcc [protected]
 


The documentation for this class was generated from the following files:
Generated on Tue Jun 15 11:03:05 2010 for Lemur by doxygen 1.3.4