lemur::api::TextQueryRetMethod Class Reference

Abstract interface for a retrieval method/model for text queries.

#include <TextQueryRetMethod.hpp>



Public Member Functions
	TextQueryRetMethod (const Index &ind, ScoreAccumulator &accumulator)
virtual	~TextQueryRetMethod ()
virtual TextQueryRep *	computeTextQueryRep (const TermQuery &qry)=0
	compute the query representation for a text query (caller responsible for deleting the memory of the generated new instance)
virtual TextQueryRep *	computeTextQueryRep (DOCID_T docid)
	compute a query rep for an existing doc
virtual QueryRep *	computeQueryRep (const Query &qry)
	overriding abstract class method
virtual double	scoreDoc (const QueryRep &qry, DOCID_T docID)
	overriding abstract class method
virtual void	scoreCollection (const QueryRep &qry, IndexedRealVector &results)
	overriding abstract class method with a general efficient inverted index scoring procedure
virtual void	scoreCollection (DOCID_T docid, IndexedRealVector &results)
	add support for scoring an existing document against the collection
virtual DocumentRep *	computeDocRep (DOCID_T docID)=0
	compute the doc representation (caller responsible for deleting the memory of the generated new instance)
virtual ScoreFunction *	scoreFunc ()=0
	return the scoring function pointer
virtual void	updateQuery (QueryRep &qryRep, const DocIDSet &relDocs)
	update the query
virtual void	updateTextQuery (TextQueryRep &qryRep, const DocIDSet &relDocs)=0
	Modify/update the query representation based on a set of (presumably) relevant documents.
virtual void	scoreInvertedIndex (const QueryRep &qryRep, IndexedRealVector &scores, bool scoreAll=false)
	Efficient scoring with the inverted index.
virtual double	scoreDocVector (const TextQueryRep &qry, DOCID_T docID, lemur::utility::FreqVector &docVector)
virtual double	scoreDocPassages (const TermQuery &qRep, DOCID_T docID, lemur::retrieval::PassageScoreVector &scores, int psgSize, int overlap)
	Score a query for each passage of a document.
Protected Attributes
ScoreAccumulator &	scAcc
DocumentRep **	docReps
	cache document reps.
bool	cacheDocReps
	whether or not to cache document representations
int	docRepsSize
	number of documents plus 1, the size of the docReps array.

Detailed Description

Abstract interface for a retrieval method/model for text queries.

A text query retrieval method is determined by specifying the following elements:

A method to compute the query representation
A method to compute the doc representation
The scoring function
A method to update the query representation based on a set of (relevant) documents

Given a query q =(q₁,q₂,...,q_N) and a document d=(d₁,d₂,...,d_N), where q₁,...,q_N and d₁,...,d_N are terms, TextQueryRetMethod assumes the following general scoring function:


      s(q,d) = g(w(q₁,d₁,q,d) + ... + w(q_N,d_N,q,d),q,d)

That is, the score of a document d against a query q is a function g of the accumulated weight w for each matched term.

The score is thus determined by two functions, g and w; both may depend on the whole query or document. The function w gives the weight of each matched term, while the function g makes it possible to apply a further transformation to the sum of the weights of all matched terms, based on "summary" information about the query or the document (e.g., document length).

TextQueryRep, DocumentRep, and ScoreFunction are designed to support this general scoring function in the following way:

A ScoreFunction is responsible for defining the two functions g and w. A TextQueryRep provides any information required for scoring from the query side (e.g., query term frequency). Similarly, a DocumentRep provides any information required for scoring from the document side. Furthermore, a TextQueryRep supports iteration over all query terms, allowing easy accumulation of weights over matched terms. The weighting function w and the score adjustment function g typically depend on a particular representation of the query and document, so a given ScoreFunction works only with the TextQueryRep and DocumentRep designed for the same retrieval method.
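
The division of labor described above can be sketched with small stand-in types. This is a minimal illustration, not the Lemur classes: ToyQueryRep, ToyDocRep, ToyScoreFunction, toyScore, and the particular choices of w (a product of frequencies) and g (division by document length) are all assumptions made for the example.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for TextQueryRep: query-side information only.
struct ToyQueryRep {
    std::map<std::string, double> termFreq;  // query term -> query term frequency
};

// Hypothetical stand-in for DocumentRep: document-side information only.
struct ToyDocRep {
    std::map<std::string, double> termFreq;  // document term -> term frequency
    double length = 0.0;                     // "summary" information used by g
};

// Hypothetical stand-in for ScoreFunction: it defines both w and g.
struct ToyScoreFunction {
    // w: weight of one matched term (assumed here: product of the two frequencies).
    double termWeight(double qtf, double dtf) const { return qtf * dtf; }

    // g: transformation of the accumulated sum (assumed here: length normalization).
    double adjust(double sum, const ToyDocRep& d) const {
        return d.length > 0.0 ? sum / d.length : 0.0;
    }
};

// s(q,d) = g(w(...) + ... + w(...), q, d), summed over matched terms only.
double toyScore(const ToyQueryRep& q, const ToyDocRep& d, const ToyScoreFunction& f) {
    double sum = 0.0;
    for (const auto& qt : q.termFreq) {           // iterate over all query terms
        const auto it = d.termFreq.find(qt.first);
        if (it != d.termFreq.end()) {             // unmatched terms contribute nothing
            sum += f.termWeight(qt.second, it->second);
        }
    }
    return f.adjust(sum, d);
}
```

Swapping in a different retrieval model would mean replacing termWeight and adjust together with whatever the two representations store, which is why the three pieces are tied to one another.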

Constructor & Destructor Documentation

lemur::api::TextQueryRetMethod::TextQueryRetMethod ( const Index & ind,

ScoreAccumulator & accumulator

)

Create the retrieval method. If cacheDocReps is true, allocate the DocumentRep cache array.

virtual lemur::api::TextQueryRetMethod::~TextQueryRetMethod ( ) [inline, virtual]

Destroy the object. If cacheDocReps is true, delete the DocumentRep cache array.
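
The cache lifecycle can be sketched as follows. This is an illustration under stated assumptions, not the Lemur implementation: ToyRep and ToyRepCache are hypothetical names, document IDs are assumed to run from 1 to the document count (which would explain an array of size "number of documents plus 1"), and the sketch assumes the cache owns and deletes the cached representations as well as the array.

```cpp
// Hypothetical stand-in for DocumentRep; not the Lemur type.
struct ToyRep {
    int docID;
};

// Sketch of a cache indexed directly by document ID.
class ToyRepCache {
public:
    // docCount + 1 slots, so that slot docID exists for IDs 1..docCount
    // (slot 0 stays unused under the assumed 1-based numbering).
    explicit ToyRepCache(int docCount)
        : size_(docCount + 1), reps_(new ToyRep*[docCount + 1]), built_(0) {
        for (int i = 0; i < size_; ++i) reps_[i] = nullptr;
    }

    // Assumption: the cache owns its entries, so it deletes them and then the array.
    ~ToyRepCache() {
        for (int i = 0; i < size_; ++i) delete reps_[i];
        delete[] reps_;
    }

    ToyRepCache(const ToyRepCache&) = delete;
    ToyRepCache& operator=(const ToyRepCache&) = delete;

    // Build the representation on first use, then reuse the cached pointer.
    ToyRep* get(int docID) {
        if (docID < 0 || docID >= size_) return nullptr;
        if (reps_[docID] == nullptr) {
            reps_[docID] = new ToyRep{docID};
            ++built_;
        }
        return reps_[docID];
    }

    int size() const { return size_; }
    int built() const { return built_; }

private:
    int size_;       // number of slots (docCount + 1)
    ToyRep** reps_;  // array of cached representations, nullptr until built
    int built_;      // how many representations were actually constructed
};
```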

Member Function Documentation

virtual DocumentRep* lemur::api::TextQueryRetMethod::computeDocRep ( DOCID_T docID ) [pure virtual]

compute the doc representation (caller responsible for deleting the memory of the generated new instance)

Implemented in lemur::retrieval::CORIRetMethod, lemur::retrieval::CosSimRetMethod, lemur::retrieval::OkapiRetMethod, lemur::retrieval::SimpleKLRetMethod, and lemur::retrieval::TFIDFRetMethod.

QueryRep * lemur::api::TextQueryRetMethod::computeQueryRep ( const Query & qry ) [inline, virtual]

overriding abstract class method

Implements lemur::api::RetrievalMethod.

virtual TextQueryRep* lemur::api::TextQueryRetMethod::computeTextQueryRep ( DOCID_T docid ) [inline, virtual]

compute a query rep for an existing doc

Reimplemented in lemur::retrieval::CosSimRetMethod.

virtual TextQueryRep* lemur::api::TextQueryRetMethod::computeTextQueryRep ( const TermQuery & qry ) [pure virtual]

compute the query representation for a text query (caller responsible for deleting the memory of the generated new instance)

void lemur::api::TextQueryRetMethod::scoreCollection ( DOCID_T docid,

IndexedRealVector & results

) [virtual]

add support for scoring an existing document against the collection

void lemur::api::TextQueryRetMethod::scoreCollection ( const QueryRep & qry,

IndexedRealVector & results

) [virtual]

overriding abstract class method with a general efficient inverted index scoring procedure

Reimplemented from lemur::api::RetrievalMethod.

double lemur::api::TextQueryRetMethod::scoreDoc ( const QueryRep & qry,

DOCID_T docID

) [virtual]

overriding abstract class method

Implements lemur::api::RetrievalMethod.

double lemur::api::TextQueryRetMethod::scoreDocPassages ( const TermQuery & qRep,

DOCID_T docID,

lemur::retrieval::PassageScoreVector & scores,

int psgSize,

int overlap

) [virtual]

Score a query for each passage of a document.

Parameters:

qRep the TermQuery to score.

docID the document to score.

scores accumulator for the passage scores, in passage order.

psgSize the number of tokens for sliding window.

overlap the number of tokens to overlap in each passage.

Returns:
the maximum score over the passages.
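
The roles of psgSize and overlap can be illustrated with a small sliding-window sketch. It is not the Lemur implementation: toyScorePassages is a hypothetical name, the window is assumed to advance by psgSize - overlap tokens, and each passage is scored by simply counting occurrences of a single query term.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: slide a window of psgSize tokens over the document,
// advancing by (psgSize - overlap) tokens, store one score per passage in
// passage order, and return the maximum passage score.
double toyScorePassages(const std::vector<std::string>& doc,
                        const std::string& term,
                        int psgSize, int overlap,
                        std::vector<double>& scores) {
    scores.clear();
    if (doc.empty() || psgSize <= 0 || overlap < 0 || overlap >= psgSize) {
        return 0.0;  // nothing to score, or a window that would never advance
    }
    const std::size_t width = static_cast<std::size_t>(psgSize);
    const std::size_t step = static_cast<std::size_t>(psgSize - overlap);
    for (std::size_t start = 0; start < doc.size(); start += step) {
        const std::size_t end = std::min(doc.size(), start + width);
        const auto first = doc.begin() + static_cast<std::ptrdiff_t>(start);
        const auto last = doc.begin() + static_cast<std::ptrdiff_t>(end);
        // Toy passage score: number of occurrences of the term in the window.
        scores.push_back(static_cast<double>(std::count(first, last, term)));
        if (end == doc.size()) break;  // this window already reached the end
    }
    return *std::max_element(scores.begin(), scores.end());
}
```

With a six-token document, psgSize = 4 and overlap = 2, this sketch produces two passages covering tokens 0 to 3 and 2 to 5.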

double lemur::api::TextQueryRetMethod::scoreDocVector ( const TextQueryRep & qry,

DOCID_T docID,

lemur::utility::FreqVector & docVector

) [virtual]

virtual ScoreFunction* lemur::api::TextQueryRetMethod::scoreFunc ( ) [pure virtual]

return the scoring function pointer

Implemented in lemur::retrieval::CORIRetMethod, lemur::retrieval::CosSimRetMethod, lemur::retrieval::OkapiRetMethod, lemur::retrieval::SimpleKLRetMethod, and lemur::retrieval::TFIDFRetMethod.

void lemur::api::TextQueryRetMethod::scoreInvertedIndex ( const QueryRep & qryRep,

IndexedRealVector & scores,

bool scoreAll = false

) [virtual]

Efficient scoring with the inverted index.
A general scoring procedure shared by many different models (assumes that "sortedScores" has memory allocated).
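
A term-at-a-time pass over an inverted index, which is the usual shape of such a procedure, might look like the sketch below. Everything in it is illustrative rather than taken from Lemur: ToyPosting, ToyInvertedIndex, and toyScoreInvertedIndex are hypothetical names, the weight is a plain product of frequencies, document IDs are assumed to run from 1 to docCount, and scoreAll is assumed to mean "also report documents that matched no query term".

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One entry of a postings list: a document and the term's frequency in it.
struct ToyPosting {
    int docID;
    double tf;
};

// Hypothetical inverted index: term -> postings list.
using ToyInvertedIndex = std::map<std::string, std::vector<ToyPosting>>;

// Visit only the postings lists of the query terms and accumulate a weight
// per document. Returns (docID, score) pairs in increasing docID order.
std::vector<std::pair<int, double>> toyScoreInvertedIndex(
        const ToyInvertedIndex& index,
        const std::map<std::string, double>& query,  // term -> query term frequency
        int docCount,
        bool scoreAll = false) {
    const std::size_t slots = static_cast<std::size_t>(docCount) + 1;
    std::vector<double> acc(slots, 0.0);     // score accumulator, indexed by docID
    std::vector<bool> matched(slots, false); // which documents matched any term

    for (const auto& qt : query) {
        const auto it = index.find(qt.first);
        if (it == index.end()) continue;     // term does not occur in the collection
        for (const ToyPosting& p : it->second) {
            if (p.docID < 1 || p.docID > docCount) continue;
            const std::size_t slot = static_cast<std::size_t>(p.docID);
            acc[slot] += qt.second * p.tf;   // toy weight w = qtf * tf
            matched[slot] = true;
        }
    }

    std::vector<std::pair<int, double>> results;
    for (int id = 1; id <= docCount; ++id) {
        const std::size_t slot = static_cast<std::size_t>(id);
        if (matched[slot] || scoreAll) results.push_back(std::make_pair(id, acc[slot]));
    }
    return results;
}
```

The point of the approach is that documents sharing no term with the query are never touched during accumulation, so the cost scales with the lengths of the query terms' postings lists rather than with the size of the collection.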

virtual void lemur::api::TextQueryRetMethod::updateQuery ( QueryRep & qryRep,

const DocIDSet & relDocs

) [inline, virtual]

update the query

Implements lemur::api::RetrievalMethod.

virtual void lemur::api::TextQueryRetMethod::updateTextQuery ( TextQueryRep & qryRep,

const DocIDSet & relDocs

) [pure virtual]

Modify/update the query representation based on a set of (presumably) relevant documents.
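
How a concrete method updates the query is left to each subclass. As one possible illustration only, the sketch below applies a simple Rocchio-style positive feedback step, adding beta times the average term weight of the relevant documents to the query. The names ToyTermWeights and toyUpdateQuery, the beta parameter, and the choice of Rocchio-style feedback are all assumptions for the example and do not describe any particular Lemur retrieval method.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical representation shared by the toy query and toy documents.
using ToyTermWeights = std::map<std::string, double>;  // term -> weight

// Rocchio-style positive feedback (an assumed technique for illustration):
// query <- query + beta * (average of the relevant documents' term weights).
void toyUpdateQuery(ToyTermWeights& query,
                    const std::vector<ToyTermWeights>& relDocs,
                    double beta) {
    if (relDocs.empty()) return;  // no feedback documents: leave the query as is
    const double scale = beta / static_cast<double>(relDocs.size());
    for (const ToyTermWeights& doc : relDocs) {
        for (const auto& tw : doc) {
            // operator[] inserts the term with weight 0 if the query lacked it,
            // so feedback can both reweight existing terms and expand the query.
            query[tw.first] += scale * tw.second;
        }
    }
}
```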

Member Data Documentation

bool lemur::api::TextQueryRetMethod::cacheDocReps [protected]

whether or not to cache document representations

DocumentRep** lemur::api::TextQueryRetMethod::docReps [protected]

cache document reps.

int lemur::api::TextQueryRetMethod::docRepsSize [protected]

number of documents plus 1, the size of the docReps array.

ScoreAccumulator& lemur::api::TextQueryRetMethod::scAcc [protected]

Generated on Tue Jun 15 11:03:05 2010 for Lemur by Doxygen 1.3.4
