
lemur::retrieval::XLingRetMethod Class Reference

Cross lingual retrieval method.

#include <XLingRetMethod.hpp>

Inherits lemur::api::RetrievalMethod.

Public Member Functions

 XLingRetMethod (const lemur::api::Index &dbIndex, const lemur::api::Index &background, lemur::dictionary::PDict &dict, lemur::api::ScoreAccumulator &accumulator, double l, double b, bool cacheDR, string &sBM, string &tBM, const lemur::api::Stopper *stp=NULL, lemur::api::Stemmer *stm=NULL)
 Constructor.

virtual ~XLingRetMethod ()
 Clean up.

virtual lemur::api::DocumentRep * computeDocRep (lemur::api::DOCID_T docID)
 Create a document representation.

virtual double matchedTermWeight (lemur::api::TERMID_T id, double weight, const lemur::api::DocInfo *info, const lemur::api::DocumentRep *dRep) const
 Score a given term for a given document.

virtual double adjustedScore (double origScore, double pge) const
 Adjust the score for a given document.

virtual void scoreCollection (const lemur::api::QueryRep &qry, lemur::api::IndexedRealVector &results)
virtual void scoreInvertedIndex (const lemur::api::QueryRep &qryRep, lemur::api::IndexedRealVector &scores, bool scoreAll=false)
virtual lemur::api::QueryRep * computeQueryRep (const lemur::api::Query &qry)
virtual lemur::api::QueryRep * computeTargetKLRep (const lemur::api::QueryRep *qry)
virtual double scoreDoc (const lemur::api::QueryRep &qry, lemur::api::DOCID_T docID)
 Score a document identified by the id w.r.t. a query rep.

virtual void updateQuery (lemur::api::QueryRep &qryRep, const lemur::api::DocIDSet &relDocs)
 Update the query (no-op).


Protected Member Functions

virtual double scoreDocVector (const XLingQueryModel &qRep, lemur::api::DOCID_T docID, lemur::utility::FreqVector &docVector)

Protected Attributes

double lambda
double beta
double numSource
double numTarget
bool docBasedSourceSmooth
bool docBasedTargetSmooth
lemur::api::ScoreAccumulator & scAcc
lemur::dictionary::PDict & dictionary
lemur::api::Stemmer * stemmer
const lemur::api::Stopper * stopper
const lemur::api::Index & source
lemur::api::DocumentRep ** docReps
 cache document reps.

bool cacheDocReps
 whether or not to cache document representations

int docRepsSize
 number of documents plus 1, the size of the docReps array.

lemur::api::ScoreAccumulator * termScores

Detailed Description

Cross lingual retrieval method.

Translation dictionary based retrieval, scoring queries in the source language against documents in the target language using:
\[ P(Q_s \mid D_t) = \prod_{w \in Q_s} \left( \lambda \sum_{t \in D_t} P(t \mid D_t)\, P(w \mid t) + (1 - \lambda)\, P(w \mid G_s) \right) \]
where G_s is the background model for the source language.
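
The formula above can be illustrated with a minimal, self-contained sketch that evaluates it in log space over toy probability tables. The type aliases and the function name `logQueryLikelihood` are hypothetical and are not part of Lemur; the sketch also assumes that a missing translation or background entry contributes a probability of zero.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types for illustration; none of these names come from Lemur.
using TargetDocModel = std::map<std::string, double>;  // P(t|D_t)
using TranslationTable =
    std::map<std::pair<std::string, std::string>, double>;  // P(w|t), keyed (w, t)
using BackgroundModel = std::map<std::string, double>;  // P(w|G_s)

// Returns log P(Q_s|D_t): the sum over query terms of the log of the
// interpolated translation and background probabilities.
double logQueryLikelihood(const std::vector<std::string>& query,
                          const TargetDocModel& docModel,
                          const TranslationTable& translations,
                          const BackgroundModel& background,
                          double lambda) {
    assert(lambda >= 0.0 && lambda <= 1.0);
    double logScore = 0.0;
    for (const std::string& w : query) {
        // Sum over target terms t in the document: P(t|D_t) * P(w|t).
        double translated = 0.0;
        for (const auto& entry : docModel) {
            auto it = translations.find(std::make_pair(w, entry.first));
            if (it != translations.end()) {
                translated += entry.second * it->second;
            }
        }
        // Assumption: an unseen source term has background probability 0.
        auto bg = background.find(w);
        double pBackground = (bg != background.end()) ? bg->second : 0.0;
        // Interpolate with the source language background model.
        logScore += std::log(lambda * translated + (1.0 - lambda) * pBackground);
    }
    return logScore;
}
```

Working in log space turns the product over query terms into a sum, which avoids underflow for long queries and matches the log returned by adjustedScore.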


Constructor & Destructor Documentation

lemur::retrieval::XLingRetMethod::XLingRetMethod (const lemur::api::Index & dbIndex,
  const lemur::api::Index & background,
  lemur::dictionary::PDict & dict,
  lemur::api::ScoreAccumulator & accumulator,
  double l,
  double b,
  bool cacheDR,
  string & sBM,
  string & tBM,
  const lemur::api::Stopper * stp = NULL,
  lemur::api::Stemmer * stm = NULL)

Constructor.

Parameters:
dbIndex index for target language documents
background index for source language background model
dict PDict containing source->target translation probabilities
accumulator ScoreAccumulator for intermediate results
l lambda value to use for smoothing background model
b beta value to use for smoothing P(t|D)
cacheDR whether or not to cache document reps
sBM whether to use term frequency (tf/|V|) or term doc frequency (docCount(t)/Sum_w_in_V(docCount(w))) for the source language background model. Default is term frequency.
tBM whether to use term frequency (tf/|V|) or term doc frequency (docCount(t)/Sum_w_in_V(docCount(w))) for the target language background model. Default is term frequency.
stp source language Stopper to use when getting translations.
stm source language Stemmer to use when getting translations.
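
The two background model estimates named for sBM and tBM can be sketched as the same normalization applied to different count tables. This is only an illustration: the alias `Counts` and the function `relativeShare` are hypothetical, and the sketch reads |V| as the total of all term counts in the collection, which is an assumption rather than something the parameter description states.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical count table: term -> count (collection term frequency or
// number of documents containing the term, depending on the estimate).
using Counts = std::map<std::string, double>;

// Returns count(term) / sum of all counts in the table.
// With term frequencies this gives the tf-based estimate; with document
// counts it gives docCount(t) / Sum_w_in_V(docCount(w)).
double relativeShare(const Counts& counts, const std::string& term) {
    double total = 0.0;
    for (const auto& entry : counts) {
        total += entry.second;
    }
    assert(total > 0.0);
    auto it = counts.find(term);
    // Assumption: a term missing from the table has probability 0.
    return (it != counts.end()) ? it->second / total : 0.0;
}
```

Calling `relativeShare` on a term frequency table and on a document count table for the same term shows how the two choices can rank terms differently: a term that is frequent but concentrated in few documents scores high on the first and low on the second.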

lemur::retrieval::XLingRetMethod::~XLingRetMethod () [virtual]

Clean up.


Member Function Documentation

virtual double lemur::retrieval::XLingRetMethod::adjustedScore (double origScore,
  double pge) const [inline, virtual]

Adjust the score for a given document.

Parameters:
origScore the original score
pge the background probability to adjust by.
Returns:
log((lambda * origScore) + ((1 - lambda) * pge))
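
As a small illustration of the return value above, the following free function computes the same expression. The name `adjustedScoreSketch` is hypothetical, and lambda is passed as a parameter here only to keep the sketch self-contained; in the class it is a protected attribute.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical free-function version of the documented return value.
// origScore is the accumulated translation score for one query term,
// pge is the background probability of that term.
double adjustedScoreSketch(double origScore, double pge, double lambda) {
    assert(lambda >= 0.0 && lambda <= 1.0);
    return std::log((lambda * origScore) + ((1.0 - lambda) * pge));
}
```

With lambda equal to 1 the background probability is ignored, so a query term with no translation in the document would yield log(0); a lambda below 1 keeps the score finite whenever pge is positive.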

lemur::api::DocumentRep * lemur::retrieval::XLingRetMethod::computeDocRep (lemur::api::DOCID_T docID) [virtual]

Create a document representation.

Parameters:
docID the internal document id to create the representation for
Returns:
An instance of XLingDocRep

virtual lemur::api::QueryRep* lemur::retrieval::XLingRetMethod::computeQueryRep (const lemur::api::Query & qry) [inline, virtual]

virtual lemur::api::QueryRep* lemur::retrieval::XLingRetMethod::computeTargetKLRep (const lemur::api::QueryRep * qry) [virtual]

virtual double lemur::retrieval::XLingRetMethod::matchedTermWeight (lemur::api::TERMID_T id,
  double weight,
  const lemur::api::DocInfo * info,
  const lemur::api::DocumentRep * dRep) const [inline, virtual]

Score a given term for a given document.

Parameters:
id the term id
weight the weight for this term
info the DocInfo for this document
dRep the DocumentRep for this document
Returns:
P(t|D) * P(s|t)
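
The product returned above can be sketched with plain numbers. Both function names below are hypothetical. The smoothing step is also an assumption: the constructor documentation only says that beta smooths P(t|D), so the linear interpolation with a target language background probability shown here is one plausible reading, not a confirmed description of the implementation.

```cpp
#include <cassert>

// Assumed smoothing form (not confirmed by this page):
// P(t|D) = beta * tf/|D| + (1 - beta) * P(t|G_t).
double smoothedDocProb(double tf, double docLength,
                       double pBackgroundTarget, double beta) {
    assert(docLength > 0.0);
    assert(beta >= 0.0 && beta <= 1.0);
    return beta * (tf / docLength) + (1.0 - beta) * pBackgroundTarget;
}

// Contribution of one matched target term t to a source term s:
// P(t|D) * P(s|t).
double matchedTermWeightSketch(double pTargetGivenDoc,
                               double pSourceGivenTarget) {
    return pTargetGivenDoc * pSourceGivenTarget;
}
```

Summing this weight over every target term in the document that translates a given source term yields the inner sum of the scoring formula in the detailed description.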

virtual void lemur::retrieval::XLingRetMethod::scoreCollection (const lemur::api::QueryRep & qry,
  lemur::api::IndexedRealVector & results) [inline, virtual]

virtual double lemur::retrieval::XLingRetMethod::scoreDoc (const lemur::api::QueryRep & qry,
  lemur::api::DOCID_T docID) [virtual]

Score a document identified by the id w.r.t. a query rep.

double lemur::retrieval::XLingRetMethod::scoreDocVector (const XLingQueryModel & qRep,
  lemur::api::DOCID_T docID,
  lemur::utility::FreqVector & docVector) [protected, virtual]

virtual void lemur::retrieval::XLingRetMethod::scoreInvertedIndex (const lemur::api::QueryRep & qryRep,
  lemur::api::IndexedRealVector & scores,
  bool scoreAll = false) [virtual]

virtual void lemur::retrieval::XLingRetMethod::updateQuery (lemur::api::QueryRep & qryRep,
  const lemur::api::DocIDSet & relDocs) [inline, virtual]

Update the query (no-op).


Member Data Documentation

double lemur::retrieval::XLingRetMethod::beta [protected]
 

bool lemur::retrieval::XLingRetMethod::cacheDocReps [protected]
 

whether or not to cache document representations

lemur::dictionary::PDict& lemur::retrieval::XLingRetMethod::dictionary [protected]
 

bool lemur::retrieval::XLingRetMethod::docBasedSourceSmooth [protected]
 

bool lemur::retrieval::XLingRetMethod::docBasedTargetSmooth [protected]
 

lemur::api::DocumentRep** lemur::retrieval::XLingRetMethod::docReps [protected]
 

cache document reps.

int lemur::retrieval::XLingRetMethod::docRepsSize [protected]
 

number of documents plus 1, the size of the docReps array.
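
A cache sized "number of documents plus 1" can be sketched as an array indexed directly by document id. The sketch below is hypothetical throughout (`ToyDocRep` and `DocRepCache` are not Lemur types) and assumes that document ids run from 1 to the number of documents, so that slot 0 stays unused; the page itself does not state why the extra slot exists.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for lemur::api::DocumentRep.
struct ToyDocRep {
    int docID;
};

// Lazily filled cache of document representations, indexed by doc id.
class DocRepCache {
public:
    // Allocates numDocs + 1 empty slots; slot 0 is never used
    // (assumption: doc ids start at 1).
    explicit DocRepCache(int numDocs)
        : reps_(static_cast<std::size_t>(numDocs) + 1), built_(0) {}

    // Returns the cached representation, building it on first access.
    const ToyDocRep& get(int docID) {
        assert(docID >= 1 && static_cast<std::size_t>(docID) < reps_.size());
        std::unique_ptr<ToyDocRep>& slot = reps_[docID];
        if (!slot) {
            slot.reset(new ToyDocRep{docID});
            ++built_;
        }
        return *slot;
    }

    std::size_t size() const { return reps_.size(); }
    int built() const { return built_; }

private:
    std::vector<std::unique_ptr<ToyDocRep>> reps_;
    int built_;  // number of representations actually constructed
};
```

Indexing by id trades memory for constant-time lookup, which pays off when the same documents are scored against many queries; a flag such as cacheDocReps lets callers skip that memory cost.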

double lemur::retrieval::XLingRetMethod::lambda [protected]
 

double lemur::retrieval::XLingRetMethod::numSource [protected]
 

double lemur::retrieval::XLingRetMethod::numTarget [protected]
 

lemur::api::ScoreAccumulator& lemur::retrieval::XLingRetMethod::scAcc [protected]
 

const lemur::api::Index& lemur::retrieval::XLingRetMethod::source [protected]
 

lemur::api::Stemmer* lemur::retrieval::XLingRetMethod::stemmer [protected]
 

const lemur::api::Stopper* lemur::retrieval::XLingRetMethod::stopper [protected]
 

lemur::api::ScoreAccumulator* lemur::retrieval::XLingRetMethod::termScores [protected]
 


The documentation for this class was generated from the following files:
Generated on Tue Jun 15 11:03:07 2010 for Lemur by doxygen 1.3.4