Structured Query Evaluation

This application runs retrieval experiments to evaluate the performance of the structured query model using the InQuery retrieval method. StructQueryEval requires that its index parameter name a positional index (KeyfileIncIndex).

Feedback is implemented as a WSUM of the original query combined with terms selected from the feedback documents based on belief score. The expanded query has the form:


#wsum( (1 - a) <original query>
      a*w1  t1
      a*w2  t2
      ...
      a*wN  tN
      )

where a is the value of the parameter feedbackPosCoeff.
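
As a rough illustration of the form above, the following Python sketch assembles an expanded query string from an original query and a list of weighted feedback terms. The function name build_expanded_query, the sample weights, and the exact whitespace are assumptions made for illustration; they are not taken from the Lemur source.

```python
def build_expanded_query(original_query, weighted_terms, a):
    """Return a #wsum string mixing the original query with feedback terms.

    original_query: structured query text, e.g. "#sum( information retrieval )"
    weighted_terms: list of (term, weight) pairs chosen from the feedback docs
    a: mixing coefficient (the role played by feedbackPosCoeff)
    """
    # The original query keeps weight (1 - a).
    lines = ["#wsum( {:g} {}".format(1.0 - a, original_query)]
    # Each feedback term t_i gets weight a * w_i.
    for term, weight in weighted_terms:
        lines.append("      {:g} {}".format(a * weight, term))
    lines.append("      )")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical terms and weights, for illustration only.
    print(build_expanded_query("#sum( structured query )",
                               [("retrieval", 0.5), ("index", 0.25)],
                               a=0.5))
```

With a = 0 the feedback terms receive zero weight and only the original query contributes; larger values of a shift weight toward the feedback terms.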

Scoring is done either over a working set of documents (essentially re-ranking) or over the whole collection, as selected by the parameter "useWorkingSet". When "useWorkingSet" is true or has a non-zero integer value, scoring is restricted to the working set listed in the file named by "workingSetFile". That file should have three columns: the first is the query ID, the second is the document ID, and the last is a numerical value, which is ignored. By default, scoring is over the whole collection.
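
The three-column working set layout described above could be read along these lines. This is a minimal Python sketch, not Lemur's own reader: it assumes the columns are separated by whitespace, and the function name read_working_set is hypothetical.

```python
from collections import defaultdict


def read_working_set(lines):
    """Map each query ID to the list of document IDs to be re-ranked.

    lines: an iterable of text lines, e.g. an open file object.
    Assumes whitespace-separated columns: query ID, document ID, value.
    """
    working_set = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError("expected three columns, got: %r" % line)
        query_id, doc_id, _ignored_value = fields  # third column is ignored
        working_set[query_id].append(doc_id)
    return dict(working_set)


if __name__ == "__main__":
    sample = ["q1 doc-a 0.9", "q1 doc-b 0.4", "q2 doc-c 1.0"]
    print(read_working_set(sample))
```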

The parameters are:

index: the complete name of the table-of-contents file for the database index. This must be a positional index (currently KeyfileIncIndex).
textQuery: the query text stream parsed by ParseInQuery
resultFile: the result file
resultFormat: the format of the result file, either TREC format (i.e., six-column) or a simple three-column format <queryID, docID, score>. String value: trec for TREC format or 3col for three-column format. The integer values used in previous versions of Lemur (zero for non-TREC format, non-zero for TREC format) are also accepted. Default: TREC format.
resultCount: the number of documents to return as result for each query
defaultBelief: the default belief for a document. Default: 0.4.
feedbackDocCount: the number of documents to use for pseudo-feedback (0 means no feedback)
feedbackTermCount: the number of terms to add to a query when doing feedback.
feedbackPosCoeff: the coefficient for positive terms in the expanded query.
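
To make the two resultFormat options concrete, the Python sketch below writes one query's ranked results in either layout. It assumes the conventional TREC six-column layout (query ID, the literal Q0, document ID, rank, score, run ID); the run ID "Exp", the function name format_results, and the number formatting are illustrative assumptions rather than details confirmed by this page.

```python
def format_results(query_id, scored_docs, result_format="trec", run_id="Exp"):
    """Return result lines for one query.

    scored_docs: list of (doc_id, score) pairs, already sorted best first.
    result_format: "trec" for six columns, "3col" for <queryID, docID, score>.
    """
    rows = []
    for rank, (doc_id, score) in enumerate(scored_docs, start=1):
        if result_format == "trec":
            # Assumed layout: queryID Q0 docID rank score runID
            rows.append("{} Q0 {} {} {} {}".format(
                query_id, doc_id, rank, score, run_id))
        elif result_format == "3col":
            rows.append("{} {} {}".format(query_id, doc_id, score))
        else:
            raise ValueError("unknown result format: %r" % result_format)
    return rows


if __name__ == "__main__":
    ranked = [("doc-a", 2.5), ("doc-b", 1.0)]
    print("\n".join(format_results("q1", ranked, "trec")))
    print("\n".join(format_results("q1", ranked, "3col")))
```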

Generated on Tue Jun 15 11:02:58 2010 for Lemur by Doxygen 1.3.4