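memory -- Specified as <memory>100M</memory> in the parameter file and as -memory=100M on the command line.
corpus.path -- Specified as <corpus><path>/path/to/file_or_directory</path></corpus> in the parameter file and as -corpus.path=/path/to/file_or_directory on the command line.
corpus.class -- Specified as <corpus><class>trecweb</class></corpus> in the parameter file and as -corpus.class=trecweb on the command line. trecweb is one of the known classes.
corpus.annotations -- Specified as <corpus><annotations>/path/to/file</annotations></corpus> in the parameter file and as -corpus.annotations=/path/to/file on the command line.
corpus.metadata -- Specified as <corpus><metadata>/path/to/file</metadata></corpus> in the parameter file and as -corpus.metadata=/path/to/file on the command line.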
Combining the first two of these elements, the parameter file would contain:
<corpus>
  <path>/path/to/file_or_directory</path>
  <class>trecweb</class>
</corpus>
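The correspondence between nested parameter-file elements and dotted command-line options can be sketched in a few lines of Python. This is only an illustration of the naming pattern used throughout this page: the <parameters> wrapper element and the helper name to_flags are assumptions made for the example and are not taken from the text above.

```python
import xml.etree.ElementTree as ET


def to_flags(element, prefix=""):
    """Flatten nested elements into -dotted.name=value strings."""
    flags = []
    for child in element:
        name = f"{prefix}.{child.tag}" if prefix else child.tag
        if len(child):
            # The element has nested elements, so descend into it.
            flags.extend(to_flags(child, name))
        else:
            # Leaf element: its text becomes the option value.
            flags.append(f"-{name}={(child.text or '').strip()}")
    return flags


# Assumed wrapper element; only the <corpus> block comes from the text.
PARAMS = """
<parameters>
  <corpus>
    <path>/path/to/file_or_directory</path>
    <class>trecweb</class>
  </corpus>
</parameters>
"""

flags = to_flags(ET.fromstring(PARAMS.strip()))
for flag in flags:
    print(flag)
```

Each leaf element yields one option, so the two-element <corpus> block above corresponds to two command-line options.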
field -- Make the named field available for retrieval as metadata. Specified as <metadata><field>fieldname</field></metadata> in the parameter file and as -metadata.field=fieldname on the command line.
forward -- Make the named field available for retrieval as metadata and build a lookup table to make retrieving the value more efficient. Specified as <metadata><forward>fieldname</forward></metadata> in the parameter file and as -metadata.forward=fieldname on the command line. The external document id field "docno" is automatically added as a forward metadata field.
backward -- Make the named field available for retrieval as metadata and build a lookup table for inverse lookup of documents based on the value of the field. Specified as <metadata><backward>fieldname</backward></metadata> in the parameter file and as -metadata.backward=fieldname on the command line. The external document id field "docno" is automatically added as a backward metadata field.
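The difference between forward and backward lookup can be pictured with two small tables. The sketch below is purely conceptual: the document ids, the "year" field, and the helper names are hypothetical, and it says nothing about how the index actually stores these tables.

```python
from collections import defaultdict

# Hypothetical documents keyed by an internal document id.
documents = {
    1: {"docno": "DOC-001", "year": "1999"},
    2: {"docno": "DOC-002", "year": "1999"},
    3: {"docno": "DOC-003", "year": "2001"},
}


def build_forward(documents, field):
    """Forward table: internal document id -> value of the field."""
    return {
        doc_id: fields[field]
        for doc_id, fields in documents.items()
        if field in fields
    }


def build_backward(documents, field):
    """Backward table: value of the field -> ids of documents carrying it."""
    table = defaultdict(list)
    for doc_id, fields in documents.items():
        if field in fields:
            table[fields[field]].append(doc_id)
    return dict(table)


forward_docno = build_forward(documents, "docno")
backward_docno = build_backward(documents, "docno")
backward_year = build_backward(documents, "year")

print(forward_docno[2])
print(backward_docno["DOC-003"])
print(backward_year["1999"])
```

A forward table answers "what is this document's value?", while a backward table answers "which documents have this value?".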
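field.name -- Specified as <field><name>fieldname</name></field> in the parameter file and as -field.name=fieldname on the command line.
field.numeric -- The symbol true if the field contains numeric data, otherwise the symbol false. Specified as <field><numeric>true</numeric></field> in the parameter file and as -field.numeric=true on the command line. This is an optional parameter, defaulting to false. Note that 0 can be used for false and 1 can be used for true.
stemmer.name -- Specified as <stemmer><name>stemmername</name></stemmer> in the parameter file and as -stemmer.name=stemmername on the command line. This is an optional parameter with the default of no stemming.
true to perform case normalization when indexing, false to index with mixed case. The default is true.
stopper.word -- Specified as <stopper><word>stopword</word></stopper> in the parameter file and as -stopper.word=stopword on the command line. This is an optional parameter with the default of no stopping.
index -- Specified as <index>/path/to/repository</index> in the parameter file and as -index=/path/to/repository on the command line. This element can be specified multiple times to combine repositories.
server -- Specified as <server>hostname</server> in the parameter file and as -server=hostname on the command line. The hostname can include an optional port number to connect to, using the form hostname:portnum. This element can be specified multiple times to combine servers.
count -- Specified as <count>number</count> in the parameter file and as -count=number on the command line.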
( key ":" value ) [ "," key ":" value ]*
Here's an example rule in command line format:
-rule=method:linear,collectionLambda:0.2,field:title
and in parameter file format:
<rule>method:linear,collectionLambda:0.2,field:title</rule>
This corresponds to Jelinek-Mercer smoothing with background lambda equal to 0.2, only for items in a title field.
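To make the smoothing concrete, here is a minimal sketch of the textbook Jelinek-Mercer estimate, which mixes a document's maximum-likelihood term probability with a collection (background) probability. The function and argument names are hypothetical, and it is an assumption that collectionLambda acts as the background weight exactly as written here; the toolkit's own implementation may differ in detail.

```python
def jelinek_mercer(tf, doc_len, cf, coll_len, collection_lambda=0.2):
    """Textbook Jelinek-Mercer estimate of P(term | document).

    tf / doc_len  -- term count in the document and the document length
    cf / coll_len -- term count in the collection and the collection length
    """
    p_doc = tf / doc_len if doc_len else 0.0
    p_coll = cf / coll_len
    # The background model receives weight collection_lambda,
    # the document model receives the remainder.
    return (1.0 - collection_lambda) * p_doc + collection_lambda * p_coll


# A term seen twice in a 10-term document, 50 times in a 10,000-term collection.
seen = jelinek_mercer(tf=2, doc_len=10, cf=50, coll_len=10_000)
# The same term in a document that does not contain it.
unseen = jelinek_mercer(tf=0, doc_len=10, cf=50, coll_len=10_000)
print(seen, unseen)
```

Because of the background term, a word that is absent from the document still receives a small non-zero probability.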
If nothing is listed for a key, all values are assumed. So, a rule that does not specify a field matches all fields. This makes -rule=method:linear,collectionLambda:0.2 a valid rule.
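The rule syntax is simple enough to parse by hand. The following sketch splits a rule string into key/value pairs and applies the "missing key matches everything" convention to the field key. The function names are hypothetical and the parser accepts any key, since it does not know which keys the toolkit considers valid.

```python
def parse_rule(text):
    """Parse 'key:value,key:value,...' into a dict of strings."""
    rule = {}
    for pair in text.split(","):
        key, sep, value = pair.partition(":")
        key, value = key.strip(), value.strip()
        if not (sep and key and value):
            raise ValueError(f"malformed key:value pair: {pair!r}")
        rule[key] = value
    return rule


def applies_to_field(rule, field):
    """A rule without a 'field' key is taken to match every field."""
    return "field" not in rule or rule["field"] == field


title_rule = parse_rule("method:linear,collectionLambda:0.2,field:title")
any_rule = parse_rule("method:linear,collectionLambda:0.2")

print(title_rule)
print(applies_to_field(title_rule, "body"))
print(applies_to_field(any_rule, "body"))
```

Values are kept as strings here; converting collectionLambda to a number is left to the caller.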
Valid keys:
Valid methods:
stopper.word -- Specified as <stopper><word>stopword</word></stopper> in the parameter file and as -stopper.word=stopword on the command line. This is an optional parameter with the default of no stopping.
Format of the parameter value:
(tfidf|okapi) [ "," key ":" value ]*
Here's an example baseline in command line format:
-baseline=tfidf,k1:1.0,b:0.3
and in parameter file format:
<baseline>tfidf,k1:1.0,b:0.3</baseline>
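The baseline value differs from a smoothing rule in that it starts with a bare method name. A minimal parsing sketch, assuming only the grammar shown above, could look like this; the function name is hypothetical and parameter values are kept as strings rather than interpreted.

```python
def parse_baseline(text):
    """Parse '(tfidf|okapi)[,key:value]*' into (method, parameters)."""
    method, *pairs = text.split(",")
    method = method.strip()
    if method not in ("tfidf", "okapi"):
        raise ValueError(f"unknown baseline method: {method!r}")
    params = {}
    for pair in pairs:
        key, sep, value = pair.partition(":")
        key, value = key.strip(), value.strip()
        if not (sep and key and value):
            raise ValueError(f"malformed key:value pair: {pair!r}")
        params[key] = value
    return method, params


method, params = parse_baseline("tfidf,k1:1.0,b:0.3")
print(method, params)
```

Since the parameters are optional, a bare "okapi" or "tfidf" parses to the method name with an empty parameter dictionary.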
Methods:
Parameters (optional):
Parameters (optional):
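queryOffset -- Specified as <queryOffset>number</queryOffset> in the parameter file and as -queryOffset=number on the command line.
runID -- Specified as <runID>someID</runID> in the parameter file and as -runID=someID on the command line.
trecFormat -- The symbol true to produce TREC scorable output, otherwise the symbol false. Specified as <trecFormat>true</trecFormat> in the parameter file and as -trecFormat=true on the command line. Note that 0 can be used for false, and 1 can be used for true.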
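For orientation, the sketch below formats a ranked list in the six-column layout commonly fed to TREC evaluation tools (query id, the literal Q0, document id, rank, score, run id). It is an assumption that the toolkit's TREC output follows this layout column for column; the function name, the sample results, and the query number are hypothetical.

```python
def trec_lines(query_number, ranked, run_id):
    """Format (docno, score) pairs, best first, as TREC-style result lines."""
    lines = []
    for rank, (docno, score) in enumerate(ranked, start=1):
        lines.append(f"{query_number} Q0 {docno} {rank} {score:.4f} {run_id}")
    return lines


# Hypothetical ranked results for a single query.
results = [("DOC-002", -4.1), ("DOC-001", -5.25)]
lines = trec_lines(0, results, "someID")
for line in lines:
    print(line)
```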
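fbDocs -- Specified as <fbDocs>number</fbDocs> in the parameter file and as -fbDocs=number on the command line.
fbTerms -- Specified as <fbTerms>number</fbTerms> in the parameter file and as -fbTerms=number on the command line.
fbMu -- Specified as <fbMu>number</fbMu> in the parameter file and as -fbMu=number on the command line.
fbOrigWeight -- Specified as <fbOrigWeight>number</fbOrigWeight> in the parameter file and as -fbOrigWeight=number on the command line.
memory -- Specified as <memory>100M</memory> in the parameter file and as -memory=100M on the command line.
index -- Specified as <index>/path/to/repository</index> in the parameter file and as -index=/path/to/repository on the command line.
port -- Specified as <port>number</port> in the parameter file and as -port=number on the command line.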