#include <FileClassEnvironmentFactory.hpp>
Public Attributes | |
std::string | name |
name of this file class, e.g. trecweb | |
std::string | parser |
document parser for this file class | |
std::string | tokenizer |
document tokenizer for this file class | |
std::string | iterator |
document iterator for this file class | |
std::string | startDocTag |
tag indicating start of a document | |
std::string | endDocTag |
tag indicating the end of a document | |
std::string | endMetadataTag |
tag indicating the end of the metadata fields | |
std::vector< std::string > | include |
tags whose contents should be included in the index. If empty, all tags are included. | |
std::vector< std::string > | exclude |
tags whose contents should be excluded from the index | |
std::vector< std::string > | index |
tags that should be forwarded to the index for tag extents, i.e. named fields. | |
std::vector< std::string > | metadata |
tags whose contents should be indexed as metadata | |
std::map< indri::parse::ConflationPattern *, std::string > | conflations |
tags that should be conflated. The map is of the form tag => conflated tag, e.g. h1 => heading. | |
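The attributes above can be pictured with a small stand-alone sketch. It does not include the Indri headers: `Specification` here is a local mirror of the listed fields, `ConflationPattern` is a hypothetical stand-in for `indri::parse::ConflationPattern` (its `tag_name` member is an assumption), and `makeExampleSpec`, `shouldIndexTag`, and `conflatedTag` are illustrative helpers, not part of the library. All field values are placeholders. The rule that `exclude` takes precedence over `include` is also an assumption; the documentation only states that an empty `include` list means all tags are included.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for indri::parse::ConflationPattern.
// The real class is not described in this section; tag_name is an assumption.
struct ConflationPattern {
    std::string tag_name;
};

// Local mirror of the public attributes listed above (not the Indri type).
struct Specification {
    std::string name;            // name of this file class, e.g. trecweb
    std::string parser;          // document parser
    std::string tokenizer;       // document tokenizer
    std::string iterator;        // document iterator
    std::string startDocTag;     // tag indicating start of a document
    std::string endDocTag;       // tag indicating end of a document
    std::string endMetadataTag;  // tag indicating end of the metadata fields
    std::vector<std::string> include;   // empty means "include all tags"
    std::vector<std::string> exclude;   // tags whose contents are not indexed
    std::vector<std::string> index;     // tags forwarded as named fields
    std::vector<std::string> metadata;  // tags indexed as metadata
    std::map<ConflationPattern*, std::string> conflations;  // tag => conflated tag
};

// Builds an example specification; every value is an illustrative placeholder.
Specification makeExampleSpec() {
    // Static storage keeps the pointer used as a map key valid after return.
    static ConflationPattern h1Pattern{"h1"};

    Specification spec;
    spec.name = "trecweb";
    spec.parser = "html";
    spec.tokenizer = "word";
    spec.iterator = "tagged";
    spec.startDocTag = "<DOC>";
    spec.endDocTag = "</DOC>";
    spec.endMetadataTag = "</DOCHDR>";
    spec.exclude = {"script", "style"};
    spec.index = {"title", "heading"};
    spec.metadata = {"docno"};
    spec.conflations[&h1Pattern] = "heading";
    return spec;
}

// Assumed interpretation: an excluded tag is never indexed; otherwise a tag is
// indexed when include is empty or lists the tag.
bool shouldIndexTag(const Specification& spec, const std::string& tag) {
    const auto contains = [&tag](const std::vector<std::string>& tags) {
        return std::find(tags.begin(), tags.end(), tag) != tags.end();
    };
    if (contains(spec.exclude)) {
        return false;
    }
    return spec.include.empty() || contains(spec.include);
}

// Returns the conflated name for a tag, or the tag itself if none is registered.
std::string conflatedTag(const Specification& spec, const std::string& tag) {
    for (const auto& entry : spec.conflations) {
        if (entry.first != nullptr && entry.first->tag_name == tag) {
            return entry.second;
        }
    }
    return tag;
}
```

With this sketch, `shouldIndexTag(makeExampleSpec(), "script")` is false because `script` is in `exclude`, while any unlisted tag is indexed because `include` is empty. Keying the map on a pointer mirrors the declared type of `conflations`, which is why the example keeps the pattern object in static storage.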