Word Entity Duet: Indexing Entities
The Word Entity Duet project provides an entity indexing application that stores entities from the Freebase API
data dumps in an Elasticsearch index.
Entity Indexing Steps
- Increase the heap space used by Elasticsearch. In elasticsearch-6.1.2/config/jvm.options, set -Xmx and -Xms to at least 2g
(preferably 4g to 16g, if possible).
- Start Elasticsearch.
- Download the Freebase API data dumps and these
Freebase/Wikidata Mappings.
- Create a properties file for indexing.
- host.name= The Elasticsearch host: localhost, an IP address, or a hostname
- host.port= The host port (Elasticsearch defaults to port 9200)
- host.schema= The URL scheme of the host (default http)
- index.name= The name of the index (it is created if it does not exist; otherwise entities are added to the existing index)
- data.directory= The directory which stores the data dumps
- wiki2freebase.filename= The filename of the Freebase/Wikidata Mappings
- Start indexing with this command, passing the properties file as the only argument: java -jar -Xmx4G freebaseentityindexer.jar indexentities.properties
Use at least 2G of heap space for the indexer (preferably 4G to 8G).
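As a rough illustration of the properties file described in the steps above, the following sketch writes a sample indexentities.properties to the current directory. The six keys come from the list above. Every value is a placeholder or an assumption: the index name freebase_entities, the /path/to/freebase/dumps directory, and the mappings filename freebase-wikidata-mappings.nt are hypothetical and should be replaced with your own settings. Whether wiki2freebase.filename expects a bare filename or a full path is not stated here, so check this against your setup.

```shell
#!/bin/sh
# Write a sample indexentities.properties file (all values are placeholders).
# The quoted 'EOF' delimiter prevents the shell from expanding anything
# inside the here-document, so the lines are written exactly as shown.
cat > indexentities.properties <<'EOF'
host.name=localhost
host.port=9200
host.schema=http
index.name=freebase_entities
data.directory=/path/to/freebase/dumps
wiki2freebase.filename=freebase-wikidata-mappings.nt
EOF

# Print the file so the settings can be reviewed before indexing.
cat indexentities.properties
```

Once the placeholder values have been replaced, this file can be passed to the indexing command shown in the last step.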