Sifaka is a text mining application built on top of an open-source search engine. Sifaka stores documents using multiple types of text representations (e.g., terms, bigrams, trigrams, noun phrases, named entities), and documents may have optional category labels. Sifaka supports typical full-text search capabilities, saved sets, frequency analysis, co-occurrence analysis, and export of feature vectors compatible with Weka.
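
To give a rough idea of what a Weka-compatible feature vector file can look like, here is a minimal sketch that writes term counts and a category label in ARFF, a text format that Weka reads. This is not Sifaka's code, and it is an assumption that Sifaka's export resembles it; the function name `to_arff`, the relation name, the `category` attribute, and the sample documents are all hypothetical.

```python
from collections import Counter


def to_arff(relation, docs, labels):
    """Render tokenized documents as a dense ARFF string of term counts.

    Illustrative only: assumes single-word terms that need no quoting
    and that no term is literally named "category".
    """
    vocab = sorted({term for doc in docs for term in doc})
    classes = sorted(set(labels))

    lines = [f"@RELATION {relation}", ""]
    # One numeric attribute per vocabulary term.
    lines += [f"@ATTRIBUTE {term} NUMERIC" for term in vocab]
    # The optional category label becomes a nominal class attribute.
    lines.append("@ATTRIBUTE category {" + ",".join(classes) + "}")
    lines += ["", "@DATA"]

    # One row per document: term counts in vocabulary order, then the label.
    for doc, label in zip(docs, labels):
        counts = Counter(doc)
        row = [str(counts[term]) for term in vocab]
        lines.append(",".join(row + [label]))
    return "\n".join(lines)


# Hypothetical sample data: two tiny tokenized documents with labels.
docs = [["lemur", "search", "search"], ["text", "mining"]]
labels = ["ir", "tm"]
arff = to_arff("sifaka_export", docs, labels)
print(arff)
```

The resulting text can be saved with an `.arff` extension and opened in Weka. Real exports over large vocabularies would more likely use ARFF's sparse row syntax, which this sketch leaves out for brevity.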



Sifaka can be obtained from the SourceForge Lemur Project Page.

Release History

The first version of Sifaka was released in December 2016. Release notes for the current release can be found on SourceForge.

Tutorial Links