Sifaka

Sifaka is a text mining application built on top of an open-source search engine. It stores documents using multiple types of text representations (e.g., terms, bigrams, trigrams, noun phrases, named entities), and documents may carry optional category labels. Sifaka supports typical full-text search capabilities, saved sets, frequency analysis, co-occurrence analysis, and export of feature vectors compatible with Weka.

Features

Full-text search
Saved sets of documents
Frequency analysis
Co-occurrence analysis
Export of feature vectors compatible with Weka
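
To give a rough idea of what frequency analysis, co-occurrence analysis, and a Weka-compatible export involve, here is a minimal Python sketch. It is not Sifaka's code or API: the function names (term_frequencies, cooccurrence_counts, to_arff), the toy documents, and the "category" attribute name are all hypothetical, and the export assumes Weka's dense ARFF text format with terms that contain no spaces or special characters.

```python
from collections import Counter
from itertools import combinations


def term_frequencies(docs):
    """Count total occurrences of each term across all documents."""
    counts = Counter()
    for tokens in docs:
        counts.update(tokens)
    return counts


def cooccurrence_counts(docs):
    """Count, for each unordered term pair, the documents containing both terms."""
    pairs = Counter()
    for tokens in docs:
        # Use the set of terms so a pair is counted once per document.
        for a, b in combinations(sorted(set(tokens)), 2):
            pairs[(a, b)] += 1
    return pairs


def to_arff(docs, labels, relation="sifaka_sketch"):
    """Render term-count feature vectors as a dense ARFF string.

    Assumes terms are plain tokens (no spaces, commas, or quotes) and
    that no term is literally named "category".
    """
    vocab = sorted({t for tokens in docs for t in tokens})
    classes = sorted(set(labels))
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {term} numeric" for term in vocab]
    lines.append("@attribute category {" + ",".join(classes) + "}")
    lines += ["", "@data"]
    for tokens, label in zip(docs, labels):
        tf = Counter(tokens)
        row = [str(tf[term]) for term in vocab] + [label]
        lines.append(",".join(row))
    return "\n".join(lines)


# Hypothetical toy collection: two tokenized documents with category labels.
docs = [["lemur", "search", "engine"], ["search", "text", "search"]]
labels = ["ir", "mining"]

if __name__ == "__main__":
    print(term_frequencies(docs).most_common(3))
    print(cooccurrence_counts(docs))
    print(to_arff(docs, labels))
```

The same pattern would extend to the other text representations mentioned above (bigrams, noun phrases, named entities) by changing what counts as a token, although names containing spaces would need quoting in the ARFF header.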

Download

Sifaka can be obtained from the SourceForge Lemur Project Page.

Release History

The first version of Sifaka was released in December 2016. Release notes for the current release can be found on SourceForge.
