Co-occurrence Analysis

The Co-occurrence tab finds the entities that co-occur most frequently with a search query.

Overview of Co-occurrence input


  1. Select an index from the Indexes pane, for example: reuters.

  2. Select the Co-occurrence tab in the right content tab pane.

  3. Enter a query, for example: reagan.

  4. Select a Result Type, for example: noun-phrase.

  5. Enter a Search Depth, for example: 20000.

  6. Enter a Minimum Frequency for result types, for example: 5.

  7. Enter a maximum Number of Results, for example: 1000.

  8. Select a metric to Sort By, for example: Phi-Square.

  9. Press Submit. The analysis may take some time, depending on your computer and the size of the dataset. A progress indicator is displayed while Sifaka calculates which entities co-occur with the query. Note: If the analysis takes longer than 10 minutes, check that the correct Java version is installed.

  10. To sort by a different co-occurrence metric, click the Term Freq or PMI column header. Note that the list still contains the original set of results, only sorted in a different order.

    Example: Co-occurrence screen

  11. To learn more about the "securities offerings" result, right-click "securities offerings" in the results table and select search from the context menu.

  12. The search tab will open with "securities offerings" automatically populated as the search query.

  13. Click on any link to open a document and learn more about "securities offerings".

  14. Click on the Co-occurrence tab to navigate back to the results.

  15. Click the Save Results button under the results table to save the results table in CSV format. Note: The data will be saved in its original sort order.
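
To give a feel for what the PMI and Phi-Square sort metrics measure, the following Python sketch computes both from document counts using their standard textbook definitions. It is an illustration only: this guide does not state the exact formulas Sifaka uses, and the function name, parameter names, and counts below are all hypothetical.

```python
import math


def cooccurrence_metrics(n_both, n_query, n_term, n_docs):
    """Return (pmi, phi_square) for a query/term pair.

    Hypothetical helper using textbook definitions; Sifaka's own
    implementation may differ.
      n_both  - documents containing both the query and the term
      n_query - documents containing the query
      n_term  - documents containing the term
      n_docs  - documents examined (compare with Search Depth)
    """
    # Cells of the 2x2 contingency table.
    a = n_both                                # query and term
    b = n_query - n_both                      # query only
    c = n_term - n_both                       # term only
    d = n_docs - n_query - n_term + n_both    # neither

    # PMI: log2 of observed joint probability over the probability
    # expected if query and term were independent.
    if a > 0:
        pmi = math.log2(a * n_docs / (n_query * n_term))
    else:
        pmi = float("-inf")

    # Phi-square: chi-square divided by the number of documents.
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    phi_square = (a * d - b * c) ** 2 / denom if denom else 0.0

    return pmi, phi_square


# Made-up counts: a term appearing in 20 of 100 documents,
# 10 of which also match the query (which matches 20 documents).
pmi, phi_square = cooccurrence_metrics(10, 20, 20, 100)
print(f"PMI = {pmi:.4f}, Phi-Square = {phi_square:.6f}")
```

Under these definitions, a term that appears exactly as often with the query as chance would predict scores 0 on both metrics, while a term that appears only with the query scores a Phi-Square of 1. This is why re-sorting the same result list by a different metric can change the order noticeably.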