The ClueWeb12 PageRank
The following files contain PageRank scores computed using Galago. We provide the files in both Galago format and Indri prior format.
Note: All files are compressed with bzip2 and must be decompressed before use.
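Since the larger files are several gigabytes even when compressed, it may help to decompress them as a stream rather than reading them into memory. The following is a minimal sketch using only the Python standard library; the function name decompress_bz2 and the example paths are illustrative and not part of the distribution. The standard bzip2 command-line tool works equally well.

```python
import bz2
import shutil


def decompress_bz2(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress a bzip2 file to dst_path in fixed-size chunks.

    The helper name and default chunk size are illustrative choices.
    Only one chunk is held in memory at a time, so this is usable for
    multi-gigabyte inputs.
    """
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)


# Example call (paths are hypothetical; point them at a downloaded file):
# decompress_bz2("pagerank.scoreOrder.bz2", "pagerank.scoreOrder")
```

The decompressed file will be noticeably larger than the compressed size listed below, so check available disk space first.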
ClueWeb12 PageRank
- pagerank.docNameOrder.bz2 (5.0G, compressed): PageRank scores for ClueWeb12 listed in document name order. This is a Galago format file. Future applications in Galago will use it.
- pagerank.scoreOrder.bz2 (4.7G, compressed): PageRank scores for ClueWeb12 listed in score order. This is a Galago format file. It is suitable for input to Indri's pagerank program for conversion to an Indri prior file.
- pagerank.prior.bz2 (2.6G, compressed): PageRank priors for ClueWeb12. This is the Indri format prior file, suitable for installation in an Indri repository with the program makeprior.
ClueWeb12-B13 PageRank
- pagerank.docNameOrder.bz2 (224M, compressed): PageRank scores for ClueWeb12-B13 listed in document name order. This is a Galago format file. Future applications in Galago will use it.
- pagerank.scoreOrder.bz2 (200M, compressed): PageRank scores for ClueWeb12-B13 listed in score order. This is a Galago format file. It is suitable for input to Indri's pagerank program for conversion to an Indri prior file.
- pagerank.prior.bz2 (161M, compressed): PageRank priors for ClueWeb12-B13. This is the Indri format prior file, suitable for installation in an Indri repository with the program makeprior.