Extra Anserini Features
Anserini-hosted Pre-Built Indexes
Anserini hosts a variety of pre-built indexes. The pyterrier-anserini package supports accessing these through Artifact.from_url() by using the "anserini:" URL prefix. For instance, to load the msmarco-v1-passage index from Anserini, run:
>>> from pyterrier_anserini import AnseriniIndex
>>> index = AnseriniIndex.from_url("anserini:msmarco-v1-passage")
Downloading index at https://rgw.cs.uwaterloo.ca/pyserini/indexes/lucene/lucene-inverted.msmarco-v1-passage.20221004.252b5e.tar.gz...
You can find a list of available indexes here.
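To illustrate the shape of the prefixed URLs described above, here is a minimal, standard-library-only sketch. The helpers `anserini_url` and `split_prefixed_url` are hypothetical names introduced for illustration; they are not part of pyterrier-anserini and do not reflect how the package resolves URLs internally. They only show how an index name maps to an "anserini:" URL string.

```python
from __future__ import annotations


def anserini_url(index_name: str) -> str:
    """Build an 'anserini:'-prefixed URL string (hypothetical helper)."""
    return f"anserini:{index_name}"


def split_prefixed_url(url: str) -> tuple[str, str]:
    """Split '<prefix>:<name>' into (prefix, name) (hypothetical helper).

    Raises ValueError if the prefix or the name is missing.
    """
    prefix, sep, name = url.partition(":")
    if not sep or not prefix or not name:
        raise ValueError(f"expected '<prefix>:<name>', got {url!r}")
    return prefix, name


url = anserini_url("msmarco-v1-passage")
prefix, name = split_prefixed_url(url)
print(prefix, name)
```

The resulting string is what would be passed to `from_url()`, as in the example above.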
You can also load indexes from Hugging Face with from_hf() and share ones you've built with to_hf(), both through the Artifact API:
>>> index = AnseriniIndex.from_hf('macavaney/msmarco-passage.anserini')
>>> my_index.to_hf('username/my_index.anserini')