Anserini + PyTerrier ===================================== `Anserini `__ is a retrieval toolkit built on top of `Lucene `__. ``pyterrier-anserini`` provides a `PyTerrier `__-compatible interface to Anserini, allowing you to easily run experiments and combine it with other systems. .. BEGIN_README_SKIP .. toctree:: :maxdepth: 1 Extras API Documentation .. END_README_SKIP Quick Start ------------------------------------- You can install ``pyterrier-anserini`` with pip: .. code-block:: console :caption: Install ``pyterrier-anserini`` $ pip install pyterrier-anserini :class:`~pyterrier_anserini.AnseriniIndex` is the main class for working with Anserini. For instance, you can download a pre-built index from HuggingFace and retrieve with BM25 using the following snippet: .. code-block:: python :caption: Load an Anserini index from HuggingFace and retrieve using BM25 >>> from pyterrier_anserini import AnseriniIndex >>> index = AnseriniIndex.from_hf('macavaney/msmarco-passage.anserini') >>> bm25 = index.bm25(include_fields=['contents']) >>> bm25.search('terrier breeds') qid query docno score rank contents 0 1 terrier breeds 5785957 11.9588 0 The Jack Russell Terrier and the Russell ... 1 1 terrier breeds 7455374 11.9343 1 FCI, ANKC, and IKC recognize the shorts a... 2 1 terrier breeds 1406578 11.8640 2 Norfolk terrier (English breed of small t... 3 1 terrier breeds 3984886 11.7518 3 Terrier Group is the name of a breed Grou... 4 1 terrier breeds 7728131 11.5660 4 The Yorkshire Terrier didn't begin as the... ... Acknowledgements ------------------------------------- This extension uses the Anserini package. If you use it, please be sure to cite Anserini: .. cite.dblp:: conf/sigir/Yang0L17 This extension was built as part of the PyTerrier project: .. cite.dblp:: conf/cikm/MacdonaldTMO21 This extension was written by `Sean MacAvaney `__ at the University of Glasgow and was based on an original implementation that was part of PyTerrier, written by `Craig Macdonald `__. Check out the GitHub for `a full list of contributors `__.