SuitEval Suites
=========================================

BEIR
-----------------------------------------

BEIR is a heterogeneous benchmark containing diverse IR tasks.

.. cite.dblp:: journals/corr/abs-2104-08663

Usage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: python

   from suiteeval.suite import BEIR
   results = BEIR(pipelines)

NanoBEIR
-----------------------------------------

Compact BEIR subset for faster iteration.

Usage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: python

   from suiteeval.suite import NanoBEIR
   results = NanoBEIR(pipelines)

LoTTE
-----------------------------------------

LoTTE (Long-Tail Topic-stratified Evaluation) is a set of test collections focused on out-of-domain evaluation. 

.. cite.dblp:: conf/naacl/SanthanamKSPZ22

Usage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: python

   from suiteeval.suite import Lotte
   results = Lotte(pipelines)

BRIGHT
-----------------------------------------

BRIGHT comprises 12 diverse datasets, spanning biology, economics, robotics, math, code and more. The queries can be long StackExchange posts, math or code question. 

.. cite.dblp:: conf/iclr/SuYXSMWLSST0YA025

Usage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: python

   from suiteeval.suite import BRIGHT
   results = BRIGHT(pipelines)

MS MARCO (Document & Passage)
-----------------------------------------

MSMARCO is a large-scale dataset for training and evaluating information retrieval models. These suites contain TREC Deep Learning queries and relevance judgments for both document and passage retrieval tasks.

.. cite.dblp:: conf/sigir/CraswellMYCL21

Usage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: python

   from suiteeval.suite import MSMARCODocument, MSMARCOPassage
   doc_results = MSMARCODocument(pipelines)
   pas_results = MSMARCOPassage(pipelines)