Importing Datasets

The datasets module allows easy access to existing standard test collections, particularly those from TREC. Each defined dataset can download and provide easy access to:

  • files containing the documents of the corpus

  • topics (queries), as a dataframe, ready for retrieval

  • relevance assessments (aka, labels or qrels), as a dataframe, ready for evaluation

  • ready-made Terrier indices, where appropriate

pyterrier.datasets.list_datasets()

Returns a dataframe of all datasets, listing which topics, qrels, corpus files or indices are available. By default, filters to only datasets with both a corpus and topics in English.

pyterrier.datasets.find_datasets()

A grep-like method to help identify datasets. Filters the output of list_datasets() to the datasets whose name contains the query.
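
The grep-like filtering can be pictured with a few lines of plain Python. This is only a sketch of the idea, not PyTerrier's implementation: `find_names` is a hypothetical helper, the name list is a small sample copied from the table of available datasets, and case-sensitive substring matching is an assumption.

```python
# Hypothetical stand-in for the name column of list_datasets();
# the names are a small sample from the table of available datasets.
DATASET_NAMES = [
    "vaswani",
    "trec-robust-2004",
    "trec-covid",
    "irds:cord19/trec-covid",
    "irds:vaswani",
]

def find_names(query, names=DATASET_NAMES):
    """Keep only the names that contain the query (assumed: plain substring match)."""
    return [name for name in names if query in name]

matches = find_names("covid")
print(matches)
```

A query that matches nothing simply yields an empty list in this sketch.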

pyterrier.datasets.get_dataset()

Gets a dataset by name.

class pyterrier.datasets.Dataset

Represents a dataset (test collection) for indexing or retrieval. A common use-case is to use the Dataset within an Experiment:

# br1 and br2 stand for two retrieval pipelines defined elsewhere
dataset = pt.get_dataset("trec-robust-2004")
pt.Experiment([br1, br2], dataset.get_topics(), dataset.get_qrels(), eval_metrics=["map", "recip_rank"])

get_corpus()

Returns the location of the files to allow indexing the corpus, i.e. it returns a list of filenames.

get_corpus_iter(verbose=True) → Iterator[Dict[str, Any]]

Returns an iterator of dicts for this collection. If verbose=True, a tqdm progress bar shows the progress over this iterator.
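
As an illustration of what consuming such an iterator looks like, the sketch below uses a hypothetical generator, `toy_corpus_iter`, in place of a real dataset. The `docno`, `title` and `abstract` keys are assumptions made for the example; the actual keys depend on the dataset.

```python
from typing import Any, Dict, Iterator

def toy_corpus_iter() -> Iterator[Dict[str, Any]]:
    """Hypothetical stand-in for Dataset.get_corpus_iter(): yields one dict per document."""
    yield {"docno": "d1", "title": "Measles", "abstract": "A viral infection."}
    yield {"docno": "d2", "title": "Influenza", "abstract": "Seasonal outbreaks."}

def collect_field(doc_iter, key="docno"):
    """Consume the iterator once, collecting one value per document."""
    return [doc[key] for doc in doc_iter]

docnos = collect_field(toy_corpus_iter())
print(docnos)
```

Because the documents arrive one at a time, the whole corpus never has to be held in memory. A generator like this one can only be consumed once, so a second pass needs a fresh iterator.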

get_corpus_lang() → Optional[str]

Returns the ISO 639-1 language code for the corpus, or None for multiple/other/unknown.

get_index(variant=None, **kwargs)

Returns the IndexRef of the index to allow retrieval. Only a few datasets provide ready-made indices.

get_topics(variant=None) → pandas.core.frame.DataFrame

Returns the topics, as a dataframe, ready for retrieval.

get_topics_lang() → Optional[str]

Returns the ISO 639-1 language code for the topics, or None for multiple/other/unknown.
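
One possible use of the two language codes is a quick sanity check before running a monolingual experiment. The helper below is a hypothetical sketch: `same_language` is not part of PyTerrier, and treating None as "cannot confirm a match" is an assumption.

```python
from typing import Optional

def same_language(corpus_lang: Optional[str], topics_lang: Optional[str]) -> bool:
    """True only when both codes are present (not None) and equal."""
    if corpus_lang is None or topics_lang is None:
        # None means multiple/other/unknown, so a match cannot be confirmed.
        return False
    return corpus_lang == topics_lang

print(same_language("en", "en"))
print(same_language("en", None))
```

With a real dataset, the two arguments would come from get_corpus_lang() and get_topics_lang().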

get_qrels(variant=None) → pandas.core.frame.DataFrame

Returns the qrels, as a dataframe, ready for evaluation.
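
To make "ready for evaluation" concrete, the sketch below scores one ranked list against a handful of relevance labels. It uses plain Python rows rather than a dataframe, the column names `qid`, `docno` and `label` are assumed here for illustration, and `reciprocal_rank` is a hypothetical helper, not a PyTerrier function.

```python
# Toy qrels: one row per (query, document) judgement.
qrels = [
    {"qid": "1", "docno": "d3", "label": 1},
    {"qid": "1", "docno": "d7", "label": 0},
    {"qid": "2", "docno": "d2", "label": 1},
]

def reciprocal_rank(qid, ranked_docnos, qrels_rows):
    """Return 1/rank of the first relevant document, or 0.0 if none is retrieved."""
    relevant = {r["docno"] for r in qrels_rows if r["qid"] == qid and r["label"] > 0}
    for rank, docno in enumerate(ranked_docnos, start=1):
        if docno in relevant:
            return 1.0 / rank
    return 0.0

# d7 is judged non-relevant, so the first relevant hit is d3 at rank 2.
rr = reciprocal_rank("1", ["d7", "d3", "d9"], qrels)
print(rr)
```

In an actual experiment this bookkeeping is left to pt.Experiment, which takes the qrels together with measure names such as "recip_rank".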

get_topicsqrels(variant=None) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]

Returns both the topics and qrels in a tuple. This is useful for pt.Experiment().
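
The tuple can be unpacked directly into the two positional arguments that follow the list of systems. The sketch below shows the unpacking pattern with hypothetical stand-ins (`toy_topicsqrels` and `toy_experiment`) so that it runs without downloading any data; they only mimic the argument order used in the examples on this page.

```python
def toy_topicsqrels():
    """Hypothetical stand-in for Dataset.get_topicsqrels(): returns (topics, qrels)."""
    topics = [{"qid": "1", "query": "measles symptoms"}]
    qrels = [{"qid": "1", "docno": "d3", "label": 1}]
    return topics, qrels

def toy_experiment(systems, topics, qrels, eval_metrics):
    """Hypothetical stand-in that only records what it was given."""
    return {
        "n_systems": len(systems),
        "n_topics": len(topics),
        "n_qrels": len(qrels),
        "metrics": list(eval_metrics),
    }

# The * operator spreads the (topics, qrels) tuple over two positional arguments.
summary = toy_experiment(["bm25"], *toy_topicsqrels(), eval_metrics=["map"])
print(summary)
```

With a real dataset, and assuming br1 and br2 are already defined, the same pattern would read pt.Experiment([br1, br2], *dataset.get_topicsqrels(), eval_metrics=["map"]).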

info_url()

Returns a URL that provides more information about this dataset.

Examples

Many of the PyTerrier unit tests are based on the Vaswani NPL test collection, a corpus of ~11,000 scientific abstracts. PyTerrier provides a ready-made index on the Terrier Data Repository. This allows experiments to be easily conducted:

dataset = pt.get_dataset("vaswani")
bm25 = pt.BatchRetrieve.from_dataset(dataset, "terrier_stemmed", wmodel="BM25")
dph = pt.BatchRetrieve.from_dataset(dataset, "terrier_stemmed", wmodel="DPH")
pt.Experiment(
    [bm25, dph],
    dataset.get_topics(),
    dataset.get_qrels(),
    eval_metrics=["map"]
)

Indexing and then retrieval of documents from the MSMARCO document corpus can be achieved as follows:

dataset = pt.get_dataset("trec-deep-learning-docs")
indexer = pt.TRECCollectionIndexer("./index")
# this downloads the file msmarco-docs.trec.gz
indexref = indexer.index(dataset.get_corpus())
index = pt.IndexFactory.of(indexref)

DPH_br = pt.BatchRetrieve(index, wmodel="DPH") % 100
BM25_br = pt.BatchRetrieve(index, wmodel="BM25") % 100
# this runs an experiment to obtain results on the TREC 2019 Deep Learning track queries and qrels
pt.Experiment(
    [DPH_br, BM25_br],
    dataset.get_topics("test"),
    dataset.get_qrels("test"),
    eval_metrics=["recip_rank", "ndcg_cut_10", "map"])
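
In the example above, `% 100` is PyTerrier's rank cutoff operator: it keeps the top 100 results for each query. The effect can be sketched in plain Python; `rank_cutoff` is a hypothetical helper, and the `qid`, `docno` and `score` keys are assumed for illustration.

```python
from collections import defaultdict

def rank_cutoff(results, k):
    """Keep the k highest-scoring rows for each query id."""
    by_query = defaultdict(list)
    for row in results:
        by_query[row["qid"]].append(row)
    kept = []
    for qid, rows in by_query.items():
        # Sort each query's rows by descending score, then truncate.
        rows.sort(key=lambda r: r["score"], reverse=True)
        kept.extend(rows[:k])
    return kept

results = [
    {"qid": "1", "docno": "d1", "score": 2.0},
    {"qid": "1", "docno": "d2", "score": 5.0},
    {"qid": "1", "docno": "d3", "score": 1.0},
    {"qid": "2", "docno": "d4", "score": 0.5},
]
top2 = rank_cutoff(results, 2)
print([r["docno"] for r in top2])
```

Queries with fewer than k results are left untouched, as with query "2" here.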

For more details on use of MSMARCO, see our MSMARCO leaderboard submission notebooks.

You can also index datasets that include a corpus using IterDictIndexer and get_corpus_iter:

dataset = pt.datasets.get_dataset('irds:cord19/trec-covid')
indexer = pt.index.IterDictIndexer('./cord19-index')
indexref = indexer.index(dataset.get_corpus_iter(), fields=('title', 'abstract'))
index = pt.IndexFactory.of(indexref)

DPH_br = pt.BatchRetrieve(index, wmodel="DPH") % 100
BM25_br = pt.BatchRetrieve(index, wmodel="BM25") % 100
# this runs an experiment to obtain results on the TREC COVID queries and qrels
pt.Experiment(
    [DPH_br, BM25_br],
    dataset.get_topics('title'),
    dataset.get_qrels(),
    eval_metrics=["P.5", "P.10", "ndcg_cut.10", "map"])

Available Datasets

The table below lists the provided datasets, detailing the attributes available for each dataset. In each column, True designates the presence of a single artefact of that type, while a list denotes the available variants. Datasets with the irds: prefix are from the ir_datasets package; further documentation on these datasets can be found in the ir_datasets documentation (https://ir-datasets.com/).
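
Reading a cell of the table can be sketched as follows. `variants_of` and `is_irds` are hypothetical helpers, and representing a missing artefact as None is an assumption; the handling of True and of lists mirrors the description above.

```python
def variants_of(cell):
    """Map a table cell to the variant names that could be requested.

    True -> [None]  (a single artefact, no variant name needed)
    list -> a copy of the list (one entry per variant)
    None -> []      (assumed encoding for 'not available')
    """
    if cell is True:
        return [None]
    if isinstance(cell, list):
        return list(cell)
    return []

def is_irds(name):
    """Datasets provided through ir_datasets carry the 'irds:' prefix."""
    return name.startswith("irds:")

print(variants_of(["train", "test"]))
print(variants_of(True))
print(is_irds("irds:antique/test"))
```

A variant name from such a list is what gets passed to the accessor, as in dataset.get_topics("test") in the MSMARCO example above.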

dataset

corpus

index

topics

qrels

info_url

50pct

[ex1, ex2]

[training, validation]

[training, validation]

antique

True

[train, test]

[train, test]

https://ciir.cs.umass.edu/downloads/Antique/readme.txt

vaswani

True

True

True

True

http://ir.dcs.gla.ac.uk/resources/test_collections/npl/

msmarco_document

True

True

[train, dev, test, test-2020, leaderboard-2020]

[train, dev, test, test-2020]

https://microsoft.github.io/msmarco/

msmarcov2_document

True

[train, dev1, dev2, valid1, valid2, trec_2021]

[train, dev1, dev2, valid1, valid2]

https://microsoft.github.io/msmarco/TREC-Deep-Learning.html

msmarco_passage

True

True

[train, dev, dev.small, eval, eval.small, test-2019, test-2020]

[train, dev, test-2019, test-2020, dev.small]

https://microsoft.github.io/MSMARCO-Passage-Ranking/

msmarcov2_passage

True

[train, dev1, dev2, trec_2021]

[train, dev1, dev2]

https://microsoft.github.io/msmarco/TREC-Deep-Learning.html

trec-robust-2004

True

True

https://trec.nist.gov/data/t13_robust.html

trec-robust-2005

True

True

https://trec.nist.gov/data/t14_robust.html

trec-terabyte

[2004, 2005, 2006, 2004-2006, 2006-np, 2005-np]

[2004, 2005, 2006, 2004-2006, 2005-np, 2006-np]

https://trec.nist.gov/data/terabyte.html

trec-precision-medicine

[2017, 2018, 2019, 2020]

[qrels-2017-abstracts, qrels-2017-abstracts-sample, qrels-2017-trials, qrels-2018-abstracts, qrels-2018-abstracts-sample, qrels-2018-trials, qrels-2018-trials-sample, qrels-2019-abstracts, qrels-2019-trials, qrels-2019-abstracts-sample, qrels-2019-trials-sample]

https://trec.nist.gov/data/precmed.html

trec-covid

[round4, round5]

True

[round1, round2, round3, round4, round5]

[round1, round2, round3, round3-cumulative, round4, round4-cumulative, round5]

https://ir.nist.gov/covidSubmit/

trec-wt2g

True

True

https://trec.nist.gov/data/t8.web.html

trec-wt10g

[trec9, trec10-adhoc, trec10-hp]

[trec9, trec10-adhoc, trec10-hp]

https://trec.nist.gov/data/t9.web.html

trec-wt-2002

[td, np]

[np, td]

https://trec.nist.gov/data/t11.web.html

trec-wt-2003

[td, np]

[np, td]

https://trec.nist.gov/data/t11.web.html

trec-wt-2004

[all, np, hp, td]

[hp, td, np, all]

https://trec.nist.gov/data/t13.web.html

trec-wt-2009

True

[adhoc, adhoc.catA, adhoc.catB]

https://trec.nist.gov/data/web09.html

trec-wt-2010

True

[adhoc]

https://trec.nist.gov/data/web10.html

trec-wt-2011

True

[adhoc]

https://trec.nist.gov/data/web2011.html

trec-wt-2012

True

[adhoc]

https://trec.nist.gov/data/web2012.html

irds:antique

True

https://ir-datasets.com/antique.html

irds:antique/test

True

True

True

https://ir-datasets.com/antique.html#antique/test

irds:antique/train

True

True

True

https://ir-datasets.com/antique.html#antique/train

irds:aquaint

True

https://ir-datasets.com/aquaint.html

irds:aquaint/trec-robust-2005

True

[title, description, narrative]

True

https://ir-datasets.com/aquaint.html#aquaint/trec-robust-2005

irds:argsme

https://ir-datasets.com/argsme.html

irds:argsme/1.0

True

https://ir-datasets.com/argsme.html#argsme/1.0

irds:argsme/1.0-cleaned

True

https://ir-datasets.com/argsme.html#argsme/1.0-cleaned

irds:argsme/2020-04-01/debateorg

True

https://ir-datasets.com/argsme.html#argsme/2020-04-01/debateorg

irds:argsme/2020-04-01/debatepedia

True

https://ir-datasets.com/argsme.html#argsme/2020-04-01/debatepedia

irds:argsme/2020-04-01/debatewise

True

https://ir-datasets.com/argsme.html#argsme/2020-04-01/debatewise

irds:argsme/2020-04-01/idebate

True

https://ir-datasets.com/argsme.html#argsme/2020-04-01/idebate

irds:argsme/2020-04-01/parliamentary

True

https://ir-datasets.com/argsme.html#argsme/2020-04-01/parliamentary

irds:argsme/2020-04-01

True

https://ir-datasets.com/argsme.html#argsme/2020-04-01

irds:beir

https://ir-datasets.com/beir.html

irds:beir/arguana

True

True

True

https://ir-datasets.com/beir.html#beir/arguana

irds:beir/climate-fever

True

True

True

https://ir-datasets.com/beir.html#beir/climate-fever

irds:beir/cqadupstack/android

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/android

irds:beir/cqadupstack/english

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/english

irds:beir/cqadupstack/gaming

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/gaming

irds:beir/cqadupstack/gis

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/gis

irds:beir/cqadupstack/mathematica

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/mathematica

irds:beir/cqadupstack/physics

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/physics

irds:beir/cqadupstack/programmers

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/programmers

irds:beir/cqadupstack/stats

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/stats

irds:beir/cqadupstack/tex

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/tex

irds:beir/cqadupstack/unix

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/unix

irds:beir/cqadupstack/webmasters

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/webmasters

irds:beir/cqadupstack/wordpress

True

[text, tags]

True

https://ir-datasets.com/beir.html#beir/cqadupstack/wordpress

irds:beir/dbpedia-entity

True

True

https://ir-datasets.com/beir.html#beir/dbpedia-entity

irds:beir/fever

True

True

https://ir-datasets.com/beir.html#beir/fever

irds:beir/fiqa

True

True

https://ir-datasets.com/beir.html#beir/fiqa

irds:beir/hotpotqa

True

True

https://ir-datasets.com/beir.html#beir/hotpotqa

irds:beir/msmarco

True

True

https://ir-datasets.com/beir.html#beir/msmarco

irds:beir/nfcorpus

True

[text, url]

https://ir-datasets.com/beir.html#beir/nfcorpus

irds:beir/nq

True

True

True

https://ir-datasets.com/beir.html#beir/nq

irds:beir/quora

True

True

https://ir-datasets.com/beir.html#beir/quora

irds:beir/scidocs

True

[text, authors, year, cited_by, references]

True

https://ir-datasets.com/beir.html#beir/scidocs

irds:beir/scifact

True

True

https://ir-datasets.com/beir.html#beir/scifact

irds:beir/trec-covid

True

[text, query, narrative]

True

https://ir-datasets.com/beir.html#beir/trec-covid

irds:beir/webis-touche2020

True

[text, description, narrative]

True

https://ir-datasets.com/beir.html#beir/webis-touche2020

irds:beir/webis-touche2020/v2

True

[text, description, narrative]

True

https://ir-datasets.com/beir.html#beir/webis-touche2020/v2

irds:c4

https://ir-datasets.com/c4.html

irds:c4/en-noclean-tr

True

https://ir-datasets.com/c4.html#c4/en-noclean-tr

irds:c4/en-noclean-tr/trec-misinfo-2021

True

[text, description, narrative, disclaimer, stance, evidence]

https://ir-datasets.com/c4.html#c4/en-noclean-tr/trec-misinfo-2021

irds:car

https://ir-datasets.com/car.html

irds:car/v1.5

True

https://ir-datasets.com/car.html#car/v1.5

irds:car/v1.5/test200

True

[text, title, headings]

True

https://ir-datasets.com/car.html#car/v1.5/test200

irds:car/v1.5/train/fold0

True

[text, title, headings]

True

https://ir-datasets.com/car.html#car/v1.5/train/fold0

irds:car/v1.5/train/fold1

True

[text, title, headings]

True

https://ir-datasets.com/car.html#car/v1.5/train/fold1

irds:car/v1.5/train/fold2

True

[text, title, headings]

True

https://ir-datasets.com/car.html#car/v1.5/train/fold2

irds:car/v1.5/train/fold3

True

[text, title, headings]

True

https://ir-datasets.com/car.html#car/v1.5/train/fold3

irds:car/v1.5/train/fold4

True

[text, title, headings]

True

https://ir-datasets.com/car.html#car/v1.5/train/fold4

irds:car/v1.5/trec-y1

True

[text, title, headings]

https://ir-datasets.com/car.html#car/v1.5/trec-y1

irds:car/v1.5/trec-y1/auto

True

[text, title, headings]

True

https://ir-datasets.com/car.html#car/v1.5/trec-y1/auto

irds:car/v1.5/trec-y1/manual

True

[text, title, headings]

True

https://ir-datasets.com/car.html#car/v1.5/trec-y1/manual

irds:highwire

True

https://ir-datasets.com/highwire.html

irds:highwire/trec-genomics-2006

True

True

[start, length, relevance]

https://ir-datasets.com/highwire.html#highwire/trec-genomics-2006

irds:highwire/trec-genomics-2007

True

True

[start, length, relevance]

https://ir-datasets.com/highwire.html#highwire/trec-genomics-2007

irds:medline

https://ir-datasets.com/medline.html

irds:medline/2004

True

https://ir-datasets.com/medline.html#medline/2004

irds:medline/2004/trec-genomics-2004

True

[title, need, context]

True

https://ir-datasets.com/medline.html#medline/2004/trec-genomics-2004

irds:medline/2004/trec-genomics-2005

True

True

True

https://ir-datasets.com/medline.html#medline/2004/trec-genomics-2005

irds:medline/2017

True

https://ir-datasets.com/medline.html#medline/2017

irds:medline/2017/trec-pm-2017

True

[disease, gene, demographic, other]

True

https://ir-datasets.com/medline.html#medline/2017/trec-pm-2017

irds:medline/2017/trec-pm-2018

True

[disease, gene, demographic]

True

https://ir-datasets.com/medline.html#medline/2017/trec-pm-2018

irds:clinicaltrials

https://ir-datasets.com/clinicaltrials.html

irds:clinicaltrials/2017

True

https://ir-datasets.com/clinicaltrials.html#clinicaltrials/2017

irds:clinicaltrials/2017/trec-pm-2017

True

[disease, gene, demographic, other]

True

https://ir-datasets.com/clinicaltrials.html#clinicaltrials/2017/trec-pm-2017

irds:clinicaltrials/2017/trec-pm-2018

True

[disease, gene, demographic]

True

https://ir-datasets.com/clinicaltrials.html#clinicaltrials/2017/trec-pm-2018

irds:clinicaltrials/2019

True

https://ir-datasets.com/clinicaltrials.html#clinicaltrials/2019

irds:clinicaltrials/2019/trec-pm-2019

True

[disease, gene, demographic]

True

https://ir-datasets.com/clinicaltrials.html#clinicaltrials/2019/trec-pm-2019

irds:clinicaltrials/2021

True

https://ir-datasets.com/clinicaltrials.html#clinicaltrials/2021

irds:clinicaltrials/2021/trec-ct-2021

True

True

https://ir-datasets.com/clinicaltrials.html#clinicaltrials/2021/trec-ct-2021

irds:clirmatrix

https://ir-datasets.com/clirmatrix.html

irds:clueweb09/catb

True

https://ir-datasets.com/clueweb09.html#clueweb09/catb

irds:clueweb09/catb/trec-web-2009

True

[query, description, type, subtopics]

[relevance, method, iprob]

https://ir-datasets.com/clueweb09.html#clueweb09/catb/trec-web-2009

irds:clueweb09/catb/trec-web-2010

True

[query, description, type, subtopics]

True

https://ir-datasets.com/clueweb09.html#clueweb09/catb/trec-web-2010

irds:clueweb09/catb/trec-web-2011

True

[query, description, type, subtopics]

True

https://ir-datasets.com/clueweb09.html#clueweb09/catb/trec-web-2011

irds:clueweb09/catb/trec-web-2012

True

[query, description, type, subtopics]

True

https://ir-datasets.com/clueweb09.html#clueweb09/catb/trec-web-2012

irds:clueweb09/en

True

https://ir-datasets.com/clueweb09.html#clueweb09/en

irds:clueweb09/en/trec-web-2009

True

[query, description, type, subtopics]

[relevance, method, iprob]

https://ir-datasets.com/clueweb09.html#clueweb09/en/trec-web-2009

irds:clueweb09/en/trec-web-2010

True

[query, description, type, subtopics]

True

https://ir-datasets.com/clueweb09.html#clueweb09/en/trec-web-2010

irds:clueweb09/en/trec-web-2011

True

[query, description, type, subtopics]

True

https://ir-datasets.com/clueweb09.html#clueweb09/en/trec-web-2011

irds:clueweb09/en/trec-web-2012

True

[query, description, type, subtopics]

True

https://ir-datasets.com/clueweb09.html#clueweb09/en/trec-web-2012

irds:clueweb12

True

https://ir-datasets.com/clueweb12.html

irds:clueweb12/b13

True

https://ir-datasets.com/clueweb12.html#clueweb12/b13

irds:clueweb12/b13/clef-ehealth

True

True

[relevance, trustworthiness, understandability]

https://ir-datasets.com/clueweb12.html#clueweb12/b13/clef-ehealth

irds:clueweb12/b13/ntcir-www-1

True

True

True

https://ir-datasets.com/clueweb12.html#clueweb12/b13/ntcir-www-1

irds:clueweb12/b13/ntcir-www-2

True

[title, description]

True

https://ir-datasets.com/clueweb12.html#clueweb12/b13/ntcir-www-2

irds:clueweb12/b13/ntcir-www-3

True

[title, description]

https://ir-datasets.com/clueweb12.html#clueweb12/b13/ntcir-www-3

irds:clueweb12/b13/trec-misinfo-2019

True

[title, cochranedoi, description, narrative]

[relevance, effectiveness, credibility]

https://ir-datasets.com/clueweb12.html#clueweb12/b13/trec-misinfo-2019

irds:clueweb12/trec-web-2013

True

[query, description, type, subtopics]

True

https://ir-datasets.com/clueweb12.html#clueweb12/trec-web-2013

irds:clueweb12/trec-web-2014

True

[query, description, type, subtopics]

True

https://ir-datasets.com/clueweb12.html#clueweb12/trec-web-2014

irds:cord19

True

https://ir-datasets.com/cord19.html

irds:cord19/fulltext

True

https://ir-datasets.com/cord19.html#cord19/fulltext

irds:cord19/fulltext/trec-covid

True

[title, description, narrative]

True

https://ir-datasets.com/cord19.html#cord19/fulltext/trec-covid

irds:cord19/trec-covid

True

[title, description, narrative]

True

https://ir-datasets.com/cord19.html#cord19/trec-covid

irds:cord19/trec-covid/round1

True

[title, description, narrative]

True

https://ir-datasets.com/cord19.html#cord19/trec-covid/round1

irds:cord19/trec-covid/round2

True

[title, description, narrative]

True

https://ir-datasets.com/cord19.html#cord19/trec-covid/round2

irds:cord19/trec-covid/round3

True

[title, description, narrative]

True

https://ir-datasets.com/cord19.html#cord19/trec-covid/round3

irds:cord19/trec-covid/round4

True

[title, description, narrative]

True

https://ir-datasets.com/cord19.html#cord19/trec-covid/round4

irds:cord19/trec-covid/round5

True

[title, description, narrative]

True

https://ir-datasets.com/cord19.html#cord19/trec-covid/round5

irds:cranfield

True

True

True

https://ir-datasets.com/cranfield.html

irds:dpr-w100

True

https://ir-datasets.com/dpr-w100.html

irds:dpr-w100/natural-questions/dev

True

[text, answers]

True

https://ir-datasets.com/dpr-w100.html#dpr-w100/natural-questions/dev

irds:dpr-w100/natural-questions/train

True

[text, answers]

True

https://ir-datasets.com/dpr-w100.html#dpr-w100/natural-questions/train

irds:dpr-w100/trivia-qa/dev

True

[text, answers]

True

https://ir-datasets.com/dpr-w100.html#dpr-w100/trivia-qa/dev

irds:dpr-w100/trivia-qa/train

True

[text, answers]

True

https://ir-datasets.com/dpr-w100.html#dpr-w100/trivia-qa/train

irds:gov

True

https://ir-datasets.com/gov.html

irds:gov/trec-web-2002

True

[title, description, narrative]

True

https://ir-datasets.com/gov.html#gov/trec-web-2002

irds:gov/trec-web-2002/named-page

True

True

True

https://ir-datasets.com/gov.html#gov/trec-web-2002/named-page

irds:gov/trec-web-2003

True

[title, description]

True

https://ir-datasets.com/gov.html#gov/trec-web-2003

irds:gov/trec-web-2003/named-page

True

True

True

https://ir-datasets.com/gov.html#gov/trec-web-2003/named-page

irds:gov/trec-web-2004

True

True

True

https://ir-datasets.com/gov.html#gov/trec-web-2004

irds:gov2

True

https://ir-datasets.com/gov2.html

irds:gov2/trec-mq-2008

True

True

[relevance, method, iprob]

https://ir-datasets.com/gov2.html#gov2/trec-mq-2008

irds:gov2/trec-tb-2004

True

[title, description, narrative]

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2004

irds:gov2/trec-tb-2005

True

[title, description, narrative]

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2005

irds:gov2/trec-tb-2005/efficiency

True

True

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2005/efficiency

irds:gov2/trec-tb-2005/named-page

True

True

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2005/named-page

irds:gov2/trec-tb-2006

True

[title, description, narrative]

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2006

irds:gov2/trec-tb-2006/efficiency

True

True

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2006/efficiency

irds:gov2/trec-tb-2006/efficiency/10k

True

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2006/efficiency/10k

irds:gov2/trec-tb-2006/efficiency/stream1

True

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2006/efficiency/stream1

irds:gov2/trec-tb-2006/efficiency/stream2

True

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2006/efficiency/stream2

irds:gov2/trec-tb-2006/efficiency/stream3

True

True

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2006/efficiency/stream3

irds:gov2/trec-tb-2006/efficiency/stream4

True

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2006/efficiency/stream4

irds:gov2/trec-tb-2006/named-page

True

True

True

https://ir-datasets.com/gov2.html#gov2/trec-tb-2006/named-page

irds:msmarco-passage

True

https://ir-datasets.com/msmarco-passage.html

irds:msmarco-passage/dev

True

True

True

https://ir-datasets.com/msmarco-passage.html#msmarco-passage/dev

irds:msmarco-passage/dev/small

True

True

True

https://ir-datasets.com/msmarco-passage.html#msmarco-passage/dev/small

irds:msmarco-passage/eval

True

True

https://ir-datasets.com/msmarco-passage.html#msmarco-passage/eval

irds:msmarco-passage/eval/small

True

True

https://ir-datasets.com/msmarco-passage.html#msmarco-passage/eval/small

irds:msmarco-passage/train

True

True

True

https://ir-datasets.com/msmarco-passage.html#msmarco-passage/train

irds:msmarco-passage/trec-dl-2019

True

True

True

https://ir-datasets.com/msmarco-passage.html#msmarco-passage/trec-dl-2019

irds:msmarco-passage/trec-dl-2020

True

True

True

https://ir-datasets.com/msmarco-passage.html#msmarco-passage/trec-dl-2020

irds:mmarco

https://ir-datasets.com/mmarco.html

irds:mr-tydi

https://ir-datasets.com/mr-tydi.html

irds:mr-tydi/en

True

True

True

https://ir-datasets.com/mr-tydi.html#mr-tydi/en

irds:mr-tydi/en/dev

True

True

True

https://ir-datasets.com/mr-tydi.html#mr-tydi/en/dev

irds:mr-tydi/en/test

True

True

True

https://ir-datasets.com/mr-tydi.html#mr-tydi/en/test

irds:mr-tydi/en/train

True

True

True

https://ir-datasets.com/mr-tydi.html#mr-tydi/en/train

irds:msmarco-document

True

https://ir-datasets.com/msmarco-document.html

irds:msmarco-document/dev

True

True

True

https://ir-datasets.com/msmarco-document.html#msmarco-document/dev

irds:msmarco-document/eval

True

True

https://ir-datasets.com/msmarco-document.html#msmarco-document/eval

irds:msmarco-document/orcas

True

True

True

https://ir-datasets.com/msmarco-document.html#msmarco-document/orcas

irds:msmarco-document/train

True

True

True

https://ir-datasets.com/msmarco-document.html#msmarco-document/train

irds:msmarco-document/trec-dl-2019

True

True

True

https://ir-datasets.com/msmarco-document.html#msmarco-document/trec-dl-2019

irds:msmarco-document/trec-dl-2020

True

True

True

https://ir-datasets.com/msmarco-document.html#msmarco-document/trec-dl-2020

irds:msmarco-document-v2

True

https://ir-datasets.com/msmarco-document-v2.html

irds:msmarco-document-v2/dev1

True

True

True

https://ir-datasets.com/msmarco-document-v2.html#msmarco-document-v2/dev1

irds:msmarco-document-v2/dev2

True

True

True

https://ir-datasets.com/msmarco-document-v2.html#msmarco-document-v2/dev2

irds:msmarco-document-v2/train

True

True

True

https://ir-datasets.com/msmarco-document-v2.html#msmarco-document-v2/train

irds:msmarco-document-v2/trec-dl-2019

True

True

True

https://ir-datasets.com/msmarco-document-v2.html#msmarco-document-v2/trec-dl-2019

irds:msmarco-document-v2/trec-dl-2020

True

True

True

https://ir-datasets.com/msmarco-document-v2.html#msmarco-document-v2/trec-dl-2020

irds:msmarco-document-v2/trec-dl-2021

True

True

True

https://ir-datasets.com/msmarco-document-v2.html#msmarco-document-v2/trec-dl-2021

irds:msmarco-passage-v2

True

https://ir-datasets.com/msmarco-passage-v2.html

irds:msmarco-passage-v2/dev1

True

True

True

https://ir-datasets.com/msmarco-passage-v2.html#msmarco-passage-v2/dev1

irds:msmarco-passage-v2/dev2

True

True

True

https://ir-datasets.com/msmarco-passage-v2.html#msmarco-passage-v2/dev2

irds:msmarco-passage-v2/train

True

True

True

https://ir-datasets.com/msmarco-passage-v2.html#msmarco-passage-v2/train

irds:msmarco-passage-v2/trec-dl-2021

True

True

True

https://ir-datasets.com/msmarco-passage-v2.html#msmarco-passage-v2/trec-dl-2021

irds:msmarco-qna

True

https://ir-datasets.com/msmarco-qna.html

irds:msmarco-qna/dev

True

[text, type, answers]

True

https://ir-datasets.com/msmarco-qna.html#msmarco-qna/dev

irds:msmarco-qna/eval

True

[text, type]

https://ir-datasets.com/msmarco-qna.html#msmarco-qna/eval

irds:msmarco-qna/train

True

[text, type, answers]

True

https://ir-datasets.com/msmarco-qna.html#msmarco-qna/train

irds:nfcorpus

True

https://ir-datasets.com/nfcorpus.html

irds:nfcorpus/dev

True

[title, all]

True

https://ir-datasets.com/nfcorpus.html#nfcorpus/dev

irds:nfcorpus/dev/nontopic

True

True

True

https://ir-datasets.com/nfcorpus.html#nfcorpus/dev/nontopic

irds:nfcorpus/dev/video

True

[title, desc]

True

https://ir-datasets.com/nfcorpus.html#nfcorpus/dev/video

irds:nfcorpus/test

True

[title, all]

True

https://ir-datasets.com/nfcorpus.html#nfcorpus/test

irds:nfcorpus/test/nontopic

True

True

True

https://ir-datasets.com/nfcorpus.html#nfcorpus/test/nontopic

irds:nfcorpus/test/video

True

[title, desc]

True

https://ir-datasets.com/nfcorpus.html#nfcorpus/test/video

irds:nfcorpus/train

True

[title, all]

True

https://ir-datasets.com/nfcorpus.html#nfcorpus/train

irds:nfcorpus/train/nontopic

True

True

True

https://ir-datasets.com/nfcorpus.html#nfcorpus/train/nontopic

irds:nfcorpus/train/video

True

[title, desc]

True

https://ir-datasets.com/nfcorpus.html#nfcorpus/train/video

irds:natural-questions

True

https://ir-datasets.com/natural-questions.html

irds:natural-questions/dev

True

True

[relevance, short_answers, yes_no_answer]

https://ir-datasets.com/natural-questions.html#natural-questions/dev

irds:natural-questions/train

True

True

[relevance, short_answers, yes_no_answer]

https://ir-datasets.com/natural-questions.html#natural-questions/train

irds:nyt

True

https://ir-datasets.com/nyt.html

irds:nyt/trec-core-2017

True

[title, description, narrative]

True

https://ir-datasets.com/nyt.html#nyt/trec-core-2017

irds:nyt/wksup

True

True

True

https://ir-datasets.com/nyt.html#nyt/wksup

irds:pmc

https://ir-datasets.com/pmc.html

irds:pmc/v1

True

https://ir-datasets.com/pmc.html#pmc/v1

irds:pmc/v1/trec-cds-2014

True

[type, description, summary]

True

https://ir-datasets.com/pmc.html#pmc/v1/trec-cds-2014

irds:pmc/v1/trec-cds-2015

True

[type, description, summary]

True

https://ir-datasets.com/pmc.html#pmc/v1/trec-cds-2015

irds:pmc/v2

True

https://ir-datasets.com/pmc.html#pmc/v2

irds:pmc/v2/trec-cds-2016

True

[type, note, description, summary]

True

https://ir-datasets.com/pmc.html#pmc/v2/trec-cds-2016

irds:argsme/2020-04-01/touche-2020-task-1

True

[title, description, narrative]

True

https://ir-datasets.com/argsme.html#argsme/2020-04-01/touche-2020-task-1

irds:clueweb12/touche-2020-task-2

True

[title, description, narrative]

True

https://ir-datasets.com/clueweb12.html#clueweb12/touche-2020-task-2

irds:argsme/2020-04-01/touche-2021-task-1

True

True

[relevance, quality]

https://ir-datasets.com/argsme.html#argsme/2020-04-01/touche-2021-task-1

irds:clueweb12/touche-2021-task-2

True

[title, description, narrative]

[relevance, quality]

https://ir-datasets.com/clueweb12.html#clueweb12/touche-2021-task-2

irds:argsme/1.0/touche-2020-task-1/uncorrected

True

[title, description, narrative]

True

https://ir-datasets.com/argsme.html#argsme/1.0/touche-2020-task-1/uncorrected

irds:argsme/2020-04-01/touche-2020-task-1/uncorrected

True

[title, description, narrative]

True

https://ir-datasets.com/argsme.html#argsme/2020-04-01/touche-2020-task-1/uncorrected

irds:trec-robust04

True

[title, description, narrative]

True

https://ir-datasets.com/trec-robust04.html

irds:tripclick

True

https://ir-datasets.com/tripclick.html

irds:tripclick/logs

True

https://ir-datasets.com/tripclick.html#tripclick/logs

irds:tripclick/test

True

True

https://ir-datasets.com/tripclick.html#tripclick/test

irds:tripclick/test/head

True

True

https://ir-datasets.com/tripclick.html#tripclick/test/head

irds:tripclick/test/tail

True

True

https://ir-datasets.com/tripclick.html#tripclick/test/tail

irds:tripclick/test/torso

True

True

https://ir-datasets.com/tripclick.html#tripclick/test/torso

irds:tripclick/train

True

True

True

https://ir-datasets.com/tripclick.html#tripclick/train

irds:tripclick/train/head

True

True

True

https://ir-datasets.com/tripclick.html#tripclick/train/head

irds:tripclick/train/head/dctr

True

True

True

https://ir-datasets.com/tripclick.html#tripclick/train/head/dctr

irds:tripclick/train/tail

True

True

True

https://ir-datasets.com/tripclick.html#tripclick/train/tail

irds:tripclick/train/torso

True

True

True

https://ir-datasets.com/tripclick.html#tripclick/train/torso

irds:tripclick/val

True

True

True

https://ir-datasets.com/tripclick.html#tripclick/val

irds:tripclick/val/head

True

True

True

https://ir-datasets.com/tripclick.html#tripclick/val/head

irds:tripclick/val/head/dctr

True

True

True

https://ir-datasets.com/tripclick.html#tripclick/val/head/dctr

irds:tripclick/val/tail

True

True

True

https://ir-datasets.com/tripclick.html#tripclick/val/tail

irds:tripclick/val/torso

True

True

True

https://ir-datasets.com/tripclick.html#tripclick/val/torso

irds:vaswani

True

True

True

https://ir-datasets.com/vaswani.html

irds:wapo

https://ir-datasets.com/wapo.html

irds:wapo/v2

True

https://ir-datasets.com/wapo.html#wapo/v2

irds:wapo/v2/trec-core-2018

True

[title, description, narrative]

True

https://ir-datasets.com/wapo.html#wapo/v2/trec-core-2018

irds:wapo/v2/trec-news-2018

True

[doc_id, url]

True

https://ir-datasets.com/wapo.html#wapo/v2/trec-news-2018

irds:wapo/v2/trec-news-2019

True

[doc_id, url]

True

https://ir-datasets.com/wapo.html#wapo/v2/trec-news-2019

irds:wapo/v3/trec-news-2020

[doc_id, url]

True

https://ir-datasets.com/wapo.html#wapo/v3/trec-news-2020

irds:wikir

https://ir-datasets.com/wikir.html

irds:wikir/en1k

True

https://ir-datasets.com/wikir.html#wikir/en1k

irds:wikir/en1k/test

True

True

True

https://ir-datasets.com/wikir.html#wikir/en1k/test

irds:wikir/en1k/training

True

True

True

https://ir-datasets.com/wikir.html#wikir/en1k/training

irds:wikir/en1k/validation

True

True

True

https://ir-datasets.com/wikir.html#wikir/en1k/validation

irds:wikir/en59k

True

https://ir-datasets.com/wikir.html#wikir/en59k

irds:wikir/en59k/test

True

True

True

https://ir-datasets.com/wikir.html#wikir/en59k/test

irds:wikir/en59k/training

True

True

True

https://ir-datasets.com/wikir.html#wikir/en59k/training

irds:wikir/en59k/validation

True

True

True

https://ir-datasets.com/wikir.html#wikir/en59k/validation

irds:wikir/en78k

True

https://ir-datasets.com/wikir.html#wikir/en78k

irds:wikir/en78k/test

True

True

True

https://ir-datasets.com/wikir.html#wikir/en78k/test

irds:wikir/en78k/training

True

True

True

https://ir-datasets.com/wikir.html#wikir/en78k/training

irds:wikir/en78k/validation

True

True

True

https://ir-datasets.com/wikir.html#wikir/en78k/validation

irds:wikir/ens78k

True

https://ir-datasets.com/wikir.html#wikir/ens78k

irds:wikir/ens78k/test

True

True

True

https://ir-datasets.com/wikir.html#wikir/ens78k/test

irds:wikir/ens78k/training

True

True

True

https://ir-datasets.com/wikir.html#wikir/ens78k/training

irds:wikir/ens78k/validation

True

True

True

https://ir-datasets.com/wikir.html#wikir/ens78k/validation

irds:trec-fair-2021

True

https://ir-datasets.com/trec-fair-2021.html

irds:trec-fair-2021/train

True

[text, keywords, scope, homepage]

True

https://ir-datasets.com/trec-fair-2021.html#trec-fair-2021/train

irds:trec-fair-2021/eval

True

[text, keywords, scope]

https://ir-datasets.com/trec-fair-2021.html#trec-fair-2021/eval

trec-deep-learning-docs

True

True

[train, dev, test, test-2020, leaderboard-2020]

[train, dev, test, test-2020]

https://microsoft.github.io/msmarco/

trec-deep-learning-passages

True

True

[train, dev, dev.small, eval, eval.small, test-2019, test-2020]

[train, dev, test-2019, test-2020, dev.small]

https://microsoft.github.io/MSMARCO-Passage-Ranking/