Result Fusion

class pyterrier_alpha.RRFusion(*transformers, k=60, num_results=1000)

Reciprocal Rank Fusion between the results from multiple transformers.

This transformer merges the rankings produced by multiple transformers. For each document, it computes a reciprocal rank score in every ranking as 1/(rank + k), where k is a constant, and sums these scores across the rankings. The documents are then ranked by their summed score.
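The scoring rule above can be sketched with plain pandas. This is a minimal illustration and not the library's implementation: the function name `rrf_sketch`, the example frames, and the assumption that each frame carries `qid`, `docno`, and 0-based `rank` columns are all made up for this example.

```python
import pandas as pd

def rrf_sketch(*results, k=60, num_results=1000):
    """Illustrative reciprocal rank fusion over frames with qid, docno, rank columns."""
    if not results:
        raise ValueError("at least one result frame is required")
    # Contribution of each document in each ranking: 1 / (rank + k)
    parts = [
        res[["qid", "docno"]].assign(score=1.0 / (res["rank"] + k))
        for res in results
    ]
    # Sum the contributions of each (qid, docno) pair across all rankings
    fused = (
        pd.concat(parts, ignore_index=True)
        .groupby(["qid", "docno"], as_index=False, sort=False)["score"]
        .sum()
    )
    # Order by fused score within each query and assign new 0-based ranks
    fused = fused.sort_values(["qid", "score"], ascending=[True, False])
    fused = fused.reset_index(drop=True)
    fused["rank"] = fused.groupby("qid").cumcount()
    # Optionally keep only the top num_results documents per query
    if num_results is not None:
        fused = fused[fused["rank"] < num_results].reset_index(drop=True)
    return fused

# Two hypothetical rankings for the same query (ranks assumed to start at 0)
a = pd.DataFrame({"qid": ["q1"] * 3, "docno": ["d1", "d2", "d3"], "rank": [0, 1, 2]})
b = pd.DataFrame({"qid": ["q1"] * 3, "docno": ["d2", "d4", "d1"], "rank": [0, 1, 2]})

fused = rrf_sketch(a, b)
top2 = rrf_sketch(a, b, num_results=2)
print(fused)  # fused order for q1: d2, d1, d4, d3
```

In this example d2 comes first because it is ranked highly in both lists, while d3 and d4 each receive a contribution from only one list.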

Consider using the rr_fusion() function if you want to apply fusion outside of a pipeline.

Citation

Cormack et al. Reciprocal rank fusion outperforms condorcet and individual rank learning methods. SIGIR 2009.

@inproceedings{DBLP:conf/sigir/CormackCB09,
  author       = {Gordon V. Cormack and
                  Charles L. A. Clarke and
                  Stefan B{\"{u}}ttcher},
  editor       = {James Allan and
                  Javed A. Aslam and
                  Mark Sanderson and
                  ChengXiang Zhai and
                  Justin Zobel},
  title        = {Reciprocal rank fusion outperforms condorcet and individual rank learning
                  methods},
  booktitle    = {Proceedings of the 32nd Annual International {ACM} {SIGIR} Conference
                  on Research and Development in Information Retrieval, {SIGIR} 2009,
                  Boston, MA, USA, July 19-23, 2009},
  pages        = {758--759},
  publisher    = {{ACM}},
  year         = {2009},
  url          = {https://doi.org/10.1145/1571941.1572114},
  doi          = {10.1145/1571941.1572114},
  timestamp    = {Wed, 14 Nov 2018 10:58:10 +0100},
  biburl       = {https://dblp.org/rec/conf/sigir/CormackCB09.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

Initializes the transformer.

Parameters:

transformers (Transformer) – The transformers to merge.
k (int) – The constant used in the reciprocal rank computation.
num_results (int | None) – The number of results to keep for each query. If None, all results are kept.
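To give a feel for the role of k, the short sketch below compares how much more a top-ranked document contributes than one ten positions down, for a small and a large k. The helper `rr_contribution` is hypothetical and only restates the 1/(rank + k) formula; ranks are assumed to start at 0.

```python
def rr_contribution(rank, k=60):
    """Score that a single ranking contributes for a document at the given rank."""
    return 1.0 / (rank + k)

# Ratio between the contribution at rank 0 and the contribution at rank 9
ratios = {k: rr_contribution(0, k) / rr_contribution(9, k) for k in (1, 60)}
print(ratios)
```

With k=1 the top document contributes ten times as much as the document at rank 9, whereas with k=60 the ratio is only about 1.15. A larger k therefore flattens the differences between ranks, so appearing in several rankings matters more than the exact position in any one of them.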

transform(inp)

Performs the reciprocal rank fusion on the input data.

Return type: DataFrame
Parameters: inp (DataFrame)

pyterrier_alpha.fusion.rr_fusion(*results, k=60, num_results=1000)

Reciprocal Rank Fusion between multiple ranking result lists.

Return type:

DataFrame

Parameters:

results (DataFrame) – Multiple result frames to merge. At least one frame is required.
k (int) – The constant used in the reciprocal rank computation.
num_results (int | None) – The number of results to keep for each query. If None, all results are kept.

Consider using RRFusion if you want to use this directly in a pipeline.
