.. _pyterrier.ltr: Learning to Rank ---------------- Introduction ============ PyTerrier makes it easy to formulate learning to rank pipelines. Conceptually, learning to rank consists of three phases: 1. identifying a candidate set of documents for each query 2. computing extra features on these documents 3. using a learned model to re-rank the candidate documents to obtain a more effective ranking PyTerrier allows each of these phases to be expressed as transformers, and for them to be composed into a full pipeline. In particular, conventional retrieval transformers (such as `pt.terrier.Retriever`) can be used for the first phase. To permit the second phase, PyTerrier data model allows for a `"features"` column to be associated to each retrieved document. Such features can be generated using specialised transformers, or by combining other re-ranking transformers using the `**` feature-union operator; Lastly, to facilitate the final phase, we provide easy ways to integrate PyTerrier pipelines with standard learning libraries such as `sklearn `_, `XGBoost `_ and `LightGBM `_. In the following, we focus on the second and third phases, as well as describe ways to assist in conducting learning to rank experiments. Calculating Features ==================== Feature Union (`**`) ~~~~~~~~~~~~~~~~~~~~ PyTerrier's main way to faciliate calculating and intgrating extra features is through the `**` operator. Consider an example where the candidate set should be identified using the BM25 weighting model, and then additional features computed using the Tf and PL2 models: .. schematic:: :show_code: index = pt.terrier.TerrierIndex.example() # FOLD bm25 = index.retriever("BM25") tf = index.retriever("Tf") pl2 = index.retriever("PL2") pipeline = bm25 >> (tf ** pl2) The output of the bm25 ranker would look like: ==== ========== ======== ============ .. qid docno score ==== ========== ======== ============ 1 q1 d5 (bm25 score) ==== ========== ======== ============ Application of the feature-union operator (`**`) ensures that `tf` and `pl2` operate as *re-rankers*, i.e. they are applied only on the documents retrieved by `bm25`. For each document, the score calculate by `tf` and `pl2` are combined into the `"features"` column, as follows: ==== ========== ======== ============ ========================= .. qid docno score features ==== ========== ======== ============ ========================= 1 q1 d5 (bm25 score) [tf score, pl2 score] ==== ========== ======== ============ ========================= Including Features during Retrieval ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. related:: pyterrier.terrier.FeaturesRetriever When executing the pipeline above, the re-ranking of the documents again can be slow, as each separate Retriever object has to re-access the inverted index. For this reason, the Terrier engine provides a class called :class:`~pyterrier.terrier.FeaturesRetriever`, which allows multiple query dependent features to be calculated at once, by virtue of Terrier's ``Fat`` framework. Therefore, these two pipelines are equivalent: .. code-block:: python :caption: Example of FeaturesRetriever pipeline1 = bm25 >> (tf ** pl2) # :footnote: ``pipeline1`` uses separate retrievers to compute each feature. pipeline2 = pt.terrier.FeaturesRetriever(index, wmodel="BM25", features=["WMODEL:Tf", "WMODEL:PL2"]) # :footnote: ``pipeline2`` uses a single retriever that computes all features at once. .. schematic:: index = pt.terrier.TerrierIndex.example() pt.terrier.FeaturesRetriever(index.index_obj(), wmodel="BM25", features=["WMODEL:Tf", "WMODEL:PL2"]) Apply Functions for Custom Features ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ If you have a way to calculate one or multiple ranking features at once, you can use pt.apply functions to create your feature sets. See :ref:`pyterrier.apply` for more examples. In particular, use ``pt.apply.doc_score()`` for calculating a single feature based on a function. Transformers created by pt.apply can be combined using the `**` operator. For instance, consider you have two functions that each return one score that are to be used as features. We can instantiate these functions as Transformers using ``pt.apply.doc_score()``. Such custom features can both be combined into a LTR pipeline using the ``**`` operator: .. schematic:: :show_code: index = pt.terrier.TerrierIndex.example() bm25 = index.bm25() # FOLD featureA = pt.apply.doc_score(lambda row: 5) featureB = pt.apply.doc_score(lambda row: 2) pipeline = bm25 >> (featureA ** featureB) The output of ``pipeline`` would be as follows: ==== ========== ======== ============ ========================= .. qid docno score features ==== ========== ======== ============ ========================= 1 q1 d5 (bm25 score) [5, 2] ==== ========== ======== ============ ========================= Of course, our example lambda functions return static scores for each document rather than computing meaningful features, for instance making a lookup based on ``row["docid"]`` or other attributes of each ``row``. If we want to calculate *more than one* feature at once, then we can go faster by using ``pt.apply.doc_features()``:: two_features = pt.apply.doc_features(lambda row: np.array([0,1])) # use doc_features ONLY when calculating multiple features one_feature = pt.apply.doc_score(lambda row: 5) # use doc_score when calculating a single feature pipeline3f = bm25 >> (two_features ** one_feature) .. schematic:: import numpy as np index = pt.terrier.TerrierIndex.example() bm25 = index.bm25() two_features = pt.apply.doc_features(lambda row: [0, 1]) # np.array doesn't work in the lambda for some reason? one_feature = pt.apply.doc_score(lambda row: 5) pipeline3f = bm25 >> (two_features ** one_feature) pipeline3f The output of ``pipeline3f`` would be as follows: ==== ========== ======== ============ ========================= .. qid docno score features ==== ========== ======== ============ ========================= 1 q1 d5 (bm25 score) [0, 1, 5] ==== ========== ======== ============ ========================= Learning ======== .. autofunction:: pyterrier.ltr.apply_learned_model() The resulting transformer implements Estimator, in other words it has a `fit()` method, that can be trained using training topics and qrels, as well as (optionally) validation topics and qrels. See also :ref:`pt.transformer.estimator`. At inference time, the Estimator can be applied to new topics, and it will use the learned model to re-rank the candidate documents based on the features calculated in the previous phase. The resulting pipeline is shown below: .. schematic:: index = pt.terrier.TerrierIndex.example() pipeline2f = pt.terrier.FeaturesRetriever(index.index_obj(), wmodel="BM25", features=["WMODEL:Tf", "WMODEL:PL2"]) from sklearn.ensemble import RandomForestRegressor rf = RandomForestRegressor(n_estimators=400) rf_pipe = pipeline2f >> pt.ltr.apply_learned_model(rf) rf_pipe A number of learning algorithms are supported, namely from scikit-learn, XGBoost, LightGBM and FastRank - see below for details. scikit-learn ~~~~~~~~~~~~ A sklearn regressor can be passed directly to `pt.ltr.apply_learned_model()`:: from sklearn.ensemble import RandomForestRegressor rf = RandomForestRegressor(n_estimators=400) rf_pipe = pipeline >> pt.ltr.apply_learned_model(rf) rf_pipe.fit(train_topics, qrels) pt.Experiment([bm25, rf_pipe], test_topics, qrels, ["map"], names=["BM25 Baseline", "LTR"]) Note that if the feature definitions in the pipeline change, you will need to create a new instance of `rf`. For analysis purposes, the feature importances identified by RandomForestRegressor can be accessed through `rf.feature_importances_` - see the `relevant sklearn documentation `_ for more information. Gradient Boosted Trees & LambdaMART ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Both `XGBoost `_ and `LightGBM `_ provide gradient boosted regression tree and LambdaMART implementations. These support a sklearn-like interface that is supported by PyTerrier by supplying `form="ltr"` kwarg to `pt.ltr.apply_learned_model()`:: import xgboost as xgb # this configures XGBoost as LambdaMART lmart_x = xgb.sklearn.XGBRanker(objective='rank:ndcg', learning_rate=0.1, gamma=1.0, min_child_weight=0.1, max_depth=6, verbose=2, random_state=42) lmart_x_pipe = pipeline >> pt.ltr.apply_learned_model(lmart_x, form="ltr") lmart_x_pipe.fit(train_topics, train_qrels, validation_topics, validation_qrels) import lightgbm as lgb # this configures LightGBM as LambdaMART lmart_l = lgb.LGBMRanker(task="train", min_data_in_leaf=1, min_sum_hessian_in_leaf=100, max_bin=255, num_leaves=7, objective="lambdarank", metric="ndcg", ndcg_eval_at=[1, 3, 5, 10], learning_rate= .1, importance_type="gain", num_iterations=10) lmart_l_pipe = pipeline >> pt.ltr.apply_learned_model(lmart_l, form="ltr") lmart_l_pipe.fit(train_topics, train_qrels, validation_topics, validation_qrels) pt.Experiment( [bm25, lmart_x_pipe, lmart_l_pipe], test_topics, test_qrels, ["map"], names=["BM25 Baseline", "LambdaMART (xgBoost)", "LambdaMART (LightGBM)" ] ) Note that if the feature definitions in the pipeline change, you will need to create a new instance of XGBRanker (or LGBMRanker, as appropriate) and the ``pt.ltr.apply_learned_model()`` transformer. If you attempt to reuse XGBRanker/LGBMRanker within different pipelines, the ``pt.ltr.apply_learned_model()`` transformer will try to warn you about this by raising a ``ValueError`` with `Expected X number of features, but found Y features`. In our experience, LightGBM *tends* to be more effective than xgBoost. Similar to sklearn, both XGBoost and LightGBM provide feature importances via `lmart_x.feature_importances_` and `lmart_l.feature_importances_`. FastRank: Coordinate Ascent ~~~~~~~~~~~~~~~~~~~~~~~~~~~ We now support `FastRank `_ for learning models:: !pip install fastrank import fastrank train_request = fastrank.TrainRequest.coordinate_ascent() params = train_request.params params.init_random = True params.normalize = True params.seed = 1234567 ca_pipe = pipeline >> pt.ltr.apply_learned_model(train_request, form="fastrank") ca_pipe.fit(train_topics, train_qrels) FastRank provides two learners: a random forest implementation (`fastrank.TrainRequest.random_forest()`) and coordinate ascent (`fastrank.TrainRequest.coordinate_ascent()`), a linear model. Working with Features ===================== We provide additional transformations functions to aid the analysis of learned model, for instance, removing (ablating) features from a complex ranking pipeline. .. autofunction:: pyterrier.ltr.ablate_features() Example:: # assume pipeline is a retrieval pipeline that produces four ranking features numf=4 rankers = [] names = [] # learn a model for all four features full = pipeline >> pt.ltr.apply_learned_model(RandomForestRegressor(n_estimators=400)) full.fit(trainTopics, trainQrels, validTopics, validQrels) rankers.append(full) # learn a model for 3 features, removing one each time for fid in range(numf): ablated = pipeline >> pt.ltr.ablate_features(fid) >> pt.ltr.apply_learned_model(RandomForestRegressor(n_estimators=400)) ablated.fit(trainTopics, trainQrels, validTopics, validQrels) rankers.append(ablated) # evaluate the full (4 features) model, as well as the each model containing only 3 features) pt.Experiment( rankers, test_topics, test_qrels, ["map"], names=["Full Model"] + ["Full Minus %d" % fid for fid in range(numf) ) .. autofunction:: pyterrier.ltr.keep_features() .. autofunction:: pyterrier.ltr.feature_to_score() .. autofunction:: pyterrier.ltr.score_to_feature()