Learning to Rank¶
Introduction¶
PyTerrier makes it easy to formulate learning to rank pipelines. Conceptually, learning to rank consists of three phases:
identifying a candidate set of documents for each query
computing extra features on these documents
using a learned model to re-rank the candidate documents to obtain a more effective ranking
PyTerrier allows each of these phases to be expressed as transformers, and for them to be composed into a full pipeline.
In particular, conventional retrieval transformers (such as pt.terrier.Retriever) can be used for the first phase. To support the second phase, PyTerrier's data model allows a “features” column to be associated with each retrieved document. Such features can be generated using specialised transformers, or by combining other re-ranking transformers using the ** feature-union operator. Lastly, to facilitate the final phase, we provide easy ways to integrate PyTerrier pipelines with standard learning libraries such as sklearn, XGBoost and LightGBM.
In the following, we focus on the second and third phases, as well as describe ways to assist in conducting learning to rank experiments.
Calculating Features¶
Feature Union (**)¶
PyTerrier’s main way to facilitate calculating and integrating extra features is through the ** operator. Consider an example where the candidate set should be identified using the BM25 weighting model, and then additional features computed using the Tf and PL2 models:
bm25 = pt.terrier.Retriever(index, wmodel="BM25")
tf = pt.terrier.Retriever(index, wmodel="Tf")
pl2 = pt.terrier.Retriever(index, wmodel="PL2")
pipeline = bm25 >> (tf ** pl2)
The output of the bm25 ranker would look like:
|   | qid | docno | score |
|---|-----|-------|-------|
| 1 | q1  | d5    | (bm25 score) |
Application of the feature-union operator (**) ensures that tf and pl2 operate as re-rankers, i.e. they are applied only on the documents retrieved by bm25. For each document, the scores calculated by tf and pl2 are combined into the “features” column, as follows:
|   | qid | docno | score | features |
|---|-----|-------|-------|----------|
| 1 | q1  | d5    | (bm25 score) | [tf score, pl2 score] |
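Such a pipeline can be executed directly. Below is a minimal sketch, assuming the index and pipeline defined above and using an arbitrary illustrative query:

# a minimal sketch: run the feature pipeline for a single query and inspect the output
res = pipeline.search("information retrieval")
print(res.columns)           # includes qid, docno, score and features
print(res.iloc[0].features)  # a numpy array with one value per feature, here [tf score, pl2 score]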
FeaturesRetriever¶
When executing the pipeline above, re-ranking the documents can be slow, as each separate Retriever object has to re-access the inverted index. For this reason, PyTerrier provides a class called FeaturesRetriever, which allows multiple query-dependent features to be calculated at once, by virtue of Terrier’s Fat framework.
- class pyterrier.terrier.FeaturesRetriever(*args, **kwargs)[source]¶
Use this class for retrieval with multiple features
Init method
- Parameters:
index_location – An index-like object - An Index, an IndexRef, or a String that can be resolved to an IndexRef
features (list) – List of features to use
controls (dict) – A dictionary with the control names and values
properties (dict) – A dictionary with the property keys and values
verbose (bool) – If True, the transform method will display progress
num_results (int) – Number of results to retrieve.
An equivalent pipeline to the example above would be:
# equivalent to: pipeline = bm25 >> (tf ** pl2)
pipeline = pt.terrier.FeaturesRetriever(index, wmodel="BM25", features=["WMODEL:Tf", "WMODEL:PL2"])
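As with any retrieval transformer, the resulting FeaturesRetriever can also be applied to a DataFrame of topics. A minimal sketch, assuming topics is a DataFrame with 'qid' and 'query' columns:

# apply FeaturesRetriever to a set of topics; each result row carries a 'features' array
fbr = pt.terrier.FeaturesRetriever(index, wmodel="BM25", features=["WMODEL:Tf", "WMODEL:PL2"])
res = fbr.transform(topics)
print(res.iloc[0].features)  # two values, one per WMODEL feature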
Apply Functions for Custom Features¶
If you have a way to calculate one or multiple ranking features at once, you can use pt.apply functions to create your feature sets. See pyterrier.apply - Custom Transformers for more examples. In particular, use pt.apply.doc_score for calculating a single feature based on a function. Transformers created by pt.apply can be combined using the ** operator.
For instance, suppose you have two functions that each return one score to be used as a feature. Each function can be instantiated as a Transformer using a pt.apply.doc_score() invocation, and the resulting custom features can then be combined into a LTR pipeline using the ** operator:
featureA = pt.apply.doc_score(lambda row: 5)
featureB = pt.apply.doc_score(lambda row: 2)
pipeline2f = bm25 >> (featureA ** featureB)
The output of pipeline2f would be as follows:
|   | qid | docno | score | features |
|---|-----|-------|-------|----------|
| 1 | q1  | d5    | (bm25 score) | [5, 2] |
Of course, our example lambda functions return static scores for each document rather than computing meaningful features; in practice, a feature function would, for instance, make a lookup based on row["docid"] or other attributes of each row.
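For illustration, here is a sketch of such a lookup-based feature, assuming a hypothetical doc_prior dictionary (not part of PyTerrier) that maps each docno to a pre-computed value:

# doc_prior is a hypothetical, pre-computed mapping from docno to a score
doc_prior = {"d5": 0.7}  # ... populated elsewhere
prior_feature = pt.apply.doc_score(lambda row: doc_prior.get(row["docno"], 0.0))
pipeline_prior = bm25 >> (tf ** prior_feature)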
If we want to calculate several features at once, we can do so more efficiently using pt.apply.doc_features:
import numpy as np

two_features = pt.apply.doc_features(lambda row: np.array([0, 1]))  # use doc_features when calculating multiple features
one_feature = pt.apply.doc_score(lambda row: 5)                     # use doc_score when calculating a single feature
pipeline3f = bm25 >> (two_features ** one_feature)
The output of pipeline3f would be as follows:
|   | qid | docno | score | features |
|---|-----|-------|-------|----------|
| 1 | q1  | d5    | (bm25 score) | [0, 1, 5] |
Learning¶
- pyterrier.ltr.apply_learned_model(learner, form='regression', **kwargs)[source]¶
Results in a transformer that takes in documents that have a “features” column and, in its transform() method, passes those features to the specified learner to obtain the documents’ “score” column. Learners should follow sklearn’s general pattern, with a fit() method (c.f. an sklearn Estimator) and a predict() method.
XGBoost and LightGBM are also supported through the use of the form='ltr' kwarg.
- Return type: Transformer
- Parameters:
learner – an sklearn-compatible estimator
form (str) – either ‘regression’, ‘ltr’ or ‘fastrank’
The resulting transformer implements Estimator, in other words it has a fit() method, that can be trained using training topics and qrels, as well as (optionally) validation topics and qrels. See also Estimator.
SKLearn¶
A sklearn regressor can be passed directly to pt.ltr.apply_learned_model():
from sklearn.ensemble import RandomForestRegressor
rf = RandomForestRegressor(n_estimators=400)
rf_pipe = pipeline >> pt.ltr.apply_learned_model(rf)
rf_pipe.fit(train_topics, qrels)
pt.Experiment([bm25, rf_pipe], test_topics, qrels, ["map"], names=["BM25 Baseline", "LTR"])
Note that if the feature definitions in the pipeline change, you will need to create a new instance of rf.
For analysis purposes, the feature importances identified by RandomForestRegressor can be accessed through rf.feature_importances_ - see the relevant sklearn documentation for more information.
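For example, a minimal sketch that pairs each importance with a feature name; the names here are our own labels for the Tf and PL2 features of the pipeline above, not something sklearn provides:

# feature order matches the order of the ** feature union, e.g. [Tf, PL2] above
for name, importance in zip(["Tf", "PL2"], rf.feature_importances_):
    print(name, importance)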
Gradient Boosted Trees & LambdaMART¶
Both XGBoost and LightGBM provide gradient boosted regression tree and LambdaMART implementations. These support an sklearn-like interface, which PyTerrier supports by supplying the form="ltr" kwarg to pt.ltr.apply_learned_model():
import xgboost as xgb
# this configures XGBoost as LambdaMART
lmart_x = xgb.sklearn.XGBRanker(objective='rank:ndcg',
                                learning_rate=0.1,
                                gamma=1.0,
                                min_child_weight=0.1,
                                max_depth=6,
                                verbose=2,
                                random_state=42)
lmart_x_pipe = pipeline >> pt.ltr.apply_learned_model(lmart_x, form="ltr")
lmart_x_pipe.fit(train_topics, train_qrels, validation_topics, validation_qrels)
import lightgbm as lgb
# this configures LightGBM as LambdaMART
lmart_l = lgb.LGBMRanker(task="train",
                         min_data_in_leaf=1,
                         min_sum_hessian_in_leaf=100,
                         max_bin=255,
                         num_leaves=7,
                         objective="lambdarank",
                         metric="ndcg",
                         ndcg_eval_at=[1, 3, 5, 10],
                         learning_rate=.1,
                         importance_type="gain",
                         num_iterations=10)
lmart_l_pipe = pipeline >> pt.ltr.apply_learned_model(lmart_l, form="ltr")
lmart_l_pipe.fit(train_topics, train_qrels, validation_topics, validation_qrels)
pt.Experiment(
[bm25, lmart_x_pipe, lmart_l_pipe],
test_topics,
test_qrels,
["map"],
names=["BM25 Baseline", "LambdaMART (xgBoost)", "LambdaMART (LightGBM)" ]
)
Note that if the feature definitions in the pipeline change, you will need to create a new instance of XGBRanker (or LGBMRanker, as appropriate) and of the pt.ltr.apply_learned_model() transformer. If you attempt to reuse an XGBRanker/LGBMRanker within different pipelines, the pt.ltr.apply_learned_model() transformer will try to warn you about this by raising a ValueError such as “Expected X number of features, but found Y features.”
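For example, a minimal sketch of the recommended pattern, reusing the pipeline and pipeline3f definitions from above, which produce different numbers of features:

# create a fresh learner instance for each distinct feature pipeline
lmart_a = xgb.sklearn.XGBRanker(objective='rank:ndcg', random_state=42)
pipe_a = pipeline >> pt.ltr.apply_learned_model(lmart_a, form="ltr")

lmart_b = xgb.sklearn.XGBRanker(objective='rank:ndcg', random_state=42)
pipe_b = pipeline3f >> pt.ltr.apply_learned_model(lmart_b, form="ltr")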
In our experience, LightGBM tends to be more effective than XGBoost.
Similar to sklearn, both XGBoost and LightGBM provide feature importances via lmart_x.feature_importances_ and lmart_l.feature_importances_.
FastRank: Coordinate Ascent¶
We now support FastRank for learning models:
!pip install fastrank
import fastrank
train_request = fastrank.TrainRequest.coordinate_ascent()
params = train_request.params
params.init_random = True
params.normalize = True
params.seed = 1234567
ca_pipe = pipeline >> pt.ltr.apply_learned_model(train_request, form="fastrank")
ca_pipe.fit(train_topics, train_qrels)
FastRank provides two learners: coordinate ascent (fastrank.TrainRequest.coordinate_ascent()), a linear model, and a random forest implementation (fastrank.TrainRequest.random_forest()).
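The random forest learner follows the same pattern; a minimal sketch:

rf_request = fastrank.TrainRequest.random_forest()
rf_pipe = pipeline >> pt.ltr.apply_learned_model(rf_request, form="fastrank")
rf_pipe.fit(train_topics, train_qrels)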
Working with Features¶
We provide additional transformation functions to aid the analysis of learned models, for instance by removing (ablating) features from a complex ranking pipeline.
- pyterrier.ltr.ablate_features(fids)[source]¶
Ablates features (sets feature value to 0) from a pipeline. This is useful for performing feature ablation studies, whereby a feature is removed from the pipeline before learning.
- Return type: Transformer
- Parameters:
fids – one or a list of integers corresponding to the feature indices to be removed
Example:
# assume pipeline is a retrieval pipeline that produces four ranking features
numf = 4
rankers = []
names = []
# learn a model for all four features
full = pipeline >> pt.ltr.apply_learned_model(RandomForestRegressor(n_estimators=400))
full.fit(trainTopics, trainQrels, validTopics, validQrels)
rankers.append(full)
# learn a model for 3 features, removing one each time
for fid in range(numf):
ablated = pipeline >> pt.ltr.ablate_features(fid) >> pt.ltr.apply_learned_model(RandomForestRegressor(n_estimators=400))
ablated.fit(trainTopics, trainQrels, validTopics, validQrels)
rankers.append(ablated)
# evaluate the full (4 features) model, as well as each model containing only 3 features
pt.Experiment(
rankers,
test_topics,
test_qrels,
["map"],
names=["Full Model"] + ["Full Minus %d" % fid for fid in range(numf)
)
- pyterrier.ltr.keep_features(fids)[source]¶
Reduces the features in a pipeline to only those mentioned. This is useful for performing feature ablation studies, whereby only some features are kept (and others removed) from a pipeline before learning occurs.
- Return type: Transformer
- Parameters:
fids – one or a list of integers corresponding to the feature indices to be kept
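Example (a sketch mirroring the ablation example above, assuming pipeline again produces four ranking features):

# keep only the first and third of the four features before learning
kept = pipeline >> pt.ltr.keep_features([0, 2]) >> pt.ltr.apply_learned_model(RandomForestRegressor(n_estimators=400))
kept.fit(trainTopics, trainQrels, validTopics, validQrels)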
- pyterrier.ltr.feature_to_score(fid)[source]¶
Applies a specified feature for ranking. Useful for evaluating which of a number of pre-computed features are useful for ranking.
- Return type: Transformer
- Parameters:
fid – a single feature id that should be kept
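Example (a sketch, again assuming a pipeline producing four features, that evaluates ranking by each feature on its own):

# rank by each pre-computed feature individually and compare their effectiveness
single_feature_rankers = [pipeline >> pt.ltr.feature_to_score(fid) for fid in range(4)]
pt.Experiment(
    single_feature_rankers,
    test_topics,
    test_qrels,
    ["map"],
    names=["Feature %d" % fid for fid in range(4)]
)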
- pyterrier.ltr.score_to_feature()[source]¶
Takes the document’s “score” from the score attribute and uses it as a single feature. In particular, a feature-union operator does not by itself use the scores of the documents in the candidate set as a ranking feature. Using the resulting transformer within a feature union means that an additional ranking feature is added to the “features” column.
Example:
cands = pt.terrier.Retriever(index, wmodel="BM25")
bm25f = pt.terrier.Retriever(index, wmodel="BM25F")
pl2f = pt.terrier.Retriever(index, wmodel="PL2F")
two_features = cands >> (bm25f ** pl2f)
three_features = cands >> (bm25f ** pl2f ** pt.ltr.score_to_feature())
- Return type: Transformer