Writing Custom Transformers
=====================================

.. note::

    This page is a work in progress.

Pipeline Optimization
-------------------------------------

Pipelines can be optimized using :meth:`pyterrier.Transformer.compile`. You can implement your own
optimizations by overriding this method. For instance, a pseudo-relevance feedback method that only uses
the top ``fb_docs`` documents per query can rewrite itself with a preceding :class:`~pyterrier.RankCutoff`
transformer, as follows:

.. code-block:: python
    :caption: Optimizing a pseudo-relevance feedback transformer by implementing ``compile()``.

    import pyterrier as pt

    class MyPrf(pt.Transformer):
        ...
        def compile(self) -> pt.Transformer:
            # Only the top fb_docs documents are used, so cut the ranking off before this step.
            return pt.RankCutoff(self.fb_docs) >> self

Why is this helpful? :class:`~pyterrier.RankCutoff` knows it can combine ("fuse") itself with any preceding
transformer that can reduce its computation when it knows how many documents the subsequent step requires.
For instance, most retrievers can reduce their computational cost by retrieving fewer documents per query
(a smaller top ``k``).

This functionality is facilitated through the :class:`~pyterrier.transformer.SupportsFuseRankCutoff` protocol,
which defines the :meth:`~pyterrier.transformer.SupportsFuseRankCutoff.fuse_rank_cutoff` method. You can choose
to implement this method if your transformer can benefit from being combined with a
:class:`~pyterrier.RankCutoff` transformer.

.. code-block:: python
    :caption: Implementing ``fuse_rank_cutoff`` to allow combining with ``RankCutoff``.

    from typing import Optional
    import pyterrier as pt

    class MyRetriever(pt.Transformer):
        ...
        def fuse_rank_cutoff(self, k: int) -> Optional[pt.Transformer]:
            if self.num_results > k:  # only fuse when the cutoff reduces the work done
                return pt.inspect.transformer_apply_attributes(self, num_results=k)

.. hint::

    :meth:`~pyterrier.inspect.transformer_apply_attributes` lets you easily construct a new transformer
    with some attributes replaced (here, ``num_results``). This can be especially handy when your transformer
    has a lot of attributes.

.. caution::

    The result of fusion methods should be *functionally equivalent* to the original transformer.
    If the ``if self.num_results > k:`` condition above were omitted, the fused transformer would behave
    differently when ``num_results`` is less than ``k``.

The :mod:`pyterrier.inspect` module allows users to gather information about live transformer objects, for
instance input/output specifications. This can be useful for things like pipeline validation or drawing
schematic diagrams of pipelines.

Default implementations for these methods usually work well, but sometimes you may need to override them to
handle idiosyncratic cases. You can override the behavior of the following methods by implementing Python
Protocols (in these cases, that just means adding a method with a specific signature that implements the same
functionality).

+--------------------------------------------------------+-------------------------------------------------------------------+
| Override...                                            | By implementing...                                                |
+========================================================+===================================================================+
| :meth:`pyterrier.inspect.transformer_inputs`           | :class:`~pyterrier.inspect.HasTransformInputs.transform_inputs`   |
+--------------------------------------------------------+-------------------------------------------------------------------+
| :meth:`pyterrier.inspect.transformer_outputs`          | :class:`~pyterrier.inspect.HasTransformOutputs.transform_outputs` |
+--------------------------------------------------------+-------------------------------------------------------------------+
| :meth:`pyterrier.inspect.transformer_attributes`       | :class:`~pyterrier.inspect.HasAttributes.attributes`              |
+--------------------------------------------------------+-------------------------------------------------------------------+
| :meth:`pyterrier.inspect.transformer_apply_attributes` | :class:`~pyterrier.inspect.HasApplyAttributes.apply_attributes`   |
+--------------------------------------------------------+-------------------------------------------------------------------+
| :meth:`pyterrier.inspect.subtransformers`              | :class:`~pyterrier.inspect.HasSubtransformers.subtransformers`    |
+--------------------------------------------------------+-------------------------------------------------------------------+
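
Returning to rank-cutoff fusion: the requirement that a fused transformer be functionally equivalent to the
original can be illustrated with a small, self-contained sketch. It deliberately does not use PyTerrier.
``ToyRetriever``, ``retrieve``, ``unguarded_fuse`` and ``run_with_cutoff`` are hypothetical stand-ins invented
for this illustration, a "ranking" is just a Python list, and returning ``None`` is assumed to mean "do not fuse".

.. code-block:: python
    :caption: A toy model (not PyTerrier code) of why the ``num_results > k`` guard matters.

    from typing import List, Optional


    class ToyRetriever:
        """Hypothetical stand-in for a retriever (not a PyTerrier class)."""

        def __init__(self, num_results: int = 1000) -> None:
            self.num_results = num_results

        def retrieve(self, ranked_docs: List[str]) -> List[str]:
            # Return at most num_results documents from an already-ranked list.
            return ranked_docs[: self.num_results]

        def fuse_rank_cutoff(self, k: int) -> Optional["ToyRetriever"]:
            # Guarded fusion: only fuse when the cutoff is tighter than num_results.
            if self.num_results > k:
                return ToyRetriever(num_results=k)
            return None  # assumed to mean "no fusion"; the pipeline stays as it was

        def unguarded_fuse(self, k: int) -> "ToyRetriever":
            # Deliberately wrong variant with the guard removed.
            return ToyRetriever(num_results=k)


    def run_with_cutoff(retriever: ToyRetriever, k: int, docs: List[str]) -> List[str]:
        # Unfused reference behaviour: retrieve, then keep the top k.
        return retriever.retrieve(docs)[:k]


    docs = [f"d{i}" for i in range(10)]

    # Case 1: num_results (100) > k (5). Fusing is safe and gives the same output.
    large = ToyRetriever(num_results=100)
    fused = large.fuse_rank_cutoff(5)
    same_output = fused.retrieve(docs) == run_with_cutoff(large, 5, docs)

    # Case 2: num_results (3) < k (5). The guard declines to fuse ...
    small = ToyRetriever(num_results=3)
    guarded = small.fuse_rank_cutoff(5)
    # ... whereas the unguarded variant would return more documents than before.
    reference_len = len(run_with_cutoff(small, 5, docs))
    unguarded_len = len(small.unguarded_fuse(5).retrieve(docs))

    print(same_output, guarded, reference_len, unguarded_len)  # True None 3 5

In the second case the unfused pipeline yields three documents per query, while the unguarded fusion would
yield five, so the two are no longer equivalent. The guard avoids this by declining to fuse.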