pt.inspect - Inspecting Live Objects
This module provides useful utility methods for getting information about PyTerrier objects.
Note
This is an advanced module that is not typically used by end users.
- pyterrier.inspect.artifact_type_format(artifact, *, strict=True)
Returns the type and format of the specified artifact.
These values are sourced either from the ARTIFACT_TYPE and ARTIFACT_FORMAT constants of the artifact or (if these are not available) by matching on the entry points.
- Return type:
Tuple[str, str] | None
- Parameters:
artifact (Type | Artifact) – The artifact to inspect.
strict (bool) – If True, raises an error if the artifact’s type or format could not be determined.
- Returns:
A tuple containing the artifact’s type and format, or None if the type and format could not be determined and strict==False.
- Raises:
InspectError – If the artifact’s type or format could not be determined and strict==True.
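The constant-based lookup and the effect of strict can be sketched in plain Python. This is a minimal illustration and not PyTerrier's implementation: sketch_type_format and DemoIndex are hypothetical names, a built-in LookupError stands in for InspectError, and the entry-point fallback is omitted.

```python
from typing import Any, Optional, Tuple


class DemoIndex:
    """Hypothetical artifact class that declares its type and format as constants."""
    ARTIFACT_TYPE = 'sparse_index'
    ARTIFACT_FORMAT = 'demo'


def sketch_type_format(artifact: Any, *, strict: bool = True) -> Optional[Tuple[str, str]]:
    """Read the two constants from a class or an instance (simplified stand-in)."""
    artifact_type = getattr(artifact, 'ARTIFACT_TYPE', None)
    artifact_format = getattr(artifact, 'ARTIFACT_FORMAT', None)
    if artifact_type is not None and artifact_format is not None:
        return artifact_type, artifact_format
    if strict:
        # The documented function raises InspectError; LookupError stands in here.
        raise LookupError(f'could not determine type/format of {artifact!r}')
    return None
```

Because the constants live on the class, the sketch accepts either the class or an instance, mirroring the documented Type | Artifact parameter.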
- pyterrier.inspect.transformer_inputs(transformer, *, strict=True)
Infers supported input column configurations (a List[List[str]]) for a transformer.
The method tries to infer the input columns that the transformer accepts by calling it with an empty DataFrame and inspecting the resulting pt.validate.InputValidationError. If the transformer does not raise an error, it tries to infer the input columns by calling it with a pre-defined set of input columns.
To handle edge cases, you can implement the HasTransformInputs protocol, which allows you to define a custom transform_inputs method that returns a list of input column configurations accepted by the transformer. transform_inputs can also be an attribute instead of a method. In this case, it can be a list of lists of input columns (i.e., a list of valid input column configurations). Note that transform_inputs is allowed to return a List[str]. If this is the case, it is converted to a List[List[str]] automatically.
The list of input specifications is assumed to be prioritized. For instance, schematics will show the first valid specification when multiple are valid for the pipeline.
- Return type:
List[List[str]] | None
- Parameters:
transformer (Transformer) – An instance of the transformer to inspect.
strict (bool) – If True, raises an error if the transformer’s input columns cannot be inferred. If False, returns None in this case.
- Returns:
A list of input column configurations (List[List[str]]) accepted by this transformer.
- Raises:
InspectError – If the transformer cannot be inspected and strict==True.
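The handling of transform_inputs as either a method or an attribute, and the wrapping of a single List[str] into a List[List[str]], might look roughly like the following. This is a sketch under assumptions: normalize_transform_inputs, AttributeStyle and MethodStyle are hypothetical names, and the probing with an empty DataFrame is left out.

```python
from typing import Any, List, Optional


def normalize_transform_inputs(transformer: Any) -> Optional[List[List[str]]]:
    """Hypothetical helper: read transform_inputs and return a list of configurations."""
    spec = getattr(transformer, 'transform_inputs', None)
    if callable(spec):
        spec = spec()  # method form
    if spec is None:
        return None  # nothing declared; the documented function would fall back to probing
    if all(isinstance(item, str) for item in spec):
        return [list(spec)]  # a single configuration, wrapped into a list of one
    return [list(config) for config in spec]


class AttributeStyle:
    # Attribute form: the columns do not depend on the instance.
    transform_inputs = ['qid', 'query']


class MethodStyle:
    def transform_inputs(self) -> List[List[str]]:
        # Ordered by priority: the first configuration is the preferred one.
        return [['qid', 'query', 'docno'], ['qid', 'query']]
```

Returning the configurations in priority order matters because, as noted above, consumers such as schematics pick the first valid one.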
- pyterrier.inspect.transformer_outputs(transformer, input_columns, *, strict=True)
Infers the output columns for a transformer based on the provided input columns.
If the transformer implements the HasTransformOutputs protocol, the method calls its transform_outputs method to determine the output columns. If the transformer does not implement the protocol, it attempts to infer the output columns by calling the transformer with an empty DataFrame.
- Return type:
List[str] | None
- Parameters:
transformer (Transformer) – An instance of the transformer to inspect.
input_columns (List[str]) – A list of the columns present in the input frame.
strict (bool) – If True, raises an error if the transformer’s outputs cannot be inferred or the input columns are not accepted. If False, returns None in these cases.
- Returns:
A list of the columns present in the output for transformer given input_columns.
- Raises:
InspectError – If the transformer’s outputs could not be determined and strict==True.
pt.validate.InputValidationError – If input validation fails in the transformer and strict==True.
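The order of the two strategies (declared outputs first, probing second) can be illustrated without PyTerrier or pandas. In this sketch, which is an assumption-laden stand-in rather than the library's code, a dict mapping column names to empty lists plays the role of the empty DataFrame, and sketch_outputs, Declares and ProbedOnly are hypothetical names.

```python
from typing import Any, Dict, List


def sketch_outputs(transformer: Any, input_columns: List[str]) -> List[str]:
    """Hypothetical dispatch: prefer transform_outputs, else probe with an empty input."""
    declared = getattr(transformer, 'transform_outputs', None)
    if callable(declared):
        return list(declared(input_columns))
    # Stand-in for an empty DataFrame: every input column maps to an empty list.
    empty_input: Dict[str, list] = {column: [] for column in input_columns}
    return list(transformer.transform(empty_input).keys())


class Declares:
    def transform_outputs(self, input_columns: List[str]) -> List[str]:
        return input_columns + ['docno', 'score', 'rank']


class ProbedOnly:
    def transform(self, inp: Dict[str, list]) -> Dict[str, list]:
        out = dict(inp)      # keep the input columns
        out['score'] = []    # and add one new column
        return out
```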
- pyterrier.inspect.transformer_attributes(transformer, *, strict=True)
Infers a list of attributes of the transformer.
Here, an attribute is defined as any attribute of the transformer that is explicitly set by the __init__ method, either under the same name (e.g., self.foo = foo) or as a private attribute (e.g., self._foo = foo).
This definition allows for a set of attributes that should describe the state of a transformer. These attributes can be used to reconstruct the transformer from its attributes, e.g., by calling transformer_apply_attributes().
To handle edge cases (e.g., where the __init__ parameters do not match the attribute names), you can implement the HasAttributes protocol.
- Return type:
List[TransformerAttribute]
- Parameters:
transformer (Transformer) – The transformer to inspect.
strict (bool) – If True, raises an error if an attribute cannot be identified from the transformer. If False, the attribute’s value is set to TransformerAttribute.MISSING in these cases.
- Returns:
A list of TransformerAttribute objects representing the attributes of the transformer.
- Raises:
InspectError – If the attributes cannot be identified from the transformer.
- pyterrier.inspect.transformer_apply_attributes(transformer, **kwargs)
Returns a new transformer instance from the provided transformer and updated attributes (as keyword arguments).
This method is useful for constructing a new transformer with some attributes replaced. For instance, when implementing methods like fuse_rank_cutoff(), you frequently need to replace the num_results attribute of a transformer with a new value while keeping the remainder of the attributes the same.
This method uses transformer_attributes() to identify the attributes of the transformer and then applies the provided keyword arguments to the transformer attributes. The method then reconstructs the transformer by calling its __init__ method with the updated attributes.
To handle edge cases (e.g., where the __init__ parameters do not match the attribute names), you can implement the HasApplyAttributes protocol.
- Return type:
- Parameters:
transformer (Transformer) – The transformer to apply the attributes to.
**kwargs (Any) – Keyword arguments representing the attributes to set on the transformer.
- Returns:
A new instance of the transformer with the provided attributes applied.
- Raises:
InspectError – If an attribute is not found in the transformer or if attributes cannot be identified from the transformer.
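The identify, update, and reconstruct steps described above can be sketched as follows. The names sketch_apply_attributes and DemoRetriever are hypothetical, KeyError stands in for InspectError, and the sketch assumes every __init__ parameter is stored as self.<name> or self._<name>.

```python
import inspect
from typing import Any


def sketch_apply_attributes(obj: Any, **updates: Any) -> Any:
    """Rebuild obj by calling its class with current attribute values plus updates."""
    kwargs = {}
    for name in inspect.signature(type(obj).__init__).parameters:
        if name == 'self':
            continue
        # Same-name attribute first, then the private variant.
        kwargs[name] = getattr(obj, name) if hasattr(obj, name) else getattr(obj, '_' + name)
    unknown = set(updates) - set(kwargs)
    if unknown:
        # The documented function raises InspectError; KeyError stands in here.
        raise KeyError(f'unknown attributes: {sorted(unknown)}')
    kwargs.update(updates)
    return type(obj)(**kwargs)


class DemoRetriever:
    def __init__(self, index_path: str, num_results: int = 1000):
        self.index_path = index_path
        self.num_results = num_results
```

Building a fresh instance instead of mutating the original keeps the source transformer usable elsewhere in a pipeline, which is the behaviour the description above implies for rank-cutoff fusion.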
- pyterrier.inspect.subtransformers(transformer)
Infers a dictionary of subtransformers for the given transformer.
A subtransformer is a transformer that is used by another transformer to complete its task. Examples include those used by caches (e.g., scorer in pyterrier_caching.ScorerCache) and the list of transformers that are used by a pyterrier_alpha.fusion.RRFusion transformer.
If the transformer implements the HasSubtransformers protocol, the method calls its subtransformers method to retrieve the subtransformers. If the transformer does not implement the protocol, the method inspects the transformer to identify any of its attributes that are instances of pt.Transformer (or a list/tuple of Transformer), returning a dictionary where the keys are the names of the subtransformers and the values are the subtransformers themselves. If the transformer does not have any subtransformers, an empty dictionary is returned.
- Return type:
Dict[str, Transformer | List[Transformer]]
- Parameters:
transformer (Transformer) – The transformer to inspect.
- Returns:
A dictionary of the provided transformer’s subtransformers.
- Raises:
InspectError – If the subtransformers cannot be identified from the transformer.
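The default attribute scan can be illustrated with a stand-in base class. This is only a sketch: Base is a hypothetical substitute for pt.Transformer, sketch_subtransformers and the demo classes are invented for illustration, and details such as how empty lists are treated are assumptions.

```python
from typing import Any, Dict, List, Union


class Base:
    """Hypothetical stand-in for pt.Transformer."""


def sketch_subtransformers(obj: Any) -> Dict[str, Union[Base, List[Base]]]:
    """Collect attributes that are Base instances, or non-empty lists/tuples of them."""
    result: Dict[str, Union[Base, List[Base]]] = {}
    for name, value in vars(obj).items():
        if isinstance(value, Base):
            result[name] = value
        elif isinstance(value, (list, tuple)) and value and all(isinstance(v, Base) for v in value):
            result[name] = list(value)
    return result


class Scorer(Base):
    pass


class Cache(Base):
    def __init__(self, path: str, scorer: Base):
        self.path = path        # not a transformer, so it is skipped
        self.scorer = scorer    # a single subtransformer


class Fusion(Base):
    def __init__(self, *transformers: Base):
        self.transformers = transformers  # a tuple of subtransformers
```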
- class pyterrier.inspect.TransformerAttribute(name, value, init_default_value, init_parameter_kind=None)
A dataclass representing an attribute of a transformer.
- Parameters:
name (str)
value (Any)
init_default_value (Any)
init_parameter_kind (_ParameterKind | None)
- name
The name of the attribute.
- value
The value of the attribute.
- init_default_value
The default value of the attribute for the __init__ method (if available) or inspect.Parameter.empty if not available.
- init_parameter_kind
The kind of the parameter in the __init__ method (if available) or None if not available.
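The last two fields correspond to the default and kind properties that Python's inspect.Parameter exposes. The sketch below uses a hypothetical AttributeSketch dataclass that mirrors the four documented fields (it is not the library's class) to show what those values look like for a positional and a keyword-only parameter.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AttributeSketch:
    """Hypothetical mirror of the four documented fields."""
    name: str
    value: Any
    init_default_value: Any
    init_parameter_kind: Optional[Any] = None


class DemoRetriever:
    def __init__(self, index_path, *, num_results=1000):
        self.index_path = index_path
        self.num_results = num_results


def describe(obj: Any, name: str) -> AttributeSketch:
    """Pair the current attribute value with its __init__ parameter metadata."""
    param = inspect.signature(type(obj).__init__).parameters[name]
    return AttributeSketch(name, getattr(obj, name), param.default, param.kind)
```

For a parameter without a default, param.default is the sentinel inspect.Parameter.empty, which matches the documented behaviour of init_default_value.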
- class pyterrier.inspect.HasTransformInputs(*args, **kwargs)
Protocol for transformers that provide a transform_inputs method.
transform_inputs allows for inspection of the inputs accepted by transformers without needing to run them.
When this method is present in a Transformer object, it must return either:
A list of lists of input columns (i.e., a list of valid input column configurations), or
A list of input columns (i.e., a single valid input column configuration).
If the input columns of the transformer do not depend on the instance, transform_inputs can also be an attribute with a value of type List[str] or List[List[str]].
If transform_inputs is None, it is ignored.
This method need not be present in a Transformer class - it is an optional extension; an alternative is that the input columns are determined by calling the transformer with an empty DataFrame.
Example transform_inputs function, implementing HasTransformInputs.
    from typing import List, Union

    import pandas as pd
    import pyterrier as pt

    class MyRetriever(pt.Transformer):
        def transform(self, inp: pd.DataFrame) -> pd.DataFrame:
            pt.validate.query_frame(inp, ['query'])
            # ... perform retrieval ...
            # return the same columns as inp plus docno, score, and rank. E.g., using DataFrameBuilder.

        def transform_inputs(self) -> Union[List[str], List[List[str]]]:
            # NOTE: This method isn't required in this case, since inspect will be able to infer required
            # columns from pt.validate. It's just a demonstration.
            return ['qid', 'query']
- class pyterrier.inspect.HasTransformOutputs(*args, **kwargs)
Protocol for transformers that provide a transform_outputs method.
transform_outputs allows for inspection of the outputs of transformers without needing to run them.
When this method is present in a Transformer object, it must return a list of the output columns present given the provided input columns or raise an InputValidationError if the inputs are not accepted by the transformer.
This method need not be present in a Transformer class - it is an optional extension; an alternative is that the output columns are determined by calling the transformer with an empty DataFrame.
Due to the risks and maintenance burden of ensuring that transform and transform_outputs behave identically, it is recommended to implement transform_outputs only when calling the transformer with an empty DataFrame to inspect its behavior is undesirable, e.g., if calling the transformer is expensive.
Example transform_outputs function, implementing HasTransformOutputs.
    from typing import List

    import pandas as pd
    import pyterrier as pt

    class MyRetriever(pt.Transformer):
        def transform(self, inp: pd.DataFrame) -> pd.DataFrame:
            pt.validate.query_frame(inp, ['query'])
            # ... perform retrieval ...
            # return the same columns as inp plus docno, score, and rank. E.g., using DataFrameBuilder.

        def transform_outputs(self, input_columns: List[str]) -> List[str]:
            pt.validate.query_frame(input_columns, ['query'])
            return input_columns + ['docno', 'score', 'rank']
- transform_outputs(input_columns)
Returns a list of the output columns present given the input_columns.
The method must return exactly the same output columns as transform would given the provided input columns. If the input columns are not accepted by the transformer, the method should raise an InputValidationError (e.g., through pt.validate).
- Return type:
List[str]
- Parameters:
input_columns (List[str]) – A list of the columns present in the input frame.
- Returns:
A list of the columns present in the output for this transformer given input_columns.
- Raises:
pt.validate.InputValidationError – If the input columns are not accepted by the transformer.
pt.inspect.InspectError – If the transformer is uninspectable.
- class pyterrier.inspect.HasAttributes(*args, **kwargs)
Protocol for transformers that provide an attributes method.
attributes allows for identifying the attributes of a transformer without needing to traverse its attributes manually.
When this method is present in a Transformer object, it must return a list of TransformerAttribute objects, where each object represents an attribute of the transformer and corresponding metadata about how the attribute is assigned.
This method need not be present in a Transformer class - it is an optional extension.
- attributes()
Returns a list of attributes of the transformer.
- Return type:
List[TransformerAttribute]
- class pyterrier.inspect.HasApplyAttributes(*args, **kwargs)
Protocol for transformers that provide an apply_attributes method.
apply_attributes returns a new transformer with updated attributes (as keyword arguments).
This method need not be present in a Transformer class - it is an optional extension.
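A custom apply_attributes is most useful when an __init__ parameter is stored under a different attribute name, so the default name matching cannot reconstruct the object. The class below is a hypothetical, transformer-like stand-in (it does not subclass pt.Transformer), and the exact method signature is an assumption based on the description above.

```python
from typing import Any


class TopK:
    """Hypothetical class whose __init__ parameter (k) differs from its attribute (cutoff)."""

    def __init__(self, k: int = 10):
        self.cutoff = k

    def apply_attributes(self, **kwargs: Any) -> 'TopK':
        # Map attribute names back to __init__ parameters by hand.
        unknown = set(kwargs) - {'cutoff'}
        if unknown:
            raise KeyError(f'unknown attributes: {sorted(unknown)}')
        return TopK(k=kwargs.get('cutoff', self.cutoff))
```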
- class pyterrier.inspect.HasSubtransformers(*args, **kwargs)
Protocol for transformers that provide a subtransformers method.
subtransformers allows for identifying subtransformers of a transformer without needing to traverse its attributes manually.
When this method is present in a Transformer object, it must return a dict where the keys are the names of the subtransformers and the values are the subtransformers (or list of subtransformers) themselves.
This method need not be present in a Transformer class - it is an optional extension. See pyterrier.inspect.subtransformers() for the default implementation.
- subtransformers()
Returns a dictionary of subtransformers for the transformer.
The method must return a dictionary where the keys are the names of the subtransformers and the values are the subtransformers themselves. If the transformer does not have any subtransformers, an empty dictionary should be returned.
- Return type:
Dict[str, Transformer | List[Transformer]]
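An explicit subtransformers method helps when the subtransformers are not stored as plain attributes that a scan would find under useful names. The sketch below assumes that protocol detection behaves like a runtime-checkable typing.Protocol. That is an assumption, and HasSubtransformersSketch, Pipeline and Leaf are hypothetical names.

```python
from typing import Any, Dict, List, Protocol, runtime_checkable


@runtime_checkable
class HasSubtransformersSketch(Protocol):
    """Hypothetical stand-in for the documented protocol."""
    def subtransformers(self) -> Dict[str, Any]: ...


class Leaf:
    def subtransformers(self) -> Dict[str, Any]:
        return {}  # no subtransformers: return an empty dictionary


class Pipeline:
    def __init__(self, first: Any, rest: List[Any]):
        self._stages = [first, *rest]  # one private list, split up on request

    def subtransformers(self) -> Dict[str, Any]:
        # Keys are names; values are a single subtransformer or a list of them.
        return {'first': self._stages[0], 'rest': self._stages[1:]}
```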