Extra Caching Utilities

These are additional utilities that are sometimes useful when using the pyterrier-caching package.

class pyterrier_caching.Lazy(fn_transformer, *fn_args, **fn_kwargs)

A Transformer that doesn’t initialize until it is used.

This is useful when loading a transformer takes a long time or allocates resources that are not always needed. For instance, a neural scorer allocates GPU memory, but often isn’t needed when it is wrapped in a ScorerCache.

Example:

Using a Lazy ElectraScorer with a ScorerCache.
from pyterrier_caching import Lazy, ScorerCache
from pyterrier_dr import ElectraScorer
lazy_scorer = Lazy(ElectraScorer) # ElectraScorer not loaded yet
cached_scorer = ScorerCache('electra.cache', lazy_scorer)
cached_scorer([{
    'qid': '0',
    'query': 'terrier breeds',
    'docno': 'doc1',
    'text': 'There are many breeds of terriers, including the Scottish and Jack Russell Terrier.'
}])
# ElectraScorer only loaded if ('0', 'doc1') is not yet in electra.cache
Parameters:
  • fn_transformer – A function that returns a transformer when called (or the transformer class itself).

  • fn_args – Positional arguments to pass to fn_transformer when loading it.

  • fn_kwargs – Keyword arguments to pass to fn_transformer when loading it.

load()

Load the transformer if it isn’t already loaded, and return it.

Return type:

Transformer

unload()

Unloads the transformer. Subsequent calls to load() will re-load it.

loaded()

Return whether the transformer is currently loaded.

Return type:

bool
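
The load(), unload() and loaded() behaviour described above can be illustrated with a small stand-alone sketch. It is not the library's implementation: LazySketch and ExpensiveScorer are hypothetical names introduced only for illustration, and the sketch assumes nothing beyond "construct on first load(), discard on unload()".

```python
class LazySketch:
    """Hypothetical stand-in for Lazy; not the library's implementation."""

    def __init__(self, fn_transformer, *fn_args, **fn_kwargs):
        # Only remember how to build the object; do not build it yet.
        self._fn_transformer = fn_transformer
        self._fn_args = fn_args
        self._fn_kwargs = fn_kwargs
        self._transformer = None

    def load(self):
        # Build the object on first use, then reuse the same instance.
        if self._transformer is None:
            self._transformer = self._fn_transformer(*self._fn_args, **self._fn_kwargs)
        return self._transformer

    def unload(self):
        # Drop the reference so a later load() builds a fresh instance.
        self._transformer = None

    def loaded(self):
        return self._transformer is not None


class ExpensiveScorer:
    """Hypothetical placeholder for a model that is costly to construct."""
    instances = 0

    def __init__(self, scale=1.0):
        ExpensiveScorer.instances += 1
        self.scale = scale


lazy = LazySketch(ExpensiveScorer, scale=2.0)
before = lazy.loaded()   # nothing has been constructed at this point
first = lazy.load()      # constructs ExpensiveScorer(scale=2.0)
second = lazy.load()     # reuses the instance built above
lazy.unload()
third = lazy.load()      # constructs a second instance
```

Deferring construction this way means the cost of building the wrapped object is paid only on the first call that actually needs it.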

pyterrier_caching.closing_memmap(*args, **kwargs)

A context manager that creates a numpy.memmap and closes it when the context is exited.

This allows numpy.memmap to be used as a context manager, since it doesn’t support the context manager protocol directly.

Parameters:
  • *args – Positional arguments to pass to numpy.memmap.

  • **kwargs – Keyword arguments to pass to numpy.memmap.

Example:

Using a closing_memmap() context manager.
from pyterrier_caching import closing_memmap
with closing_memmap('file.npy', dtype='float32', mode='w+', shape=(10, 10)) as mmp:
    # do what you want with mmp here, for example:
    mmp[:] = 0.0
# mmp is closed here
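
For readers curious how such a wrapper could be written, here is a minimal sketch built on contextlib.contextmanager. It is an assumption about one possible approach, not the package's actual code: closing_memmap_sketch is a hypothetical name, and the sketch relies on the private _mmap attribute of numpy.memmap to reach the underlying mmap object.

```python
import contextlib
import os
import tempfile

import numpy as np


@contextlib.contextmanager
def closing_memmap_sketch(*args, **kwargs):
    """Hypothetical stand-in for closing_memmap; not the library's code."""
    mmp = None
    try:
        mmp = np.memmap(*args, **kwargs)
        yield mmp
    finally:
        if mmp is not None:
            mmp.flush()        # write pending changes to disk
            mmp._mmap.close()  # assumption: private attribute holding the mmap object


path = os.path.join(tempfile.mkdtemp(), 'sketch.f4')
with closing_memmap_sketch(path, dtype='float32', mode='w+', shape=(2, 3)) as mmp:
    mmp[:] = 1.5
    underlying = mmp._mmap  # kept only to inspect its state afterwards
# Do not read or write mmp past this point; its mapping has been closed.

# Read the raw file back to confirm the data was written.
restored = np.fromfile(path, dtype='float32').reshape(2, 3)
```

Closing the mapping explicitly releases the file handle at a predictable point instead of waiting for garbage collection, which matters when the file must be reopened, moved, or deleted right after the block.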