Readers
Generic Reader
- class pyterrier_rag.readers.Reader(backend, prompt='Use the context information to answer the Question: \\n Context: {{ qcontext }} \\n Question: {{ query }} \\n Answer:', output_field='qanswer')
Bases: Transformer
Transformer that generates answers from context and queries using an LLM backend.
Combines a PromptTransformer with a Backend to produce text or logprobs, then applies answer extraction to return final responses.
- Parameters:
backend (Backend or str) – A Backend instance or model identifier string.
prompt (PromptTransformer or str) – Prompt template or raw instruction.
output_field (str) – Field name in the output DataFrame for answers.
- Raises:
ValueError – If the prompt expects logprobs but the backend does not support logprobs.
Example using a local LLM:
    from pyterrier_rag.backend import Seq2SeqLMBackend
    from pyterrier_rag.prompt import Concatenator
    from pyterrier_rag.readers import Reader

    # bm25_ret is a BM25 retriever defined elsewhere (not shown here)
    flant5 = Reader(Seq2SeqLMBackend('google/flan-t5-base'))
    bm25_flant5 = bm25_ret % 10 >> Concatenator() >> flant5
    bm25_flant5.search("What is the capital of France?")
Example using a remote LLM:
    from pyterrier_rag.backend import OpenAIBackend
    from pyterrier_rag.prompt import Concatenator
    from pyterrier_rag.readers import Reader

    # bm25_ret is a BM25 retriever defined elsewhere (not shown here)
    llamma = Reader(OpenAIBackend("llama-3-8b-instruct", api_key="your_api_key", base_url="your_base_url"))
    bm25_llamma = bm25_ret % 10 >> Concatenator() >> llamma
    bm25_llamma.search("What is the capital of Italy?")
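The data flow described for Reader (fill a prompt from each row, send it to a backend, store the result under output_field) can be illustrated with a small standard-library sketch. This is not pyterrier_rag's implementation: EchoBackend, fill_prompt and read are hypothetical names, rows are plain dicts rather than a DataFrame, and the simple string substitution of the {{ qcontext }} and {{ query }} placeholders is an assumption about how the template is filled.

```python
# Conceptual sketch only: NOT pyterrier_rag's implementation.
# EchoBackend, fill_prompt and read are hypothetical names for illustration.

DEFAULT_PROMPT = (
    "Use the context information to answer the Question: \n"
    " Context: {{ qcontext }} \n Question: {{ query }} \n Answer:"
)


class EchoBackend:
    """Hypothetical stand-in for an LLM backend; returns a canned answer."""

    def generate(self, prompts):
        # A real backend would call a language model here.
        return ["stub answer for prompt of %d chars" % len(p) for p in prompts]


def fill_prompt(template, row):
    """Substitute each {{ name }} placeholder with the value from one row."""
    text = template
    for key, value in row.items():
        text = text.replace("{{ %s }}" % key, str(value))
    return text


def read(rows, backend, prompt=DEFAULT_PROMPT, output_field="qanswer"):
    """Fill the prompt per row, query the backend, and attach the answers."""
    prompts = [fill_prompt(prompt, row) for row in rows]
    answers = backend.generate(prompts)
    return [dict(row, **{output_field: ans}) for row, ans in zip(rows, answers)]


rows = [{
    "qid": "1",
    "query": "What is the capital of France?",
    "qcontext": "Paris is the capital of France.",
}]
result = read(rows, EchoBackend())
print(result[0]["qanswer"])
```

In the real pipelines above, the qcontext column is presumably what Concatenator() produces from the retrieved documents, which is why it sits between the retriever and the reader.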
Specific Readers
- class pyterrier_rag.readers.T5FiD(model_name_or_path, tokenizer_name_or_path=None, batch_size=4, text_field='text', text_max_length=256, num_context='auto', max_new_tokens=32, generation_config=None, verbose=False, device=None, **kwargs)
T5 FiD Reader for PyTerrier-RAG
Reference: DBLP ID conf/eacl/IzacardG21
- Parameters:
model_name_or_path (str)
tokenizer_name_or_path (str)
batch_size (int)
text_field (str)
text_max_length (int)
num_context (int | str)
max_new_tokens (int)
generation_config (GenerationConfig)
verbose (bool)
device (str | device)
- class pyterrier_rag.readers.BARTFiD(model_name_or_path, tokenizer_name_or_path=None, batch_size=4, text_field='text', text_max_length=256, num_context='auto', max_new_tokens=32, generation_config=None, verbose=False, device=None, **kwargs)
BART FiD Reader for PyTerrier-RAG
Reference: DBLP ID conf/eacl/IzacardG21
- Parameters:
model_name_or_path (str)
tokenizer_name_or_path (str)
batch_size (int)
text_field (str)
text_max_length (int)
num_context (int | str)
max_new_tokens (int)
generation_config (GenerationConfig)
verbose (bool)
device (str | device)