Pinecone

Pinecone provides a hosted Inference API for a range of embedding and reranking models. pyterrier-services exposes these APIs through PineconeApi.

Note

To use this API, you need the pinecone package installed (pip install pinecone) and a Pinecone API key. Provide the key through the PINECONE_API_KEY environment variable (preferred), or pass it to the PineconeApi constructor.
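
The key lookup described above can be sketched as follows. This is a minimal illustration that uses only the standard library; resolve_api_key is a hypothetical helper and not part of pyterrier-services. It assumes that an explicitly passed key takes precedence and that the environment variable is the fallback.

```python
import os


def resolve_api_key(api_key=None):
    """Hypothetical helper: return an explicit key, else PINECONE_API_KEY."""
    if api_key is not None:
        # An explicitly supplied key is assumed to win over the environment.
        return api_key
    key = os.environ.get("PINECONE_API_KEY")
    if not key:
        raise RuntimeError("Set PINECONE_API_KEY or pass api_key explicitly")
    return key
```

Keeping the key in the environment avoids hard-coding credentials in notebooks and scripts, which is presumably why that route is preferred.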

Examples

Learned Sparse

Indexing and retrieval with a Pinecone learned sparse model using pyterrier_pisa.PisaIndex
# Setup
>>> from pyterrier_services import PineconeApi
>>> from pyterrier_pisa import PisaIndex
>>> pinecone = PineconeApi()
>>> model = pinecone.sparse_model()
>>> index = PisaIndex('my_index.pisa', stemmer='none')

# Indexing
>>> pipeline = model >> index
>>> pipeline.index([
...   {'docno': 'doc1', 'text': 'PyTerrier: Declarative Experimentation in Python from BM25 to Dense Retrieval'},
...   {'docno': 'doc2', 'text': 'QPPTK@TIREx: Simplified Query Performance Prediction for Ad-Hoc Retrieval Experiments'},
... ])

# Retrieval
>>> pipeline = model >> index.quantized()
>>> pipeline.search('Retrieval')
  qid      query          query_toks docno    score  rank
0   1  Retrieval  {'retrieval': 1.0}  doc2  30900.0     0
1   1  Retrieval  {'retrieval': 1.0}  doc1  29400.0     1

Dense

Indexing and retrieval with a Pinecone dense model using pyterrier_dr.FlexIndex
# Setup
>>> from pyterrier_services import PineconeApi
>>> from pyterrier_dr import FlexIndex
>>> pinecone = PineconeApi()
>>> model = pinecone.dense_model()
>>> index = FlexIndex('my_index.flex')

# Indexing
>>> pipeline = model >> index
>>> pipeline.index([
...   {'docno': 'doc1', 'text': 'PyTerrier: Declarative Experimentation in Python from BM25 to Dense Retrieval'},
...   {'docno': 'doc2', 'text': 'QPPTK@TIREx: Simplified Query Performance Prediction for Ad-Hoc Retrieval Experiments'},
... ])

# Retrieval
>>> pipeline = model >> index.retriever()
>>> pipeline.search('pyterrier')
  qid      query                                          query_vec docno  docid     score  rank
0   1  pyterrier  [0.00923919677734375, -0.0171356201171875, -0....  doc1      0  0.814679     0
1   1  pyterrier  [0.00923919677734375, -0.0171356201171875, -0....  doc2      1  0.722664     1

Re-Ranking

Re-Ranking results with Pinecone
>>> import pandas as pd
>>> from pyterrier_services import PineconeApi
>>> pinecone = PineconeApi()
>>> model = pinecone.reranker()
>>> model(pd.DataFrame([
...   {'qid': '1', 'query': 'retrieval', 'docno': 'doc1', 'text': 'PyTerrier: Declarative Experimentation in Python from BM25 to Dense Retrieval'},
...   {'qid': '1', 'query': 'retrieval', 'docno': 'doc2', 'text': 'QPPTK@TIREx: Simplified Query Performance Prediction for Ad-Hoc Retrieval Experiments'},
... ]))
  qid      query docno                                               text     score  rank
0   1  retrieval  doc2  QPPTK@TIREx: Simplified Query Performance Pred...  0.004811     0
1   1  retrieval  doc1  PyTerrier: Declarative Experimentation in Pyth...  0.001598     1
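
The rank column in the output above follows from the scores: within each qid, rows are ordered by descending score and numbered from 0. The sketch below reproduces that step on plain dictionaries. add_ranks is an illustrative helper and not a pyterrier-services function; the scores are copied from the example output.

```python
from collections import defaultdict


def add_ranks(rows):
    """Assign 0-based ranks per qid by descending score (illustrative helper)."""
    by_qid = defaultdict(list)
    for row in rows:
        by_qid[row["qid"]].append(row)
    ranked = []
    for group in by_qid.values():
        ordered = sorted(group, key=lambda r: r["score"], reverse=True)
        for rank, row in enumerate(ordered):
            # Copy the row so the caller's input is left unmodified.
            ranked.append({**row, "rank": rank})
    return ranked


rows = [
    {"qid": "1", "docno": "doc1", "score": 0.001598},
    {"qid": "1", "docno": "doc2", "score": 0.004811},
]
ranked = add_ranks(rows)
```

Here doc2 receives rank 0 and doc1 receives rank 1, matching the table above.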

API Documentation

class pyterrier_services.PineconeApi(api_key=None)

Represents a reference to the Pinecone API.

This class wraps pinecone.Pinecone.

Parameters:

api_key (str, optional) – The Pinecone API key. Defaults to the value from PINECONE_API_KEY.

dense_model(model_name='multilingual-e5-large')

Creates a PineconeDenseModel instance.

Return type:

PineconeDenseModel

Parameters:

model_name (str) – The name of the model. See the list of supported models.

sparse_model(model_name='pinecone-sparse-english-v0')

Creates a PineconeSparseModel instance.

Return type:

PineconeSparseModel

Parameters:

model_name (str) – The name of the model. See the list of supported models.

reranker(model_name='pinecone-rerank-v0')

Creates a PineconeReranker instance.

Return type:

PineconeReranker

Parameters:

model_name (str) – The name of the model. See the list of supported models.

class pyterrier_services.PineconeSparseModel(model_name='pinecone-sparse-english-v0', *, api=None)

A PyTerrier transformer that provides access to a Pinecone sparse model.

transform(inp)

Encodes either queries or documents using this model, depending on the input columns.

Return type:

DataFrame
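
The exact column rule used by transform is not stated here. As a rough sketch of how dispatching on input columns could work, the hypothetical helper below treats a frame with a text column as documents and a frame with a query column as queries, mirroring the columns used in the examples above. Both the helper and the rule are assumptions, not the library's implementation.

```python
def encoding_mode(columns):
    """Guess what to encode from column names (assumed rule, not the library's)."""
    columns = set(columns)
    has_text = "text" in columns
    has_query = "query" in columns
    if has_text and not has_query:
        return "document"
    if has_query and not has_text:
        return "query"
    # Frames with both (or neither) are ambiguous for plain encoding; in this
    # sketch such result frames are assumed to belong to scorer() instead.
    raise ValueError(f"Cannot infer encoding mode from columns: {sorted(columns)}")
```

For example, encoding_mode(['docno', 'text']) returns 'document', and encoding_mode(['qid', 'query']) returns 'query'.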

query_encoder()

Creates a transformer that encodes queries using this model.

Return type:

PineconeSparseEncoder

doc_encoder()

Creates a transformer that encodes documents using this model.

Return type:

PineconeSparseEncoder

scorer()

Creates a transformer that scores (re-ranks) results using this model.

Return type:

PineconeSparseScorer
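
The scorer's internals are not documented here. Learned sparse models are conventionally scored with a dot product over the terms shared by the query and the document, so the sketch below assumes that convention. It uses the same dictionary shape as the query_toks column in the example above; sparse_dot and the document weights are made up for illustration.

```python
def sparse_dot(query_toks, doc_toks):
    """Dot product of two sparse term-weight dictionaries (illustrative)."""
    # Iterate over the smaller dictionary to limit the number of lookups.
    if len(query_toks) > len(doc_toks):
        query_toks, doc_toks = doc_toks, query_toks
    return sum(w * doc_toks[t] for t, w in query_toks.items() if t in doc_toks)


query_toks = {"retrieval": 1.0}
doc_toks = {"retrieval": 2.5, "dense": 1.0}  # made-up document weights
score = sparse_dot(query_toks, doc_toks)
```

Only the shared term retrieval contributes, so score is 2.5. Terms that appear on just one side add nothing.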

class pyterrier_services.PineconeDenseModel(model_name='multilingual-e5-large', *, api=None)

A PyTerrier transformer that provides access to a Pinecone dense model.

transform(inp)

Encodes either queries or documents using this model, depending on the input columns.

Return type:

DataFrame

query_encoder()

Creates a transformer that encodes queries using this model.

Return type:

PineconeDenseEncoder

doc_encoder()

Creates a transformer that encodes documents using this model.

Return type:

PineconeDenseEncoder

scorer()

Creates a transformer that scores (re-ranks) results using this model.

Return type:

PineconeDenseScorer
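
Which similarity function the dense scorer applies is not stated here. Dense bi-encoder scoring is usually a dot product or cosine similarity between the query vector and the document vector, so the sketch below shows both under that assumption. The helper names and the input vectors are illustrative only.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("Vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; equals dot(u, v) when both vectors have unit length."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot(u, v) / norm


similarity = cosine([3.0, 4.0], [4.0, 3.0])  # 24 / 25
```

When the embeddings are already normalized to unit length, the two functions give the same ordering and the same values.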

class pyterrier_services.PineconeReranker(model_name='pinecone-rerank-v0', *, api=None)

A PyTerrier transformer that provides access to a Pinecone reranker model.

Parameters: