Semantic Scholar
Semantic Scholar is a search engine over academic papers provided by the Allen Institute for AI.
pyterrier-services provides access to the Semantic Scholar search API through SemanticScholarRetriever.
Example:
>>> from pyterrier_services import SemanticScholarApi
>>> s2 = SemanticScholarApi()
>>> retr = s2.retriever(num_results=5)
>>> retr.search('pyterrier')
# qid query docno score rank title abstract
# 1 pyterrier 7fa92ed08eee68a945884b8744e7db9887aed9d3 0 0 PyTerrier: Declarative Experimentation in Pyth... PyTerrier is a Python-based retrieval framewor...
# 1 pyterrier a6b1126e058262c57d36012d0fdedc2417ad04e1 -1 1 Declarative Experimentation in Information Ret... The advent of deep machine learning platforms ...
# 1 pyterrier 833b453c621099bccca028752aaa74262123706a -2 2 PyTerrier-based Research Data Recommendations ... Research data is of high importance in scienti...
# 1 pyterrier 73feb5cfe491342d52d47e8817d113c072067306 -3 3 The Information Retrieval Experiment Platform We integrate irdatasets, ir_measures, and PyTe...
# 1 pyterrier 90b8a1adae2761e48c87fdeb68a595dc11161970 -4 4 QPPTK@TIREx: Simplified Query Performance Pred... We describe our software submission to the ECI...
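In the example output above, the score column is simply the negated rank (0, -1, -2, ...), so the ordering returned by the API is preserved when results are sorted by score. The following is a minimal sketch of that convention in plain Python. The helper name `to_ranked_rows` is hypothetical and is not part of pyterrier-services; the library's actual implementation is not shown on this page and may differ.

```python
from typing import Dict, List


def to_ranked_rows(qid: str, query: str, docnos: List[str]) -> List[Dict[str, object]]:
    """Hypothetical helper: assign rank and score to docnos already in API order."""
    rows = []
    for rank, docno in enumerate(docnos):
        rows.append({
            "qid": qid,
            "query": query,
            "docno": docno,
            "score": -rank,  # negated rank, mirroring the example output above
            "rank": rank,
        })
    return rows


# The first two docnos from the example output, in the order they were returned.
rows = to_ranked_rows("1", "pyterrier", [
    "7fa92ed08eee68a945884b8744e7db9887aed9d3",
    "a6b1126e058262c57d36012d0fdedc2417ad04e1",
])
```

Sorting such rows by descending score reproduces the original rank order.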
- class pyterrier_services.SemanticScholarApi
Represents a reference to the Semantic Scholar search API.
- retriever(*, num_results=100, fields=['title', 'abstract'], verbose=True)
Returns a Transformer that retrieves articles from Semantic Scholar.
- Return type: Transformer
- Parameters:
num_results – The number of results to retrieve. Defaults to 100.
fields – The fields to include in the retrieved results. Defaults to ['title', 'abstract'].
verbose – Whether to log progress. Defaults to True.
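The fields parameter decides which attributes of each paper appear as columns next to docno. As a rough illustration, the sketch below projects one paper record onto the requested fields. It assumes that a record is a dict with a `paperId` key and one key per field. That shape, the helper name `select_fields`, and the sample record are all assumptions made for illustration, not the library's code or real data.

```python
from typing import Any, Dict, List


def select_fields(paper: Dict[str, Any], fields: List[str]) -> Dict[str, Any]:
    """Hypothetical helper: keep the paper ID as docno plus the requested fields."""
    row: Dict[str, Any] = {"docno": paper.get("paperId")}
    for field in fields:
        row[field] = paper.get(field)  # None when the record lacks the field
    return row


# Made-up record for illustration only.
paper = {"paperId": "example-id", "title": "An example paper", "year": 2024}

row = select_fields(paper, ["title", "abstract"])
```

Here `row` keeps `title`, sets `abstract` to `None` because the record has no abstract, and drops `year` because it was not requested.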
- search(query, *, offset=0, limit=100, fields=['title', 'abstract'], return_next=False, return_total=False)
Searches for papers on Semantic Scholar with the provided query.
- Return type: Union[DataFrame, Tuple[DataFrame, int], Tuple[DataFrame, int, int]]
- Parameters:
query – The search query.
offset – The offset of the first result to retrieve. Defaults to 0.
limit – The maximum number of results to retrieve. Defaults to 100.
fields – The fields to include in the retrieved results. Defaults to ['title', 'abstract'].
return_next – Whether to return the next query URL. Defaults to False.
return_total – Whether to return the total number of results. Defaults to False.
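The offset and limit parameters make it possible to page through a result list, and the total count tells a caller when to stop. The sketch below shows one way such a paging loop could look. It does not call the real API: `fake_search` is a hypothetical stand-in that returns `(page, total)` from an in-memory list, and `collect_results` is likewise an illustrative name. How SemanticScholarRetriever pages internally is not documented here.

```python
from typing import List, Tuple

# Stand-in corpus; a real call would go to the Semantic Scholar API instead.
_FAKE_RESULTS = [f"paper-{i}" for i in range(23)]


def fake_search(query: str, *, offset: int = 0, limit: int = 100) -> Tuple[List[str], int]:
    """Hypothetical stand-in for a paged search: returns (page, total)."""
    return _FAKE_RESULTS[offset:offset + limit], len(_FAKE_RESULTS)


def collect_results(query: str, num_results: int, page_size: int = 10) -> List[str]:
    """Page through fake_search until num_results are gathered or results run out."""
    collected: List[str] = []
    offset = 0
    while len(collected) < num_results:
        # Never request more than is still needed.
        limit = min(page_size, num_results - len(collected))
        page, total = fake_search(query, offset=offset, limit=limit)
        if not page:
            break
        collected.extend(page)
        offset += len(page)
        if offset >= total:
            break  # the result list is exhausted
    return collected


first_15 = collect_results("pyterrier", num_results=15)
everything = collect_results("pyterrier", num_results=100)
```

Capping `limit` at the number of results still needed keeps the final request from over-fetching, and checking `offset` against the total avoids one extra empty request at the end.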
- class pyterrier_services.SemanticScholarRetriever(*, api=None, num_results=100, fields=['title', 'abstract'], verbose=True)
A Transformer retriever that queries the Semantic Scholar search API.
- Parameters:
api – The Semantic Scholar API service. Defaults to a new instance of SemanticScholarApi.
num_results – The number of results to retrieve per query. Defaults to 100.
fields – The fields to include in the retrieved results. Defaults to ['title', 'abstract'].
verbose – Whether to log progress. Defaults to True.