OntoTagME API Documentation
OntoTagME is an entity linker built for biologically relevant texts. It processes biological text provided by the user and extracts a set of spots (textual fragments referring to an entity), linking each one to the relevant Wikidata page. You can annotate text snippets of various lengths. OntoTagME is also fully integrated with PubTator, a Named Entity Recognition tool specialized for biology. If you need to annotate a biological paper (identified by a PubMed ID or a PubMed Central ID), OntoTagME integrates its results with those from PubTator, thus increasing the quality of the annotation. You can query OntoTagME through the REST API described below.
Version
OntoTagME uses a biological subset of the English Wikidata dump, version 2022-10.
Endpoint URL
https://ontotagme-entity-linker.d4science.org/
Registering to the service
The service is hosted by the D4Science Infrastructure. To obtain access, register to the TagMe VRE and get your authorization token by clicking on the "access the VRE" button. You then have everything in place to query the OntoTagME RESTful API.
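Once you have a token, it is usually safer to keep it out of your scripts. The following is a minimal sketch, not part of the service itself: the environment variable name TAGME_VRE_TOKEN and the helper build_headers are hypothetical, while the gcube_token header name is the one used by the example scripts in this documentation.

```python
import os


def build_headers(token=None):
    """Return request headers carrying the VRE authorization token.

    The token is taken from the argument or, if absent, from the
    (hypothetical) TAGME_VRE_TOKEN environment variable.
    """
    token = token or os.environ.get("TAGME_VRE_TOKEN")
    if not token:
        raise ValueError("Missing token: pass it explicitly or set TAGME_VRE_TOKEN")
    return {"Content-Type": "application/json", "gcube_token": token}
```

The returned dictionary can be passed as the headers argument of requests.post.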
How to Annotate
OntoTagME can be queried in two ways:
- PAPER QUERY: results from OntoTagME and PubTator are integrated. If the paper is not present in PubTator, the resulting annotations are provided by OntoTagME only.
- TEXT QUERY: only OntoTagME is used as a backend.
A generic annotation looks like the following.
{
"wid": Identifier of the entity,
"spot": textual fragment referring to the entity,
"Word": Title of the Wikidata page,
"categories": LIST of categories assigned to the entity,
"start_pos": index of the first character of the spot in the text,
"end_pos": index of the last character of the spot in the text,
"section": section where the entity was extracted,
"annotation_mode": info regarding which annotator found the annotation (PubTator or OntoTagME),
"wiki_url": url of the wikidata page regarding the entity, if applicable
}
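To work with annotations in a typed way, each one can be loaded into a small record. This is only a sketch under assumptions: the Annotation class and parse_annotation helper are hypothetical, the field types are guesses based on the descriptions above, and the key names (including the capitalized "Word") are copied from the generic annotation.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Annotation:
    wid: Any                      # identifier of the entity (type not documented)
    spot: str                     # textual fragment referring to the entity
    word: Optional[str] = None    # title of the Wikidata page ("Word" key)
    categories: List[str] = field(default_factory=list)
    start_pos: Optional[int] = None
    end_pos: Optional[int] = None
    section: Optional[str] = None
    annotation_mode: Optional[str] = None
    wiki_url: Optional[str] = None


def parse_annotation(raw):
    """Build an Annotation from one annotation dictionary."""
    return Annotation(
        wid=raw.get("wid"),
        spot=raw.get("spot", ""),
        word=raw.get("Word"),
        categories=list(raw.get("categories") or []),
        start_pos=raw.get("start_pos"),
        end_pos=raw.get("end_pos"),
        section=raw.get("section"),
        annotation_mode=raw.get("annotation_mode"),
        wiki_url=raw.get("wiki_url"),
    )
```

Missing keys fall back to None (or an empty list for categories), so the helper tolerates annotations where, for example, wiki_url is not applicable.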
Annotate by Article ID (PubMed and PubMed Central)
To get the annotations for a full text on PubMed Central or an abstract on PubMed, use the endpoint
https://ontotagme-entity-linker.d4science.org/annotate_by_id
Only POST requests are supported.
The annotation is performed by OntoTagME and enriched with PubTator.
Parameters
- a_id - required - the article ID to annotate. Formatted like PMCxxxxxx for full-text annotations, and xxxxx for annotating abstracts only.
- token - required - the TagMe VRE authorization token. To get it, click on the "access the VRE" button.
- mode - defaults to 'pmc' - in this field, specify 'pm' if you want to annotate abstracts, or 'pmc' if you want to annotate full-texts.
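The a_id and mode parameters must agree with each other, so it can help to derive one from the other before sending a request. The sketch below is an assumption layered on the parameter descriptions above, not a feature of the service: the helper build_id_payload is hypothetical and infers the mode from the PMC prefix when no mode is given.

```python
def build_id_payload(a_id, mode=None):
    """Build the JSON payload for a query by article ID.

    If mode is not given, it is inferred from the ID format:
    IDs starting with "PMC" map to 'pmc', anything else to 'pm'.
    """
    a_id = str(a_id).strip()
    if mode is None:
        mode = "pmc" if a_id.upper().startswith("PMC") else "pm"
    if mode not in ("pm", "pmc"):
        raise ValueError("mode must be 'pm' or 'pmc', got {!r}".format(mode))
    return {"a_id": a_id, "mode": mode}
```

Passing an explicit mode still works and is only validated against the two documented values.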
Example
The following simple Python script tests the "query by id" endpoint.
import requests

URL_ID = "https://ontotagme-sobigdata.d4science.org/annotate_by_id"
TOKEN = "your_tagme_vre_token"  # replace with your TagMe VRE authorization token
# The token is sent in the "gcube_token" header
headers = {"Content-Type": "application/json", "gcube_token": TOKEN}

def annotate_by_id(a_id, mode="pmc"):
    # mode is 'pmc' for full texts and 'pm' for abstracts
    payload = {"a_id": a_id, "mode": mode}
    r = requests.post(URL_ID, headers=headers, json=payload)
    if r.status_code != 200:
        raise Exception("Error on article: {}\n{}".format(a_id, r.text))
    return r.json()

print(annotate_by_id("PMC6982432"))
print(annotate_by_id("33403489", mode="pm"))
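Since a paper query mixes results from two annotators, it can be useful to separate them afterwards. The sketch below assumes that the annotations are available as a list of dictionaries shaped like the generic annotation above; the exact response layout and the exact strings stored in annotation_mode are not documented here, so treat both as assumptions. The helper group_by_annotator is hypothetical.

```python
from collections import defaultdict


def group_by_annotator(annotations):
    """Group annotation dictionaries by their annotation_mode value."""
    groups = defaultdict(list)
    for ann in annotations:
        # Annotations without the field are collected under "unknown"
        groups[ann.get("annotation_mode", "unknown")].append(ann)
    return dict(groups)
```

For example, len(groups.get(name, [])) gives the number of annotations contributed by the annotator called name.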
Annotate by Text
To get the annotations for a textual snippet, use the endpoint
https://ontotagme-entity-linker.d4science.org/annotate_by_text
Only POST requests are supported.
The annotation is performed by OntoTagME alone; no PubTator enrichment is performed.
Parameters
- text - required - the snippet of text to annotate.
- token - required - the TagMe VRE authorization token. To get it, click on the "access the VRE" button.
Example
The following simple Python script tests the "query by text" endpoint.
import requests

URL_TEXT = "https://ontotagme-sobigdata.d4science.org/annotate_by_text"
TOKEN = "your_tagme_vre_token"  # replace with your TagMe VRE authorization token
# The token is sent in the "gcube_token" header
headers = {"Content-Type": "application/json", "gcube_token": TOKEN}

def annotate_by_text(text):
    payload = {"text": text}
    r = requests.post(URL_TEXT, headers=headers, json=payload)
    if r.status_code != 200:
        raise Exception("Error on text: {}\n{}".format(text, r.text))
    return r.json()

print(annotate_by_text("Comparison with alkaline phosphatases and 5-nucleotidase"))
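The start_pos and end_pos fields make it possible to map annotations back onto the submitted text. The following sketch marks each spot with square brackets. It assumes 0-based offsets relative to the submitted text and treats end_pos as inclusive, following the description "index of the last character of the spot"; the helper mark_spots is hypothetical, and the offsets in the usage note are computed by hand rather than returned by the service.

```python
def mark_spots(text, annotations):
    """Return text with every annotated spot wrapped in square brackets."""
    pieces = []
    cursor = 0
    for ann in sorted(annotations, key=lambda a: a["start_pos"]):
        start = ann["start_pos"]
        end = ann["end_pos"] + 1  # end_pos is assumed to be inclusive
        if start < cursor:
            continue  # skip spots overlapping one already marked
        pieces.append(text[cursor:start])
        pieces.append("[" + text[start:end] + "]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

With the example sentence and a single annotation covering characters 16 to 36, the result is "Comparison with [alkaline phosphatases] and 5-nucleotidase".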
Credits and References
OntoTagME is a joint effort between the University of Pisa, Scuola Normale Superiore, and the University of Catania.
Its first version was presented in the Applied Network Science paper "NetME: On-the-fly knowledge network construction from biomedical literature", by Muscolino et al.
Please cite our work if you decide to use OntoTagME for your research.