REL API Documentation

From the GitHub page of REL:

REL has various meanings - one might first notice that it stands for relation, which is a suiting name for the problems that can be tackled with this package. Additionally, in Dutch a 'rel' means a disturbance of the public order, which is exactly what we aim to achieve with the release of this package.


REL is an entity linker: a tool that identifies meaningful substrings (called "spots") in unstructured English text and links each of them to an unambiguous entity (an item in a knowledge base). Entities are Wikipedia/Wikidata items. Entity linking has applications in a range of NLP/NLU problems such as question answering, knowledge base population, and text classification. You can annotate a text by issuing a query to the RESTful API documented on this page.

Parameters are passed to the API endpoints as fields of a POST request. All endpoints accept HTTP POST requests only.

Version

REL English Wikipedia dump version: 2021-03

Please note: the results returned by this API may differ from those reported in the REL paper and GitHub repository, because a different Wikipedia version is used. To reproduce the results for Wikipedia 2019, use the official API.

Registering to the service

The service is hosted by the D4Science Infrastructure. To obtain access, register to the TagMe VRE and get your authorization token by clicking the Show button in the left panel. You then have everything in place to issue a query to the REL RESTful API.

For example, you can run a curl POST request from your terminal:

curl --location --request POST 'https://rel-entity-linker.d4science.org/' \
--header 'gcube-token: XXXX' \
--header 'Content-Type: application/json' \
--data-raw '{
    "text" : "Schumacher won the race in Indianapolis"
}'

(Replace XXXX with your actual Service Authorization Token)

Congratulations! You have made your first request to REL.

How to annotate

REL utilizes English Wikipedia as a knowledge base and can be used for the following tasks:

  • Entity linking (EL): Given a text, the system outputs a list of mention-entity pairs, where each mention is an n-gram from the text and each entity is an entity in the knowledge base.
  • Entity Disambiguation (ED): Given a text and a list of mentions, the system assigns an entity (or NIL) to each mention.

Endpoint URL

https://rel-entity-linker.d4science.org/

Parameters

  • text - required - the text to be annotated, using UTF-8 encoding.

  • gcube-token - required - the D4Science Service Authorization Token.

  • spans - optional

    • For EL: the spans field needs to be set to an empty list.
    • For ED: spans should consist of a list of tuples, where each tuple refers to the start position and length of a mention.
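
Because each span is a (start position, length) pair, the spans for an ED request can be computed from the mention strings themselves. The following is a minimal sketch, assuming positions are character offsets into the text (as the [0, 10] span used for "Schumacher" on this page suggests). The helper names find_spans and build_payload are hypothetical and are not part of REL.

```python
def find_spans(text, mentions):
    """Return a [start, length] pair for the first occurrence of each mention."""
    spans = []
    for mention in mentions:
        start = text.find(mention)
        if start == -1:
            raise ValueError(f"mention not found in text: {mention!r}")
        spans.append([start, len(mention)])
    return spans


def build_payload(text, mentions=None):
    """Build the request body: empty spans for EL, explicit spans for ED."""
    spans = find_spans(text, mentions) if mentions else []
    return {"text": text, "spans": spans}


text = "Schumacher won the race in Indianapolis"
print(build_payload(text))                  # EL request body (empty spans)
print(build_payload(text, ["Schumacher"]))  # ED request body
```

Note that this sketch only locates the first occurrence of each mention; a text that repeats a mention would need the offsets supplied explicitly.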

Examples

EL:

gcube-token=<your Service Authorization Token>
text=Schumacher won the race in Indianapolis
curl --location --request POST 'https://rel-entity-linker.d4science.org/' \
--header 'gcube-token: XXXX' \
--header 'Content-Type: application/json' \
--data-raw '{
    "text" : "Schumacher won the race in Indianapolis"
}'

ED:

gcube-token=<your Service Authorization Token>
text=Schumacher won the race in Indianapolis
spans=[[0, 10]]
curl --location --request POST 'https://rel-entity-linker.d4science.org/' \
--header 'gcube-token: XXXX' \
--header 'Content-Type: application/json' \
--data-raw '{
    "text" : "Schumacher won the race in Indianapolis" ,
    "spans" : [[0, 10]]
}'

Python code example

This is a working piece of Python 3.8 code that queries REL:

import json
import requests


IP_ADDRESS = "https://rel-entity-linker.d4science.org/"
MY_GCUBE_TOKEN = 'copy your gcube-token here!'

document = {
    "text": "Schumacher won the race in Indianapolis",
    "spans": []
}

API_result = requests.post(IP_ADDRESS, data=json.dumps(document),
                           headers={'gcube-token': MY_GCUBE_TOKEN, 'Content-Type': 'application/json'})

if API_result.status_code == 200:
    print(API_result.json())
else:
    print(API_result.status_code)

Analysis of the POST response

If the returned status code is 200, you will receive an array of mentions.

Example:

[[0, 10, 'Schumacher', 'Michael_Schumacher', 0.9985688878422279, 0.9996414184570312, 'PER'], [27, 12, 'Indianapolis', 'Indianapolis', 0.9986363930690854, 0.9992759823799133, 'LOC']]

For each mention you will find an array with seven elements:

  • The first is the starting position of the mention;
  • The second is the length of the mention;
  • The third is the n-gram from the text;
  • The fourth is the Wikipedia entity found by REL;
  • The fifth is the confidence of the entity disambiguation;
  • The sixth is the confidence of the mention detection;
  • The seventh is the entity group (or tag) given by the Named Entity Recognition.

Example:

[0, 10, 'Schumacher', 'Michael_Schumacher', 0.9985688878422279, 0.9996414184570312, 'PER']
  1. start_pos = 0
  2. mention_length = 10
  3. n-gram from the text = Schumacher
  4. wikipedia entity = Michael_Schumacher
  5. confidence of the entity disambiguation = 0.9985688878422279
  6. confidence of the mention detection = 0.9996414184570312
  7. tag = PER

Both confidences (entity disambiguation and mention detection) are floating-point values between 0 and 1 (0 is the lowest and 1 the highest).
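
The seven positional elements are easier to work with once they have names. The sketch below unpacks the example response from this page into named tuples. The Mention class, its field names, and the 0.5 confidence threshold are illustrative assumptions, not part of the REL API.

```python
from typing import NamedTuple


class Mention(NamedTuple):
    start_pos: int        # starting position of the mention
    length: int           # length of the mention
    ngram: str            # n-gram from the text
    entity: str           # Wikipedia entity found by REL
    ed_confidence: float  # confidence of the entity disambiguation
    md_confidence: float  # confidence of the mention detection
    tag: str              # entity group given by the NER


def parse_mentions(response):
    """Convert the raw list of seven-element arrays into Mention tuples."""
    return [Mention(*item) for item in response]


text = "Schumacher won the race in Indianapolis"
response = [
    [0, 10, 'Schumacher', 'Michael_Schumacher', 0.9985688878422279, 0.9996414184570312, 'PER'],
    [27, 12, 'Indianapolis', 'Indianapolis', 0.9986363930690854, 0.9992759823799133, 'LOC'],
]

mentions = parse_mentions(response)
for m in mentions:
    # The n-gram is the slice of the text described by start_pos and length.
    assert text[m.start_pos:m.start_pos + m.length] == m.ngram
    print(f"{m.ngram} -> {m.entity} ({m.tag})")

# Keep only mentions whose disambiguation confidence meets an arbitrary threshold.
confident = [m for m in mentions if m.ed_confidence >= 0.5]
```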

The entity groups (or tags) may be the following:

  • PER (person name);
  • LOC (location name);
  • ORG (organization name);
  • MISC (other name);
  • NULL.

For more information, see the flair ner-english documentation on Hugging Face.

For an Entity Disambiguation POST call to the REL API, the confidence of the mention detection will be 0.0.

HTTP Errors

  • 501 (NOT IMPLEMENTED) - The resource you requested is not a valid REL service.
  • 401 (UNAUTHORIZED) - You haven't provided a Service Authorization Token or it is not valid.
  • 400 (BAD REQUEST) - There are issues with the parameters you have sent (or not sent). Check the response message for details.
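
A client can translate these status codes into readable messages before deciding how to react. This is a small sketch: the describe_status helper and the REL_ERRORS table are hypothetical names, and the messages simply restate the list above.

```python
# Status codes documented for the REL API, with their meaning.
REL_ERRORS = {
    400: "Bad request: check the parameters you sent and the response message.",
    401: "Unauthorized: the Service Authorization Token is missing or not valid.",
    501: "Not implemented: the requested resource is not a valid REL service.",
}


def describe_status(status_code):
    """Return a human-readable description of a REL HTTP status code."""
    if status_code == 200:
        return "OK"
    # Codes not documented for REL fall back to a generic message.
    return REL_ERRORS.get(status_code, f"Unexpected HTTP status {status_code}")


for code in (200, 400, 401, 501):
    print(code, describe_status(code))
```

With the requests example on this page, describe_status(API_result.status_code) could replace the bare status code printed in the else branch.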

Credits and References

NLP-progress website: http://nlpprogress.com/english/entity_linking.html

Paper REL: https://arxiv.org/pdf/2006.01969.pdf

GitHub REL: https://github.com/informagi/REL

@inproceedings{vanHulst:2020:REL,
author =    {van Hulst, Johannes M. and Hasibi, Faegheh and Dercksen, Koen and Balog, Krisztian and de Vries, Arjen P.},
title =     {REL: An Entity Linker Standing on the Shoulders of Giants},
booktitle = {Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
series =    {SIGIR '20},
year =      {2020},
publisher = {ACM}
}

Access the TagMe VRE with your SoBigData or D4Science credentials.