Use Custom NER with continuous learning

AI Center
Last updated Apr 11, 2025
This example extracts the chemicals mentioned in research papers and groups them by category. By following this procedure, you will extract the chemicals and categorize them as ABBREVIATION, FAMILY, FORMULA, IDENTIFIER, MULTIPLE, SYSTEMATIC, TRIVIAL, or NO_CLASS.
This procedure uses the Custom Named Entity Recognition package. For more information on how this package works and what it can be used for, check the Custom Named Entity Recognition page.
For this procedure, we have provided sample files as follows:
- Pre-labeled training dataset in CoNLL format. You can download the training dataset from the following link: training dataset.
- Pre-labeled test dataset. You can download the test dataset from the following link: test dataset.
- Sample workflow for extracting categories of chemicals mentioned in research papers. You can download it from the following link: sample workflow.
Note: Make sure that the following variables are set in the sample file:
- in_emailAdress - the email address to which the Action Center task will be assigned
- in_MLSkillEndpoint - public endpoint of the ML Skill
- in_MLSkillAPIKey - API key of the ML Skill
- in_labelStudioEndpoint - optional, to enable continuous labeling: provide the import URL of a Label Studio project
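Before training, it can help to check how many mentions of each category the pre-labeled dataset contains. The following is a minimal sketch, not part of the sample files. It assumes the usual CoNLL 2003 layout (one token per line, the entity tag in the last whitespace-separated column, BIO tags such as B-FORMULA and I-FORMULA, blank lines between sentences); the actual dataset may differ. The function name `count_entity_labels` and the sample text are hypothetical.

```python
from collections import Counter


def count_entity_labels(conll_text):
    """Count entity mentions per label in CoNLL-style text.

    Assumption: one token per line, tag in the last column, BIO tagging.
    """
    counts = Counter()
    for line in conll_text.splitlines():
        parts = line.split()
        # Skip blank sentence separators and document markers.
        if not parts or parts[0] == "-DOCSTART-":
            continue
        tag = parts[-1]
        # A B- tag opens a new mention; I- continues it; O is outside.
        if tag.startswith("B-"):
            counts[tag[2:]] += 1
    return counts


# Hypothetical sample, made up for illustration only.
sample = """-DOCSTART- -X- O
The -X- _ O
sample -X- _ O
contains -X- _ O
NaCl -X- _ B-FORMULA
and -X- _ O
sodium -X- _ B-SYSTEMATIC
chloride -X- _ I-SYSTEMATIC
"""

print(count_entity_labels(sample))
```

To use it on a downloaded file, read the file contents into a string and pass that string to `count_entity_labels`.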
Use the following steps to extract chemicals by their category from research papers.
- Install Label Studio on your local machine or cloud instance. To do so, follow the instructions from the Label Studio page.
- Create a new project from the Named Entity Recognition Template and define your Label Names.
- Make sure that the label names have no special characters or spaces. For example, instead of Set Date, use SetDate.
- Make sure that the value of the <Text> tag is "$text".
- Upload the data using the API from the Label Studio API page.
cURL request example:
curl --location --request POST 'https://<label-studio-instance>/api/projects/<id>/import' \
--header 'Content-Type: application/json' \
--header 'Authorization: Token <Token>' \
--data-raw '[ { "data": { "text": "<Text1>" } }, { "data": { "text": "<Text2>" } } ]'
- Annotate your data.
- Export the data in CoNLL 2003 format and upload it to AI Center.
- Provide the Label Studio instance URL and API key in the sample workflow to capture incorrect and low-confidence predictions.
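The cURL request from the upload step can also be expressed in Python. The sketch below uses only the standard library and builds the same POST request without sending it, so the payload can be inspected first. The function name `build_import_request`, the instance URL, the project ID, and the token are placeholders, not values from the sample files.

```python
import json
import urllib.request


def build_import_request(base_url, project_id, token, texts):
    """Build (but do not send) a Label Studio import request."""
    # One task per text; the "text" key is what "$text" in the <Text> tag reads.
    tasks = [{"data": {"text": text}} for text in texts]
    url = f"{base_url.rstrip('/')}/api/projects/{project_id}/import"
    return urllib.request.Request(
        url,
        data=json.dumps(tasks).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {token}",
        },
        method="POST",
    )


request = build_import_request(
    "https://label-studio.example.com",  # placeholder instance URL
    1,                                   # placeholder project ID
    "<Token>",                           # placeholder API token
    ["<Text1>", "<Text2>"],
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

To perform the upload, pass the request to `urllib.request.urlopen(request)` once the placeholders are replaced with real values.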