Communications Mining Developer Guide
Last updated Oct 3, 2024
Datasets
GET /api/v1/datasets
Permissions required: View labels
- Bash
curl -X GET 'https://<my_api_endpoint>/api/v1/datasets' \
    -H "Authorization: Bearer $REINFER_TOKEN"
- Node
const request = require("request");

request.get(
  {
    url: "https://<my_api_endpoint>/api/v1/datasets",
    headers: {
      Authorization: "Bearer " + process.env.REINFER_TOKEN,
    },
  },
  function (error, response, json) {
    // digest response
    console.log(JSON.stringify(json, null, 2));
  }
);
- Python
import json
import os

import requests

response = requests.get(
    "https://<my_api_endpoint>/api/v1/datasets",
    headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]},
)

print(json.dumps(response.json(), indent=2, sort_keys=True))
- Response
{
  "datasets": [
    {
      "created": "2018-10-15T15:48:49.603000Z",
      "description": "An optional long form description.",
      "has_sentiment": true,
      "id": "18ba5ce699f8da1f",
      "last_modified": "2018-10-15T15:48:49.603000Z",
      "model_family": "english",
      "name": "example",
      "owner": "<project>",
      "source_ids": ["18ba5ce699f8da1f"],
      "title": "An Example Dataset"
    }
  ],
  "status": "ok"
}
GET /api/v1/datasets/<project>/<dataset_name>
Permissions required: View labels
- Bash
curl -X GET 'https://<my_api_endpoint>/api/v1/datasets/<project>/example' \
    -H "Authorization: Bearer $REINFER_TOKEN"
- Node
const request = require("request");

request.get(
  {
    url: "https://<my_api_endpoint>/api/v1/datasets/<project>/example",
    headers: {
      Authorization: "Bearer " + process.env.REINFER_TOKEN,
    },
  },
  function (error, response, json) {
    // digest response
    console.log(JSON.stringify(json, null, 2));
  }
);
- Python
import json
import os

import requests

response = requests.get(
    "https://<my_api_endpoint>/api/v1/datasets/<project>/example",
    headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]},
)

print(json.dumps(response.json(), indent=2, sort_keys=True))
- Response
{
  "dataset": {
    "created": "2018-10-15T15:48:49.603000Z",
    "description": "An optional long form description.",
    "has_sentiment": true,
    "id": "18ba5ce699f8da1f",
    "last_modified": "2018-10-15T15:48:49.603000Z",
    "model_family": "english",
    "name": "example",
    "owner": "<project>",
    "source_ids": ["18ba5ce699f8da1f"],
    "title": "An Example Dataset"
  },
  "status": "ok"
}
GET /api/v1/datasets/<project>/<dataset>/model-tags
Permissions required: Model Admin
- Bash
curl -X GET 'https://<my_api_endpoint>/api/v1/datasets/<project>/example/model-tags' \
    -H "Authorization: Bearer $REINFER_TOKEN"
- Node
const request = require("request");

request.get(
  {
    url: "https://<my_api_endpoint>/api/v1/datasets/<project>/example/model-tags",
    headers: {
      Authorization: "Bearer " + process.env.REINFER_TOKEN,
    },
  },
  function (error, response, json) {
    // digest response
    console.log(JSON.stringify(json, null, 2));
  }
);
- Python
import json
import os

import requests

response = requests.get(
    "https://<my_api_endpoint>/api/v1/datasets/<project>/example/model-tags",
    headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]},
)

print(json.dumps(response.json(), indent=2, sort_keys=True))
- Response
{
  "model_tags": [
    {
      "name": "prod",
      "updated_at": "2021-11-16T12:31:00.123Z",
      "version": 5
    },
    {
      "name": "staging",
      "updated_at": "2021-11-15T12:30:00.123Z",
      "version": 7
    }
  ],
  "status": "ok"
}
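One possible use of this response is looking up which model version a given tag currently points at. The following is a minimal Python sketch that works on the decoded JSON body; the helper name `version_for_tag` is hypothetical and not part of any official client.

```python
from typing import Optional


def version_for_tag(response_body: dict, tag_name: str) -> Optional[int]:
    """Return the model version a tag points at, or None if the tag is absent."""
    for tag in response_body.get("model_tags", []):
        if tag.get("name") == tag_name:
            return tag.get("version")
    return None


# Sample body copied from the example response for this route.
sample = {
    "model_tags": [
        {"name": "prod", "updated_at": "2021-11-16T12:31:00.123Z", "version": 5},
        {"name": "staging", "updated_at": "2021-11-15T12:30:00.123Z", "version": 7},
    ],
    "status": "ok",
}

print(version_for_tag(sample, "prod"))     # 5
print(version_for_tag(sample, "staging"))  # 7
print(version_for_tag(sample, "missing"))  # None
```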
PUT /api/v1/datasets/<project>/<dataset>
Permissions required: Datasets admin
- Bash
curl -X PUT 'https://<my_api_endpoint>/api/v1/datasets/<project>/example' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "dataset": {
    "description": "An optional long form description.",
    "model_family": "english",
    "source_ids": [
      "18ba5ce699f8da1f"
    ],
    "title": "An Example Dataset"
  }
}'
- Node
const request = require("request");

request.put(
  {
    url: "https://<my_api_endpoint>/api/v1/datasets/<project>/example",
    headers: {
      Authorization: "Bearer " + process.env.REINFER_TOKEN,
    },
    json: true,
    body: {
      dataset: {
        description: "An optional long form description.",
        model_family: "english",
        source_ids: ["18ba5ce699f8da1f"],
        title: "An Example Dataset",
      },
    },
  },
  function (error, response, json) {
    // digest response
    console.log(JSON.stringify(json, null, 2));
  }
);
- Python
import json
import os

import requests

response = requests.put(
    "https://<my_api_endpoint>/api/v1/datasets/<project>/example",
    headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]},
    json={
        "dataset": {
            "title": "An Example Dataset",
            "description": "An optional long form description.",
            "source_ids": ["18ba5ce699f8da1f"],
            "model_family": "english",
        }
    },
)

print(json.dumps(response.json(), indent=2, sort_keys=True))
- Response
{
  "dataset": {
    "created": "2018-10-15T15:48:49.603000Z",
    "description": "An optional long form description.",
    "has_sentiment": true,
    "id": "b9a1fd75f6133bce",
    "last_modified": "2018-10-15T15:48:49.603000Z",
    "model_family": "english",
    "name": "example",
    "owner": "<project>",
    "source_ids": ["18ba5ce699f8da1f"],
    "title": "An Example Dataset"
  },
  "status": "ok"
}
NAME | TYPE | REQUIRED | DESCRIPTION |
---|---|---|---|
title | string | no | One-line human-readable title for the dataset. |
description | string | no | A longer description of the dataset. |
source_ids | array<string> | no | An array of source ids to be included in this dataset. |
model_family | string | no | Dataset model family; either english or multilingual. Defaults to english. The languages supported by the multilingual model family are listed in the product documentation. |
has_sentiment | boolean | no | Whether labels in the dataset should be applied with sentiment. Defaults to true. |
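Since every field in the table above is optional, a client can send only the fields it wants to set and leave the rest to the documented defaults. The sketch below shows one way to assemble the request body in Python; `build_create_dataset_body` and `ALLOWED_MODEL_FAMILIES` are hypothetical names introduced for illustration, and the client-side check on `model_family` is an assumption based only on the two values the table lists.

```python
from typing import List, Optional

# The two model families named in the parameter table.
ALLOWED_MODEL_FAMILIES = {"english", "multilingual"}


def build_create_dataset_body(
    title: Optional[str] = None,
    description: Optional[str] = None,
    source_ids: Optional[List[str]] = None,
    model_family: Optional[str] = None,
    has_sentiment: Optional[bool] = None,
) -> dict:
    """Build the JSON body for the create-dataset request.

    Fields left as None are omitted so that the server-side defaults apply.
    """
    if model_family is not None and model_family not in ALLOWED_MODEL_FAMILIES:
        raise ValueError(
            "model_family must be one of " + ", ".join(sorted(ALLOWED_MODEL_FAMILIES))
        )
    fields = {
        "title": title,
        "description": description,
        "source_ids": source_ids,
        "model_family": model_family,
        "has_sentiment": has_sentiment,
    }
    return {"dataset": {k: v for k, v in fields.items() if v is not None}}


body = build_create_dataset_body(
    title="An Example Dataset",
    source_ids=["18ba5ce699f8da1f"],
    has_sentiment=False,
)
print(body)
```

The resulting dictionary can be passed as the `json=` argument of `requests.put`, as in the Python example for this route.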
POST /api/v1/datasets/<project>/<dataset>
Permissions required: Datasets admin
- Bash
curl -X POST 'https://<my_api_endpoint>/api/v1/datasets/<project>/example' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "dataset": {
    "title": "An Alternative Title"
  }
}'
- Node
const request = require("request");

request.post(
  {
    url: "https://<my_api_endpoint>/api/v1/datasets/<project>/example",
    headers: {
      Authorization: "Bearer " + process.env.REINFER_TOKEN,
    },
    json: true,
    body: { dataset: { title: "An Alternative Title" } },
  },
  function (error, response, json) {
    // digest response
    console.log(JSON.stringify(json, null, 2));
  }
);
- Python
import json
import os

import requests

response = requests.post(
    "https://<my_api_endpoint>/api/v1/datasets/<project>/example",
    headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]},
    json={"dataset": {"title": "An Alternative Title"}},
)

print(json.dumps(response.json(), indent=2, sort_keys=True))
- Response
{
  "dataset": {
    "created": "2018-10-15T15:48:49.603000Z",
    "description": "An optional long form description.",
    "has_sentiment": true,
    "id": "b9a1fd75f6133bce",
    "last_modified": "2018-10-15T15:53:08.479000Z",
    "model_family": "english",
    "name": "example",
    "owner": "<project>",
    "source_ids": ["18ba5ce699f8da1f"],
    "title": "An Alternative Title"
  },
  "status": "ok"
}
NAME | TYPE | REQUIRED | DESCRIPTION |
---|---|---|---|
title | string | no | One-line human-readable title for the dataset. |
description | string | no | A longer description of the dataset. |
source_ids | array<string> | no | An array of source ids to be included in this dataset. |
DELETE /api/v1/datasets/<project>/<dataset_name>
Permissions required: Datasets admin
- Bash
curl -X DELETE 'https://<my_api_endpoint>/api/v1/datasets/<project>/example' \
    -H "Authorization: Bearer $REINFER_TOKEN"
- Node
const request = require("request");

request.delete(
  {
    url: "https://<my_api_endpoint>/api/v1/datasets/<project>/example",
    headers: {
      Authorization: "Bearer " + process.env.REINFER_TOKEN,
    },
  },
  function (error, response, json) {
    // digest response
    console.log(JSON.stringify(json, null, 2));
  }
);
- Python
import json
import os

import requests

response = requests.delete(
    "https://<my_api_endpoint>/api/v1/datasets/<project>/example",
    headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]},
)

print(json.dumps(response.json(), indent=2, sort_keys=True))
- Response
{ "status": "ok" }
POST /api/v1/datasets/<project>/<dataset_name>/export
Permissions required: Export datasets
- Bash
curl -X POST 'https://<my_api_endpoint>/api/v1/datasets/<project>/example/export' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "limit": 1
}'
- Node
const request = require("request");

request.post(
  {
    url: "https://<my_api_endpoint>/api/v1/datasets/<project>/example/export",
    headers: {
      Authorization: "Bearer " + process.env.REINFER_TOKEN,
    },
    json: true,
    body: { limit: 1 },
  },
  function (error, response, json) {
    // digest response
    console.log(JSON.stringify(json, null, 2));
  }
);
- Python
import json
import os

import requests

response = requests.post(
    "https://<my_api_endpoint>/api/v1/datasets/<project>/example/export",
    headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]},
    json={"limit": 1},
)

print(json.dumps(response.json(), indent=2, sort_keys=True))
- Response
{
  "comments": [
    {
      "annotations": {
        "labels": {
          "assigned": [
            {
              "name": "Parent Label",
              "sentiment": "positive"
            },
            {
              "name": "Parent Label > Child Label",
              "sentiment": "positive"
            }
          ]
        }
      },
      "comment": {
        "context": "1596721237668",
        "created_at": "2020-08-06T13:20:28.531000Z",
        "has_annotations": true,
        "id": "0123456789abcdef",
        "last_modified": "2020-08-06T13:40:37.668000Z",
        "messages": [
          {
            "body": {
              "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob"
            },
            "from": "bob@organisation.org",
            "sent_at": "2011-12-11T11:05:10Z",
            "subject": {
              "text": "Today's figures"
            },
            "to": ["alice@company.com"]
          }
        ],
        "source_id": "47194279497e141e",
        "text_format": "plain",
        "thread_id": "123456",
        "timestamp": "2011-12-11T11:05:10Z",
        "uid": "47194279497e141e.0123456789abcdef",
        "user_properties": {
          "string:Recipient Domain": "company.com",
          "string:Sender Domain": "organisation.org"
        }
      },
      "predictions": {
        "labels": [
          {
            "name": "Another Parent Label",
            "probability": 0.954979807138443,
            "sentiment": -0.4281917143125379
          },
          {
            "name": "Another Parent Label > Another Child Label",
            "probability": 0.7726812064647675,
            "sentiment": -0.6603664430231163
          }
        ]
      }
    }
  ],
  "continuation": "2021-02-16T10:55:05Z.c060a787c0b2bbf95526ad5cf28bf582",
  "status": "ok"
}
This route lets you export a dataset. It returns a list of comments with assigned labels and latest available predictions.
Other ways to export a dataset are CSV download in the browser and JSONL download via the CLI. For a detailed comparison, see the comparison table.
Request Format
NAME | TYPE | REQUIRED | DESCRIPTION |
---|---|---|---|
comment_uids | array<string> | no | A list of at most 256 comment UIDs (in the format source_id.comment_id). If provided, only these comments will be included in the response. No other filters may be passed with comment_uids. |
source_ids | array<string> | no | A list of at most 1024 source IDs. If provided, only comments from these sources will be included in the response. |
order_by | string | no | One of created_at or timestamp. If provided, returns the comments sorted by either the API creation date of the comments (created_at) or the user-defined comment timestamp (timestamp). The default is timestamp. |
from | string | no | An ISO-8601 timestamp. If provided, returns comments only from this timestamp onwards. The related order_by field controls which timestamp will be used for filtering. |
to | string | no | An ISO-8601 timestamp. If provided, returns comments only until this timestamp (inclusive). The related order_by field controls which timestamp will be used for filtering. |
continuation | string | no | Pagination token (provided in the response). Should be used to fetch the next limit number of comments. |
limit | number | no | Number of comments returned per response up to a maximum of 256. Default: 64. |
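The limits in the request table can be checked on the client before a request is sent. The following Python sketch assembles an export request body and rejects values that fall outside the documented bounds. The function name `build_export_body` and the constants are hypothetical, and treating source_ids, from, and to as the "other filters" that cannot accompany comment_uids is an assumption about what the table means.

```python
from typing import List, Optional

MAX_COMMENT_UIDS = 256   # "at most 256 comment UIDs"
MAX_SOURCE_IDS = 1024    # "at most 1024 source IDs"
MAX_LIMIT = 256          # "up to a maximum of 256"
ORDER_BY_VALUES = ("created_at", "timestamp")


def build_export_body(
    comment_uids: Optional[List[str]] = None,
    source_ids: Optional[List[str]] = None,
    order_by: Optional[str] = None,
    from_: Optional[str] = None,  # sent as "from"; renamed because from is a Python keyword
    to: Optional[str] = None,
    continuation: Optional[str] = None,
    limit: Optional[int] = None,
) -> dict:
    """Build an export request body, omitting any field left as None."""
    if comment_uids is not None:
        if len(comment_uids) > MAX_COMMENT_UIDS:
            raise ValueError("at most 256 comment UIDs may be passed")
        # Assumption: source_ids, from and to are the filters excluded by comment_uids.
        if source_ids is not None or from_ is not None or to is not None:
            raise ValueError("comment_uids cannot be combined with other filters")
    if source_ids is not None and len(source_ids) > MAX_SOURCE_IDS:
        raise ValueError("at most 1024 source IDs may be passed")
    if order_by is not None and order_by not in ORDER_BY_VALUES:
        raise ValueError("order_by must be created_at or timestamp")
    if limit is not None and limit > MAX_LIMIT:
        raise ValueError("limit may not exceed 256")
    fields = {
        "comment_uids": comment_uids,
        "source_ids": source_ids,
        "order_by": order_by,
        "from": from_,
        "to": to,
        "continuation": continuation,
        "limit": limit,
    }
    return {key: value for key, value in fields.items() if value is not None}


print(build_export_body(order_by="created_at", from_="2021-01-01T00:00:00Z", limit=100))
```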
Response Format
NAME | TYPE | DESCRIPTION |
---|---|---|
comments | array<Comment> | A list of comments with their assigned and predicted labels. |
continuation | string | Pagination token to fetch the next limit number of comments. If there are no further comments, this field will not be present in the response. |
Where Comment has the following format:
NAME | TYPE | DESCRIPTION |
---|---|---|
comment | object | Comment object. The format is described in the Comment Reference. |
annotations | object | An object containing a single field labels.assigned which is a list of labels assigned to this comment. The format is described in the Labels Reference. Note that it won't include predictions as these labels are assigned, not predicted. |
predictions | object | An object containing a single field labels which is a list of labels predicted for this comment. The format is described in the Labels Reference. |
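Putting the request and response formats together, a full export can be read by repeating the request with the returned continuation token until the token is no longer present. The Python sketch below illustrates that loop and pulls label names out of each item. It makes several assumptions: `fetch_page` is a caller-supplied function (hypothetical) that POSTs a body to the export route and returns the decoded JSON, only limit and continuation are re-sent on each page, and the 0.5 probability cut-off is an arbitrary illustrative value.

```python
from typing import Callable, Dict, Iterator, List


def iter_export_comments(
    fetch_page: Callable[[dict], dict], limit: int = 64
) -> Iterator[dict]:
    """Yield every exported item, following continuation tokens until none is returned."""
    body = {"limit": limit}
    while True:
        page = fetch_page(dict(body))  # pass a copy so the callee cannot mutate our state
        yield from page.get("comments", [])
        token = page.get("continuation")
        if token is None:
            return  # no continuation field means there are no further comments
        body["continuation"] = token


def label_names(item: dict, min_probability: float = 0.5) -> Dict[str, List[str]]:
    """Collect assigned label names and predicted label names above a threshold."""
    assigned = item.get("annotations", {}).get("labels", {}).get("assigned", [])
    predicted = item.get("predictions", {}).get("labels", [])
    return {
        "assigned": [label["name"] for label in assigned],
        "predicted": [
            label["name"] for label in predicted
            if label["probability"] >= min_probability
        ],
    }


def fake_fetch(body: dict) -> dict:
    """Stand-in for the HTTP call: two canned pages keyed by continuation token."""
    pages = {
        None: {
            "comments": [{"comment": {"id": "a"}}, {"comment": {"id": "b"}}],
            "continuation": "token-1",
            "status": "ok",
        },
        "token-1": {"comments": [{"comment": {"id": "c"}}], "status": "ok"},
    }
    return pages[body.get("continuation")]


ids = [item["comment"]["id"] for item in iter_export_comments(fake_fetch, limit=2)]
print(ids)  # ['a', 'b', 'c']
```

In real use, `fake_fetch` would be replaced by a function wrapping `requests.post` against the export route, as in the Python example for this route.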