Communications Mining Developer Guide
Last updated Oct 3, 2024

Streams

Get comments from stream

Permissions required: Consume streams, View labels, View sources.

Note:
The /results route is the new way of fetching comments and their predictions from a stream, replacing the existing /fetch route (Streams - legacy). The /fetch route is maintained for legacy support, but we recommend the /results route for all new use cases, because it supports every use case, including those that use generative extraction.
  • Bash
    curl -X GET 'https://<my_api_endpoint>/api/preview/datasets/project1/collateral/streams/dispute/results?max_results=5&max_filtered=15' \
        -H "Authorization: Bearer $REINFER_TOKEN"
  • Node
    const request = require("request");
    
    request.get(
      {
        url: "https://<my_api_endpoint>/api/preview/datasets/project1/collateral/streams/dispute/results?max_results=5&max_filtered=15",
        headers: {
          Authorization: "Bearer " + process.env.REINFER_TOKEN,
        },
        // parse the response body as JSON so the callback receives an object
        json: true,
      },
      function (error, response, json) {
        // digest response
        console.log(JSON.stringify(json, null, 2));
      }
    );
  • Python
    import json
    import os
    
    import requests
    
    response = requests.get(
        "https://<my_api_endpoint>/api/preview/datasets/project1/collateral/streams/dispute/results",
        headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]},
        params={"max_results": 5, "max_filtered": 15},
    )
    
    print(json.dumps(response.json(), indent=2, sort_keys=True))
  • Response
    {
      "status": "ok",
      "results": [
        {
          "comment": {
            "uid": "18ba5ce699f8da1f.0123456789abcdef",
            "id": "0123456789abcdef",
            "timestamp": "2018-09-17T09:54:56.332000Z",
            "user_properties": {
              "number:Messages": 1,
              "string:Folder": "Sent (/ Sent)",
              "string:Has Signature": "Yes",
              "string:Message ID": "<abcdef@abc.company.com>",
              "string:Sender": "alice@company.com",
              "string:Sender Domain": "company.com",
              "string:Thread": "<abcdef@abc.company.com>"
            },
            "messages": [
              {
                "from": "alice@company.com",
                "to": [
                  "bob@organisation.org"
                ],
                "sent_at": "2018-09-17T09:54:56.332000Z",
                "body": {
                  "text": "Hi Bob,\n\nCould you send me today's figures?"
                },
                "subject": {
                  "text": "Today's figures"
                },
                "signature": {
                  "text": "Thanks,\nAlice"
                }
              }
            ],
            "text_format": "plain",
            "attachments": [],
            "source_id": "18ba5ce699f8da1f",
            "last_modified": "2024-07-03T13:30:53.991000Z",
            "created_at": "2020-12-14T15:07:03.718000Z",
            "context": "1",
            "has_annotations": true
          },
          "prediction": {
            "taxonomies": [
              {
                "name": "default",
                "labels": [
                  {
                    "name": "Margin Call",
                    "occurrence_confidence": {
                      "value": 0.9905891418457031,
                      "thresholds": ["stream"]
                    },
                    "extraction_confidence": {
                      "value": 0.4712367373372217,
                      "thresholds": []
                    },
                    "fields": [
                      {
                        "name": "Notification Date",
                        "value": null
                      }
                    ]
                  },
                  {
                    "name": "Margin Call > Interest Accrual",
                    "occurrence_confidence": {
                      "value": 0.9905891418457031,
                      "thresholds": []
                    },
                    "extraction_confidence": {
                      "value": 0.9905891418457031,
                      "thresholds": []
                    },
                    "fields": [
                      {
                        "name": "Amount",
                        "value": {
                          "formatted": "636,000.00"
                        }
                      },
                      {
                        "name": "Broker number",
                        "value": null
                      },
                      {
                        "name": "Client name",
                        "value": null
                      },
                      {
                        "name": "Currency",
                        "value": {
                          "formatted": "AUD"
                        }
                      }
                    ]
                  }
                ],
                "general_fields": [
                  {
                    "name": "monetary-quantity",
                    "value": {
                      "formatted": "636,000.00 GBP"
                    }
                  },
                  {
                    "name": "MarginCallDateType",
                    "value": {
                      "formatted": "2018-09-21 00:00 UTC"
                    }
                  },
                  {
                    "name": "client-name",
                    "value": {
                      "formatted": "Big Client Example Bank"
                    }
                  }
                ]
              }
            ]
          },
          "continuation": "pmjKYXYBAAADqHUvPkQf1ypNCZFR37vu"
        }
      ],
      "num_filtered": 0,
      "more_results": true,
      "continuation": "pmjKYXYBAAAsXghZ2niXPNP6tOIJtL_8"
    }

Once you create a stream, you can query it to fetch comments and their predictions. The predictions include labels, general fields, and label extractions. Each label extraction contains a set of extraction fields for every occurrence of that label.

Check these aspects when fetching comments from a stream:

Comment Queue

When you create a stream, its initial position is set to its creation time. If needed, you can move the stream to a different position (either forwards or backwards in time) using the reset endpoint. The stream returns comments starting from its current position. A comment's position in the queue is determined by the order in which the comments were uploaded.

Advancing Your Position in the Queue

As the stream only returns comments from its current position, you should advance it to the next position after each fetch request, by using the advance endpoint. This way, the API guarantees at-least-once processing of all comments. If your application fails while processing a batch, it will pick up the same batch on restart.
Note: Since an application can successfully process a comment but fail at the advance step, you can see a comment multiple times.

Depending on your application design, you can choose between:

  • Advancing the stream once for the whole batch, using the batch's continuation contained in the response.
  • Advancing the stream for each individual comment, using the comment's continuation contained in the response.
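
The two options above can be sketched as a single helper. This is a minimal illustration, not the official client: `fetch`, `advance`, and `handle` are hypothetical callables that you supply, standing in for the /results request, the advance request, and your own processing. Acknowledging an empty batch with the batch continuation is also an assumption of this sketch.

```python
from typing import Any, Callable, Dict


def consume_batch(
    fetch: Callable[[], Dict[str, Any]],
    advance: Callable[[str], None],
    handle: Callable[[Dict[str, Any]], None],
    per_comment: bool = False,
) -> int:
    """Fetch one batch, process each result, then advance the stream.

    All three callables are hypothetical placeholders provided by the caller.
    Returns the number of results that were processed.
    """
    batch = fetch()
    results = batch.get("results", [])
    for result in results:
        # If handle() raises, the stream has not been advanced past this
        # comment, so it is returned again on the next fetch.
        handle(result)
        if per_comment:
            advance(result["continuation"])
    if not per_comment or not results:
        # Assumption: a batch-level token is used for whole-batch advancing
        # and for acknowledging a batch that returned no comments.
        token = batch.get("continuation")
        if token is not None:
            advance(token)
    return len(results)
```

Advancing per comment limits how much work is repeated after a failure, at the cost of one advance request per comment. Advancing per batch needs a single request, but a failure midway means the whole batch is fetched again.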

Comment Filter

If you specify a comment_filter when creating the stream, comments that do not match the filter are not included in the results, but they still count towards the requested max_results. You can therefore see responses where every requested comment is filtered out, leading to an empty results array. In the example below, a batch of 8 comments is requested, all of which are filtered out.
{
  "filtered": 8,
  "results": [],
  "sequence_id": "qs8QcHIBAADJ1p3W2FtmBB3QiOJsCJlR",
  "status": "ok"
}
Pass the optional max_filtered parameter to prevent filtered comments from counting towards the requested max_results.

Prediction Thresholds

Note: The legacy /fetch route does not return comments with predictions that did not meet the confidence threshold.
The new /results route returns all the predictions for a comment, together with the confidence value of each prediction. It also indicates which type(s) of threshold each prediction meets.
  "occurrence_confidence": {
    "value": 0.9905891418457031,
    "thresholds": ["stream"]
  }
In this example, the confidence of the prediction is 0.9905..., and the thresholds value indicates that the prediction meets the threshold configured for the stream.
When building an automation, look for the stream value, to confirm that the prediction meets the threshold you configured in the stream.
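
As a minimal sketch of that check, the helper below keeps only the labels whose occurrence_confidence lists the stream threshold. The function name is hypothetical, and the field names are taken from the example response on this page rather than from a formal schema.

```python
from typing import Any, Dict, List


def labels_meeting_stream_threshold(result: Dict[str, Any]) -> List[str]:
    """Return the names of labels whose occurrence confidence meets the stream threshold."""
    names: List[str] = []
    prediction = result.get("prediction") or {}
    for taxonomy in prediction.get("taxonomies", []):
        for label in taxonomy.get("labels", []):
            confidence = label.get("occurrence_confidence") or {}
            if "stream" in confidence.get("thresholds", []):
                names.append(label["name"])
    return names


# Trimmed-down version of the example response shown earlier on this page.
sample_result = {
    "prediction": {
        "taxonomies": [
            {
                "name": "default",
                "labels": [
                    {
                        "name": "Margin Call",
                        "occurrence_confidence": {"value": 0.99, "thresholds": ["stream"]},
                    },
                    {
                        "name": "Margin Call > Interest Accrual",
                        "occurrence_confidence": {"value": 0.99, "thresholds": []},
                    },
                ],
            }
        ]
    }
}

print(labels_meeting_stream_threshold(sample_result))  # → ['Margin Call']
```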

For more information about generated extractions, and how to work with thresholds, check the Understanding validation on extractions and extraction performance page.

Request Format

NAME | TYPE | REQUIRED | DESCRIPTION
max_results | number | no | The number of comments to fetch from this stream. Fewer comments are returned if the end of the batch is reached, or if comments are filtered out by the comment filter. Max value is 32. Default is 16.
max_filtered | number | no | Convenience parameter for streams with a comment filter. When provided, up to max_filtered filtered comments do not count towards the requested max_results. This is useful if you expect a large number of comments not to match the filter. Has no effect on streams without a comment filter. Max value is 1024. Default is null.
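
The limits above can be checked on the client before a request is sent. The helper below is a hypothetical sketch: the function and constant names are not part of the API, and the lower bounds (1 for max_results, 0 for max_filtered) are assumptions, since only the maximum values are documented.

```python
from typing import Dict, Optional

MAX_RESULTS_LIMIT = 32     # documented maximum for max_results
MAX_FILTERED_LIMIT = 1024  # documented maximum for max_filtered


def build_results_params(
    max_results: int = 16, max_filtered: Optional[int] = None
) -> Dict[str, int]:
    """Build the query parameters for a /results request."""
    # Assumption: at least one comment must be requested.
    if not 1 <= max_results <= MAX_RESULTS_LIMIT:
        raise ValueError(f"max_results must be between 1 and {MAX_RESULTS_LIMIT}")
    params = {"max_results": max_results}
    if max_filtered is not None:
        # Assumption: zero is an acceptable lower bound.
        if not 0 <= max_filtered <= MAX_FILTERED_LIMIT:
            raise ValueError(f"max_filtered must be between 0 and {MAX_FILTERED_LIMIT}")
        params["max_filtered"] = max_filtered
    return params


print(build_results_params(5, 15))  # → {'max_results': 5, 'max_filtered': 15}
```

The returned dictionary can be passed as the params argument of requests.get, as in the Python example at the top of this page.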

Response Format

NAME | TYPE | DESCRIPTION
status | string | ok if the request is successful, or error in case of an error. To learn more about error responses, check the Overview page.
num_filtered | number | Number of comments that were filtered out based on a comment filter. If you created the stream without a filter, this number is always 0.
continuation | string | The batch continuation token. Use it to acknowledge the processing of this batch, and advance the stream to the next batch.
more_results | bool | True if there were additional results in the stream when you made the request. False otherwise.
results | array<Result> | An array containing result objects.
Where Result has the following format:
NAME | TYPE | DESCRIPTION
comment | Comment | Comment data. For a detailed explanation, see the Comment Reference.
continuation | string | The comment's continuation token. Use it to acknowledge the processing of this comment, and advance the stream to the next comment.
prediction | Prediction | The prediction for this comment. Available only if the stream specifies a model version. For more information about generative predictions, check the Communications Mining - Understanding validation on extractions and extraction performance page.
Prediction has the following format:
NAME | TYPE | DESCRIPTION
taxonomies | array<TaxonomyPrediction> | List of taxonomy predictions. Currently only one taxonomy is defined per dataset, but it is provided as a list for future compatibility.
TaxonomyPrediction has the following format:
NAME | TYPE | DESCRIPTION
name | string | Name of the taxonomy. The only value is currently default.
labels | array<LabelPrediction> | A list of extracted label predictions with their occurrence_confidence, extraction_confidence and extracted fields. For more information about generative predictions, check the Communications Mining - Understanding validation on extractions and extraction performance page.
general_fields | array<FieldPrediction> | A list of extracted general field predictions with their name and extracted value. For more information about generative predictions, check the Communications Mining - Understanding validation on extractions and extraction performance page.
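
To illustrate how these nested objects fit together, the sketch below walks one result and collects every field that has a value. The function name is hypothetical, and reading the value from a formatted key is an assumption based on the example response on this page.

```python
from typing import Any, Dict, List, Optional, Tuple

FieldRow = Tuple[Optional[str], str, str]


def extracted_fields(result: Dict[str, Any]) -> List[FieldRow]:
    """Collect (label name, field name, formatted value) for every non-null field.

    General fields are not attached to a label, so their label name is None.
    """
    rows: List[FieldRow] = []
    prediction = result.get("prediction") or {}
    for taxonomy in prediction.get("taxonomies", []):
        for label in taxonomy.get("labels", []):
            for field in label.get("fields", []):
                if field.get("value") is not None:
                    rows.append((label["name"], field["name"], field["value"]["formatted"]))
        for field in taxonomy.get("general_fields", []):
            if field.get("value") is not None:
                rows.append((None, field["name"], field["value"]["formatted"]))
    return rows
```

Applied to the example response at the top of this page, the helper would skip the fields whose value is null, such as Notification Date, and keep fields such as Amount and Currency.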
