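Important:

Communications Mining is now part of UiPath IXP. For more details, check the Introduction in the user guide. Please note that this content has been localized using machine translation. Localization of newly published content may take 1-2 weeks to complete.

Communications Mining Developer Guide

Last updated February 10, 2025

Fetch results from a stream

Permissions required: Consume streams, View labels, View sources.

Note:

The /results route is the new way to fetch comments and their predictions from a stream, replacing the existing /fetch route (Streams - legacy). The /fetch route is maintained for legacy support, but the /results route is recommended for all new use cases because it supports every possible use case, including those that use generative extraction.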

Bash

curl -X GET 'https://<my_api_endpoint>/api/v1/datasets/project1/collateral/streams/dispute/results?max_results=5&max_filtered=15' \
    -H "Authorization: Bearer $REINFER_TOKEN"

Node

const request = require("request");

request.get(
  {
    url: "https://<my_api_endpoint>/api/v1/datasets/project1/collateral/streams/dispute/results?max_results=5&max_filtered=15",
    headers: {
      Authorization: "Bearer " + process.env.REINFER_TOKEN,
    },
    // Parse the response body as JSON instead of returning a raw string.
    json: true,
  },
  function (error, response, json) {
    // Surface network-level failures before touching the body.
    if (error) throw error;
    // digest response
    console.log(JSON.stringify(json, null, 2));
  }
);

Python

import json
import os

import requests

response = requests.get(
    "https://<my_api_endpoint>/api/v1/datasets/project1/collateral/streams/dispute/results",
    headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]},
    params={"max_results": 5, "max_filtered": 15},
)

print(json.dumps(response.json(), indent=2, sort_keys=True))

Response

{
  "status": "ok",
  "results": [
    {
      "comment": {
        "uid": "18ba5ce699f8da1f.0123456789abcdef",
        "id": "0123456789abcdef",
        "timestamp": "2018-09-17T09:54:56.332000Z",
        "user_properties": {
          "number:Messages": 1,
          "string:Folder": "Sent (/ Sent)",
          "string:Has Signature": "Yes",
          "string:Message ID": "<abcdef@abc.company.com>",
          "string:Sender": "alice@company.com",
          "string:Sender Domain": "company.com",
          "string:Thread": "<abcdef@abc.company.com>"
        },
        "messages": [
          {
            "from": "alice@company.com",
            "to": [
              "bob@organisation.org"
            ],
            "sent_at": "2018-09-17T09:54:56.332000Z",
            "body": {
              "text": "Hi Bob,\n\nCould you send me today's figures?"
            },
            "subject": {
              "text": "Today's figures"
            },
            "signature": {
              "text": "Thanks,\nAlice"
            }
          }
        ],
        "text_format": "plain",
        "attachments": [],
        "source_id": "18ba5ce699f8da1f",
        "last_modified": "2024-07-03T13:30:53.991000Z",
        "created_at": "2020-12-14T15:07:03.718000Z",
        "context": "1",
        "has_annotations": true
      },
      "prediction": {
        "taxonomies": [
          {
            "name": "default",
            "labels": [
              {
                "name": "Margin Call",
                "occurrence_confidence": {
                  "value": 0.9905891418457031,
                  "thresholds": ["stream"]
                },
                "extraction_confidence": {
                  "value": 0.4712367373372217,
                  "thresholds": []
                },
                "fields": [
                  {
                    "name": "Notification Date",
                    "value": null
                  }
                ]
              },
              {
                "name": "Margin Call > Interest Accrual",
                "occurrence_confidence": {
                  "value": 0.9905891418457031,
                  "thresholds": []
                },
                "extraction_confidence": {
                  "value": 0.9905891418457031,
                  "thresholds": []
                },
                "fields": [
                  {
                    "name": "Amount",
                    "value": {
                      "formatted": "636,000.00"
                    }
                  },
                  {
                    "name": "Broker number",
                    "value": null
                  },
                  {
                    "name": "Client name",
                    "value": null
                  },
                  {
                    "name": "Currency",
                    "value": {
                      "formatted": "AUD"
                    }
                  }
                ]
              }
            ],
            "general_fields": [
              {
                "name": "monetary-quantity",
                "value": {
                  "formatted": "636,000.00 GBP"
                }
              },
              {
                "name": "MarginCallDateType",
                "value": {
                  "formatted": "2018-09-21 00:00 UTC"
                }
              },
              {
                "name": "client-name",
                "value": {
                  "formatted": "Big Client Example Bank"
                }
              }
            ]
          }
        ]
      },
      "continuation": "pmjKYXYBAAADqHUvPkQf1ypNCZFR37vu"
    }
  ],
  "num_filtered": 0,
  "more_results": true,
  "continuation": "pmjKYXYBAAAsXghZ2niXPNP6tOIJtL_8"
}

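Once you create a stream, you can query it to fetch comments and their predictions. The predictions include labels, general fields, and label extractions, which contain a set of extracted fields for each instance of that label.

Comment queue

When you create a stream, its initial position is set to its creation time. If needed, you can use the reset endpoint to move the stream to a different position (forwards or backwards in time). The stream returns comments starting from its current position. A comment's position in the comment queue is determined by the order in which the comments were uploaded.

Advancing your position in the queue

Because a stream only returns comments starting from its current position, you should use the advance endpoint after each fetch request to move it to the next position. This way, the API guarantees at-least-once processing of all comments. If your application fails while processing a batch, it picks up the same batch again on restart.

Note: Because an application can successfully process a comment but then fail at the advance step, you may see the same comment more than once.

Depending on your application design, you can choose to:

Advance the stream once for the whole batch. Use the batch's continuation contained in the response.
Advance the stream for each individual comment. Use the comment's continuation contained in the response.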
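The two advancing strategies above can be sketched as follows. This is a minimal illustration, not the reference implementation: `process_batch`, `handle_result`, and `advance` are hypothetical names, and `advance` stands in for whatever code sends a continuation token to the advance endpoint (that request is not shown on this page). The only response fields used are `results` and `continuation`, as shown in the response above.

```python
from typing import Any, Callable, Dict


def process_batch(
    batch: Dict[str, Any],
    handle_result: Callable[[Dict[str, Any]], None],
    advance: Callable[[str], None],
    per_comment: bool = False,
) -> int:
    """Process one /results response and advance the stream.

    `advance` is a hypothetical callable that sends a continuation token
    to the advance endpoint. It is injected so this sketch does not depend
    on how that request is made.
    """
    processed = 0
    for result in batch["results"]:
        handle_result(result)
        processed += 1
        if per_comment:
            # Acknowledge each comment as soon as it has been handled.
            advance(result["continuation"])
    if not per_comment:
        # Acknowledge the whole batch only after every comment succeeded.
        advance(batch["continuation"])
    return processed
```

If `handle_result` raises, the stream is not advanced past the failing comment, so the next fetch returns it again. That is the at-least-once behaviour described above, which is why `handle_result` should tolerate seeing the same comment twice.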

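Comment filter

If you specified a comment_filter when creating the stream, the results do not include comments that don't match the filter, but those comments still count towards the requested max_filtered. You may therefore see a response in which all max_filtered comments are filtered out, leading to an empty results array. In the following example, a batch of 8 comments was requested and all of them were filtered out.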

{
  "filtered": 8,
  "results": [],
  "sequence_id": "qs8QcHIBAADJ1p3W2FtmBB3QiOJsCJlR",
  "status": "ok"
}

Pass the optional max_filtered parameter to prevent filtered comments from counting towards the requested max_results.

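Prediction thresholds

Note: The legacy /fetch route does not return comments whose predictions do not meet the confidence thresholds.

The new /results route returns all predictions for a comment, together with their confidence value. It also indicates which types of threshold each prediction meets.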

"occurrence_confidence": {
  "value": 0.9905891418457031,
  "thresholds": ["stream"]
}

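Here the prediction has a confidence of 0.9905.., and the stream entry in thresholds indicates that the prediction meets the threshold configured for the stream.

When building automations, look for the stream value to confirm that a prediction meets the thresholds you configured on the stream.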
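As a sketch of that check, the helper below walks one entry of the `results` array and keeps only the labels whose `occurrence_confidence.thresholds` list contains `stream`. The function name `labels_meeting_stream_threshold` is hypothetical; the field names come from the response shown earlier.

```python
from typing import Any, Dict, List


def labels_meeting_stream_threshold(result: Dict[str, Any]) -> List[str]:
    """Return names of labels whose occurrence confidence meets the stream threshold."""
    names: List[str] = []
    # `prediction` may be absent when the stream does not specify a model version.
    prediction = result.get("prediction") or {}
    for taxonomy in prediction.get("taxonomies", []):
        for label in taxonomy.get("labels", []):
            thresholds = label["occurrence_confidence"]["thresholds"]
            if "stream" in thresholds:
                names.append(label["name"])
    return names
```

Applied to the example response above, only `Margin Call` would be returned, because `Margin Call > Interest Accrual` has an empty `thresholds` list despite its high confidence value.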

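For more information on generative extraction and how to use thresholds, check the Understanding validation on extractions and extraction performance page.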

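Request format

Name	Type	Required	Description
`max_results`	number	no	The number of comments to fetch for this stream. Fewer comments may be returned if the end of the batch is reached or if comments are filtered out by the comment filter. The maximum value is 32. The default is 16.
`max_filtered`	number	no	Convenience parameter for streams with a comment filter. When provided, up to `max_filtered` filtered comments do not count towards the requested `max_results`. This is useful if you expect a large number of comments not to match the filter. Has no effect on streams without a comment filter. The maximum value is 1024. The default is null.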
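The limits in the table can be checked on the client before a request is sent. The helper below is a minimal sketch: `build_results_params` is a hypothetical name, and the lower bounds (at least 1 for `max_results`, at least 0 for `max_filtered`) are assumptions, since the table only states the maximum values.

```python
from typing import Dict, Optional

MAX_RESULTS_LIMIT = 32     # documented maximum for max_results
MAX_FILTERED_LIMIT = 1024  # documented maximum for max_filtered


def build_results_params(
    max_results: Optional[int] = None,
    max_filtered: Optional[int] = None,
) -> Dict[str, int]:
    """Build the query parameters for a /results request.

    Parameters left as None are omitted so the server defaults apply.
    """
    params: Dict[str, int] = {}
    if max_results is not None:
        # Lower bound of 1 is an assumption, not stated in the table.
        if not 1 <= max_results <= MAX_RESULTS_LIMIT:
            raise ValueError(f"max_results must be between 1 and {MAX_RESULTS_LIMIT}")
        params["max_results"] = max_results
    if max_filtered is not None:
        # Lower bound of 0 is an assumption, not stated in the table.
        if not 0 <= max_filtered <= MAX_FILTERED_LIMIT:
            raise ValueError(f"max_filtered must be between 0 and {MAX_FILTERED_LIMIT}")
        params["max_filtered"] = max_filtered
    return params
```

The returned dictionary can be passed as the `params` argument of `requests.get`, as in the Python example at the top of this page.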

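Response format

Name	Type	Description
`status`	string	`ok` if the request is successful, or `error` if an error occurred. To learn more about error responses, check the Overview page.
`num_filtered`	number	Number of comments that were filtered out by the comment filter. If you created the stream without a filter, this number is always `0`.
`continuation`	string	The batch continuation token. Use it to acknowledge processing of this batch and advance the stream to the next batch.
`more_results`	bool	`true` if there were no additional results in the stream when the request was made, `false` otherwise.
`results`	array<Result>	An array containing result objects.

Where Result has the following format:

Name	Type	Description
`comment`	Comment	Comment data. For a detailed description, see the Comment reference.
`continuation`	string	The comment's continuation token. Use it to acknowledge processing of this comment and advance the stream to the next comment.
`prediction`	array<Prediction>	The prediction for this comment. Available only if the stream specifies a model version. For more information on generative predictions, check the Communications Mining - Understanding validation on extractions and extraction performance page.

Prediction has the following format:

Name	Type	Description
`taxonomies`	array<TaxonomyPrediction>	List of taxonomy predictions. Currently only one taxonomy is defined per dataset, but it is provided as a list for future compatibility.

TaxonomyPrediction has the following format:

Name	Type	Description
`name`	string	Name of the taxonomy. Currently the only value is `default`.
`labels`	array<LabelPrediction>	List of extracted label predictions with their `occurrence_confidence`, `extraction_confidence`, and extracted `fields`. For more information on generative predictions, check the Communications Mining - Understanding validation on extractions and extraction performance page.
`general_fields`	array<FieldPrediction>	List of extracted general field predictions with their `name` and extracted `value`. For more information on generative predictions, check the Communications Mining - Understanding validation on extractions and extraction performance page.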
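To show how these nested objects fit together, the sketch below flattens a `prediction` object into label-level fields and general fields. `collect_fields` and `formatted` are hypothetical helper names. The code assumes, based on the example response, that a field's `value` is either null or an object with a `formatted` key.

```python
from typing import Any, Dict, List, Optional, Tuple

LabelFields = List[Tuple[str, Dict[str, Optional[str]]]]
GeneralFields = Dict[str, List[Optional[str]]]


def formatted(value: Optional[Dict[str, Any]]) -> Optional[str]:
    """Return the `formatted` string of a field value, or None for a null value."""
    return None if value is None else value.get("formatted")


def collect_fields(prediction: Dict[str, Any]) -> Tuple[LabelFields, GeneralFields]:
    """Flatten a prediction into per-label fields and general fields."""
    # A list of pairs, because the same label name can occur more than once.
    label_fields: LabelFields = []
    # A list per name, because a general field name can occur more than once.
    general_fields: GeneralFields = {}
    for taxonomy in prediction.get("taxonomies", []):
        for label in taxonomy.get("labels", []):
            fields = {f["name"]: formatted(f["value"]) for f in label.get("fields", [])}
            label_fields.append((label["name"], fields))
        for field in taxonomy.get("general_fields", []):
            general_fields.setdefault(field["name"], []).append(formatted(field["value"]))
    return label_fields, general_fields
```

Label fields are kept as a list of pairs rather than a dictionary keyed by label name, so that multiple instances of the same label are not silently overwritten.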