
Communications Mining User Guide
- Labels
- General fields
Labels describe the entire message, for example, Cancellation, Trade failure, or Urgent. General fields refer to specific parts of the message, for example, Counterparty name, Customer ID, or Cancellation date.
In a downstream process, labels are used to triage, prioritize, and decide what kind of action should be taken. General fields are used to fill in fields of requests. For example, a downstream process may filter messages to those that have the Cancellation label, and then use the extracted Customer ID and Cancellation date general fields to call an API to automatically process the cancellation.
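As a rough illustration of the flow described above, the following Python sketch filters a prediction on the Cancellation label and collects the Customer ID and Cancellation date general fields into a request payload. The shape of the `labels` entries (a `name` and a `probability`), the field names `customer-id` and `cancellation-date`, the threshold value, and the helper functions are all assumptions made for this example, not the documented API.

```python
from typing import Optional

# Hypothetical threshold; in practice it would be chosen per label.
CANCELLATION_THRESHOLD = 0.5


def has_label(prediction: dict, label_name: str, threshold: float) -> bool:
    """Return True if the prediction carries the label with enough confidence."""
    return any(
        label["name"] == label_name and label["probability"] >= threshold
        for label in prediction.get("labels", [])
    )


def first_field_value(prediction: dict, field_name: str) -> Optional[str]:
    """Return the formatted value of the first matching general field, if any."""
    for entity in prediction.get("entities", []):
        if entity["name"] == field_name:
            return entity["formatted_value"]
    return None


def build_cancellation_request(prediction: dict) -> Optional[dict]:
    """Build a request payload, or return None if the message should be skipped."""
    if not has_label(prediction, "Cancellation", CANCELLATION_THRESHOLD):
        return None
    customer_id = first_field_value(prediction, "customer-id")
    cancellation_date = first_field_value(prediction, "cancellation-date")
    if customer_id is None or cancellation_date is None:
        # A required field is missing, so leave the message for manual handling.
        return None
    return {"customer_id": customer_id, "cancellation_date": cancellation_date}
```

The payload returned here would then be passed to whatever API processes the cancellation. Returning `None` keeps the decision to skip a message explicit for the caller.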
Communications Mining comes with a number of built-in general fields for common concepts, such as Organization, Currency Code, or Date. You can customize the built-in general fields of Communications Mining so that they are tailored to your specific use case. For example, Communications Mining has a highly trained pre-built Date general field which you can use as a starting point for a more customized general field such as Renewal Date or Cancellation Date. Alternatively, you can start from scratch and teach Communications Mining to recognize something completely new.
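This mailbox receives renewal requests, cancellation requests, and admin requests, which are occasionally urgent. Communications Mining™ has been trained to recognize each of these concepts, and its predictions are used to triage emails to the correct team by creating support tickets.
Because the policy number format is specific to this particular insurer, we configure that general field to be trainable from scratch. The insured organization, on the other hand, is a type of organization, so we configure it to be trainable based on the built-in Organization general field. Finally, we notice that agents do not always include their name in the email, so we decide to use the agent's email address (available from the comment metadata) to look up the corresponding name in an internal database, rather than extracting it as a general field.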
The following table summarizes these approaches.
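Configuration | When to use | Examples |
---|---|---|
Trainable general field with no base general field | Most often used for internal IDs of various kinds, or when Communications Mining has no suitable base general field. | Policy Number, Customer ID |
Trainable general field with a base general field | Used to customize an existing pre-built general field in Communications Mining. | Cancellation Date (based on Date), Insured Organization (based on Organization) |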
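Pre-built general field (not trainable) | Used for general fields that should be matched exactly as defined, where training would introduce errors. | ISIN |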
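Comment metadata instead of a general field | Used when the required information is already available in structured form in the comment metadata. | Sender address, sender domain |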
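Communications Mining™ provides several ways to fetch predictions, including predicted general fields. See the data download overview to find out which method best fits your use case.
Whichever method you choose, you need to be aware of the following edge cases and handle them in your application:
- Not all expected general fields are present in the response
- The response contains multiple matches for one or more general fields
- Not all general fields present in the response are correct
In this section, we cover each of these edge cases in more detail.
Not all general fields are present in the response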
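A minimal sketch of how an application might detect this case, assuming the predicted general fields arrive as a list of dictionaries with a `name` key, as in the example response in this guide. The set of expected field names and the `missing_fields` helper are hypothetical.

```python
# Hypothetical set of general fields the downstream process needs.
EXPECTED_FIELDS = {"policy-number", "insured-organization"}


def missing_fields(entities: list, expected: set) -> set:
    """Return the expected field names that do not appear in the response."""
    present = {entity["name"] for entity in entities}
    return expected - present
```

If the returned set is not empty, the application could, for example, route the message to manual review instead of processing it automatically.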
The response contains multiple matches for one or more general fields
Note that you can use the metadata in the response when handling such cases. For example, we can choose to preferentially pick policy numbers that appear in the email subject over those that appear in the email body. The following example shows the response that the API will return for our example email.
{
  "predictions": [
    {
      "uid": "aa05ba2250de48e3.7588b85f68f81c3b",
      "labels": [...],
      "entities": [
        {
          "id": "6a1d11118b60868e",
          "name": "policy-number",
          "span": {
            "content_part": "body",
            "message_index": 0,
            "utf16_byte_start": 200,
            "utf16_byte_end": 222,
            "char_start": 100,
            "char_end": 111
          },
          "kind": "policy-number",
          "formatted_value": "GHI-0204963"
        },
        {
          "id": "6a1d11118b60868e",
          "name": "policy-number",
          "span": {
            "content_part": "subject",
            "message_index": 0,
            "utf16_byte_start": 0,
            "utf16_byte_end": 22,
            "char_start": 0,
            "char_end": 11
          },
          "kind": "policy-number",
          "formatted_value": "GHI-0068448"
        },
        {...},
        {...},
        {...}
      ]
    }
  ],
  "model": {
    "version": 31,
    "time": "2021-07-14T15:00:57.608000Z"
  },
  "status": "ok"
}
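The preference for the subject over the body could be sketched in Python roughly as follows. It relies only on the `name` and `span` keys shown in the response above. The `PART_PREFERENCE` ranking, the tie-break on `char_start`, and the `pick_preferred` helper are assumptions made for this example.

```python
from typing import Optional

# Lower rank means higher preference; parts not listed here rank last.
PART_PREFERENCE = {"subject": 0, "body": 1}


def pick_preferred(entities: list, field_name: str) -> Optional[dict]:
    """Pick one match for a field, preferring the subject over the body.

    Ties are broken by the earliest character offset within the part.
    """
    matches = [e for e in entities if e["name"] == field_name]
    if not matches:
        return None
    return min(
        matches,
        key=lambda e: (
            PART_PREFERENCE.get(e["span"]["content_part"], len(PART_PREFERENCE)),
            e["span"]["char_start"],
        ),
    )
```

Applied to the two `policy-number` matches in the response above, this selects the match from the subject, `GHI-0068448`.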