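Communications Mining user guide

Comments

Each message in Communications Mining™ is represented by a single comment object in the API, so the two can be treated as equivalent. The developer documentation and the API mostly refer to comments, while the user guide and the Communications Mining user interface mostly refer to messages.

When you upload data to Communications Mining or fetch data from it, it is important to understand how different kinds of data (such as emails or support tickets) are represented as comments. This page explains how to model your data as Communications Mining comments in preparation for upload, and how to interpret data fetched from Communications Mining.

Example of a comment created from an email

A Communications Mining™ comment created from a review

The Overview section describes the overall structure of a comment object. If you want to upload data to Communications Mining™ through the API, or understand how to work with data that was uploaded through the API, see the Comments created via API section. It contains a detailed description of each common comment type (email or support ticket). If you want to better understand how to work with data uploaded through an integration, see the Comments created by integrations section. Finally, for a complete list of available comment object fields, see the Reference section.

Overview

Communications Mining™ can process many kinds of text data, such as emails, survey responses, support tickets, or customer reviews. What these kinds of data have in common is that they are made up of units of communication (an email, a survey response, a support ticket, a customer review). In Communications Mining, each such unit, for example a single message, is represented as a comment.

Regardless of which unit of communication a comment represents, it always has the following basic structure: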

{
  "id": <UNIQUE ID>,
  "timestamp": <TIMESTAMP>,
  "messages": [
    {
      "body": { "text": <TEXT> },
      ...
    }
  ],
  "user_properties": { ... }
}

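As the preceding snippet shows, in addition to the actual text, a comment always has an ID and a timestamp. The ID must be unique within the source that contains the messages. The timestamp is used for filtering and sorting by date in the platform user interface, and for generating date-based analytics.

Beyond these required fields, set additional fields depending on the type of comment. If the data was uploaded to Communications Mining™ through an integration, Communications Mining fills in all required fields automatically. See the following sections for more detailed descriptions.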
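The basic structure described in this section can be sketched in Python as follows. This is a minimal illustration rather than an official client: the helper name `make_comment` is hypothetical, generating the ID with `uuid4` is only one possible way to obtain a unique hex string, and treating naive datetimes as UTC is an assumption of the sketch.

```python
import json
import uuid
from datetime import datetime, timezone


def make_comment(text, created, user_properties=None):
    """Build a dict with the basic comment structure (hypothetical helper)."""
    if created.tzinfo is None:
        # Assumption of this sketch: naive datetimes are treated as UTC.
        created = created.replace(tzinfo=timezone.utc)
    return {
        # Random hex string; any scheme works as long as IDs stay unique in the source.
        "id": uuid.uuid4().hex,
        "timestamp": created.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "messages": [{"body": {"text": text}}],
        "user_properties": user_properties or {},
    }


comment = make_comment(
    "Hi Bob,\n\nCould you send me the figures for today?",
    datetime(2021, 8, 3, 9, 57, 42),
    {"string:Team": "Team XYZ"},
)
print(json.dumps(comment, indent=2))
```

The printed object has the same four top-level keys as the structure above; only the `id` value differs between runs.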

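Comments created via API

Emails

While the easiest way to sync emails into Communications Mining™ is through the Exchange integration, you can also sync emails through the API if you extract them yourself. For raw emails, use the sync-raw-emails endpoint; for processed emails, use the sync endpoint.

When syncing raw emails, provide the extracted MIME email headers and email body as they are (see the Reference section for a description of the raw email format). Communications Mining parses the headers and cleans the email body.

Note:

For brevity, the raw email example below shows only a handful of headers. Send all extracted headers to Communications Mining; these may be considerably longer than the headers in the example.

Important:

How does Communications Mining process raw emails?

Sets email-specific fields in the message object messages[0].
Sets the thread_id field and the thread_properties object.
Cleans the email body by stripping quoted emails and moving the signature into a separate signature field.
Populates the user_properties object with metadata extracted from the email headers. If a field is not present in the email, it is not set in the comment at all (rather than being set to null or an empty value). For example, the comment in the example below does not contain a BCC: field.

If you enrich your emails with additional data before uploading them to Communications Mining, you can provide that additional data in the comment's user properties.

After processing, a raw email looks like the processed email example below. Note how many additional fields Communications Mining creates. If you want to upload emails that are already processed, structure them as shown in the processed email example.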
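As a rough illustration of providing headers and body as they are, the sketch below splits a MIME message into the raw email shape using Python's standard `email` module. The function name `to_raw_email_payload` is hypothetical, and the sketch assumes a simple, non-multipart message with an unencoded plain-text body; real mailboxes usually need more careful handling.

```python
from email import message_from_string


def to_raw_email_payload(mime_text, user_properties=None):
    """Split a simple, non-multipart MIME message into the raw email shape."""
    msg = message_from_string(mime_text)
    # Keep every header, one per line, in the order it appeared.
    raw_headers = "\n".join(f"{name}: {value}" for name, value in msg.items())
    payload = {
        "raw_email": {
            "body": {"plain": msg.get_payload()},
            "headers": {"raw": raw_headers},
        }
    }
    if user_properties:
        # Extra data you collected yourself goes into user_properties.
        payload["user_properties"] = user_properties
    return payload


mime_text = (
    "From: Alice Smith <alice@example.com>\n"
    "Date: Tue, 3 Aug 2021 10:57:42 +0100\n"
    "Message-ID: <e7784b5b@mail.example.com>\n"
    "Subject: Figures for today\n"
    "To: Bob <bob@company.com>\n"
    "Cc: Joe <joe@company.com>\n"
    "\n"
    "Hi Bob,\n\nCould you send me the figures for today?\n\nThanks,\nAlice"
)
payload = to_raw_email_payload(mime_text, {"string:Team": "Team XYZ"})
print(payload["raw_email"]["headers"]["raw"])
```

The sketch deliberately does not strip quoted text or signatures, since the section above states that Communications Mining does this cleaning itself for raw emails.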

Email examples

Raw email

{
  "raw_email": {
    "body": {
      "plain": "Hi Bob,\n\nCould you send me the figures for today?\n\nThanks,\nAlice"
    },
    "headers": {
      "raw": "From: Alice Smith <alice@example.com>\nDate: Tue, 3 Aug 2021 10:57:42 +0100\nMessage-ID: <e7784b5b@mail.example.com>\nSubject: Figures for today\nTo: Bob <bob@company.com>\nCc: Joe <joe@company.com>"
    }
  },
  "user_properties": {
    "string:Team": "Team XYZ"
  }
}

Processed email

{
  "comment": {
    "id": "3c6537373834623562406d61696c2e6578616d706c652e636f6d3e",
    "timestamp": "2021-08-03T09:57:42Z",
    "user_properties": {
      "string:Has Signature": "Yes",
      "string:Sender": "alice@example.com",
      "string:Thread": "<e7784b5b@mail.example.com>",
      "string:Message ID": "<e7784b5b@mail.example.com>",
      "number:Recipient Count": 2,
      "number:Participant Count": 3,
      "number:Position in Thread": 1,
      "string:Sender Domain": "example.com",
      "string:Team": "Team XYZ"
    },
    "messages": [
      {
        "body": {
          "text": "Hi Bob,\n\nCould you send me the figures for today?"
        },
        "signature": {
          "text": "Thanks,\nAlice"
        },
        "subject": {
          "text": "Figures for today"
        },
        "to": ["\"Bob\" <bob@company.com>"],
        "cc": ["\"Joe\" <joe@company.com>"],
        "sent_at": "2021-08-03T09:57:42Z",
        "from": "\"Alice Smith\" <alice@example.com>"
      }
    ],
    "thread_id": "3c6537373834623562406d61696c2e6578616d706c652e636f6d3e"
  },
  "thread_properties": {
    "duration": null,
    "response_time": null,
    "num_messages": 1,
    "num_participants": 3,
    "first_sender": "alice@example.com",
    "thread_position": 0
  }
}

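Thread properties

The following thread properties are available.

Name	Description
`thread_position`	Position of the comment within the thread, calculated by ordering the comments by `timestamp`. Starts at `0`.
`num_messages`	Number of comments in the thread.
`num_participants`	Total number of unique participants (senders and To, Cc, and Bcc recipients) in the thread.
`first_sender`	Sender of the first comment in the thread.
`duration`	Difference (in seconds) between the `timestamp` values of the first and last comments in the thread. Set to `null` if `num_messages` is 1 (that is, the thread contains only one comment). Note: a comment's `timestamp` corresponds to the `sent_at` field of the respective raw email.
`response_time`	Difference (in seconds) between the first comment in the thread and the first response in the thread. The first response is the earliest comment whose sender is not `first_sender`. Set to `null` if the thread contains no responses (that is, all emails in the thread are from the same sender).

Each time a new comment is added to the platform, the thread properties of the corresponding thread are updated.

Note:

All properties except thread_position are the same for every comment in a thread.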
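To make the definitions in the table concrete, the following sketch recomputes the thread properties for a small in-memory thread. It only illustrates the definitions and is not how Communications Mining implements them: the input shape (dicts with `id`, `timestamp`, `from`, and `participants`) and the function names are hypothetical simplifications.

```python
from datetime import datetime


def parse_ts(ts):
    # Accept a trailing "Z" on Python versions whose fromisoformat rejects it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def thread_properties(comments):
    """Return {comment id: thread properties} for the comments of one thread."""
    ordered = sorted(comments, key=lambda c: parse_ts(c["timestamp"]))
    first, last = ordered[0], ordered[-1]
    start = parse_ts(first["timestamp"])

    participants = set()
    for c in ordered:
        participants.update(c["participants"])

    # duration stays None (null) when the thread has a single comment.
    duration = None
    if len(ordered) > 1:
        duration = (parse_ts(last["timestamp"]) - start).total_seconds()

    # First response: earliest comment whose sender differs from the first sender.
    response_time = None
    for c in ordered:
        if c["from"] != first["from"]:
            response_time = (parse_ts(c["timestamp"]) - start).total_seconds()
            break

    shared = {
        "num_messages": len(ordered),
        "num_participants": len(participants),
        "first_sender": first["from"],
        "duration": duration,
        "response_time": response_time,
    }
    # Only thread_position differs between comments of the same thread.
    return {c["id"]: {**shared, "thread_position": i} for i, c in enumerate(ordered)}


thread = [
    {"id": "b", "timestamp": "2021-08-03T10:27:42Z", "from": "bob@company.com",
     "participants": ["bob@company.com", "alice@example.com"]},
    {"id": "a", "timestamp": "2021-08-03T09:57:42Z", "from": "alice@example.com",
     "participants": ["alice@example.com", "bob@company.com", "joe@company.com"]},
]
props = thread_properties(thread)
print(props["a"])
print(props["b"])
```

In this usage example the reply arrives 30 minutes after the first email, so both `duration` and `response_time` come out as 1800 seconds, and the two comments differ only in `thread_position`.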

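Support tickets

In addition to the body text, a typical support ticket submitted through a form may contain a subject, sender information (such as a name or an email address), and other structured data (such as the ticket topic), which can be uploaded as part of the comment's user properties.

The following example shows how a support ticket can be formatted as a Communications Mining™ comment, and how that comment is displayed in the platform's user interface. Your user properties may differ depending on the data you collect.

Support ticket example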

{
  "id": "dbcb03ad",
  "timestamp": "2020-02-26T16:09:00Z",
  "messages": [
    {
      "body": {
        "text": "Hi Support Team\n\nPlease could you look into my broadband service network status. I don't have any signal."
      },
      "subject": {
        "text": "Network Outage for over 24 hours - Customer account number 1234567"
      },
      "from": "alice.smith@example.com"
    }
  ],
  "user_properties": {
    "string:Customer Name": "Alice Smith",
    "string:Source": "Support Form",
    "string:Topic": "Broadband"
  }
}
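The mapping from a form submission to a comment like the one above can be sketched as follows. The input field names (`ticket_id`, `submitted_at`, `extra`, and so on) are hypothetical, as are both helper functions; the `string:` and `number:` key prefixes follow the convention visible in the examples on this page, and storing booleans as "Yes"/"No" text is a choice made for this sketch.

```python
def typed_user_properties(fields):
    """Prefix each key with "string:" or "number:" based on the value's type."""
    typed = {}
    for name, value in fields.items():
        if isinstance(value, bool):
            # bool is a subclass of int in Python, so handle it before numbers.
            typed[f"string:{name}"] = "Yes" if value else "No"
        elif isinstance(value, (int, float)):
            typed[f"number:{name}"] = value
        else:
            typed[f"string:{name}"] = str(value)
    return typed


def ticket_to_comment(ticket):
    """Map a hypothetical form-submission dict onto the comment structure."""
    return {
        "id": ticket["ticket_id"],
        "timestamp": ticket["submitted_at"],
        "messages": [
            {
                "body": {"text": ticket["message"]},
                "subject": {"text": ticket["subject"]},
                "from": ticket["email"],
            }
        ],
        "user_properties": typed_user_properties(ticket["extra"]),
    }


ticket = {
    "ticket_id": "dbcb03ad",
    "submitted_at": "2020-02-26T16:09:00Z",
    "message": "Hi Support Team\n\nI don't have any signal.",
    "subject": "Network Outage for over 24 hours",
    "email": "alice.smith@example.com",
    # "Open Tickets" is an invented field, included only to show a number property.
    "extra": {"Customer Name": "Alice Smith", "Topic": "Broadband", "Open Tickets": 2},
}
comment = ticket_to_comment(ticket)
print(comment["user_properties"])
```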

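Comments created by integrations

Emails (Microsoft Exchange)

As with raw emails, Microsoft Exchange emails ingested into Communications Mining through the Exchange integration are automatically converted into comment objects.

Attachments and attachment content

A comment may have files attached. If a comment has attachments, the attachments field contains metadata about them: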

{
  "id": "3c484531505230324d423",
  "attachments": [
    {
      "name": "account-statement.pdf",
      "size": 49078,
      "content_type": "application/pdf"
    }
  ],
  // other comment fields omitted
  ...
}

In addition, the content of an attachment may be available for download. When it is, the attachment_reference field is returned:

{
  "id": "3c484531505230324d423",
  "attachments": [
    {
      "name": "account-statement.pdf",
      "size": 49078,
      "content_type": "application/pdf",
      "attachment_reference": "CjQSEIExTHEqtdntoxz2WtbZDNEiIIVqcP1Sfx2L4epyRQDasa1RSODvheQ3bvLhj3L-_81G"
    }
  ],
  // other comment fields omitted
  ...
}

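Use the attachment_reference to retrieve the binary file content from the attachments API. For the previous example, you would request the following URL: https://cloud.uipath.com/ / /reinfer_/api/v1/attachments/CjQSEIExTHEqtdntoxz2WtbZDNEiIIVqcP1Sfx2L4epyRQDasa1RSODvheQ3bvLhj3L-_81G

For more details about this request, see the API reference.

If an attachment object has no attachment_reference property, you cannot download the content of that attachment. This can happen because:

Communications Mining™ did not receive the content of the attachment.
The attachment content exceeded the size limit for upload to Communications Mining.
Communications Mining processed the attachment before file content was supported.

Learn more about attachment content on the Attachments page.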
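The rules in this section can be sketched as a small helper that collects download URLs only for attachments that carry an attachment_reference. The function name and the `api_base` parameter are hypothetical: `api_base` stands for whatever precedes `/attachments/` in the URL for your environment, and the value used below is a placeholder, not a real endpoint. No request is sent in this sketch.

```python
def attachment_urls(comment, api_base):
    """Return {file name: download URL} for attachments that can be downloaded."""
    urls = {}
    for attachment in comment.get("attachments", []):
        reference = attachment.get("attachment_reference")
        if reference is None:
            # Without a reference the attachment content cannot be downloaded.
            continue
        urls[attachment["name"]] = f"{api_base.rstrip('/')}/attachments/{reference}"
    return urls


comment = {
    "id": "3c484531505230324d423",
    "attachments": [
        {
            "name": "account-statement.pdf",
            "size": 49078,
            "content_type": "application/pdf",
            "attachment_reference": "CjQSEIExTHEqtdntoxz2WtbZDNEiIIVqcP1Sfx2L4epyRQDasa1RSODvheQ3bvLhj3L-_81G",
        },
        # Metadata only: this attachment has no downloadable content.
        {"name": "notes.txt", "size": 120, "content_type": "text/plain"},
    ],
}
# Placeholder base URL, used only for illustration.
print(attachment_urls(comment, "https://example.invalid/api/v1"))
```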

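Reference

Comments

The following table lists the available comment fields. If you are not familiar with the Communications Mining™ comment object, see the Overview section.

Name	Type	Required	Description
`id`	string	Yes	Identifies a comment uniquely within a source. Any hexadecimal string of at most 1024 characters is valid (matching `/[0-9a-f]{1,1024}/`).
`timestamp`	string	Yes	An ISO-8601 timestamp indicating when the comment was created. If the timestamp does not specify a time zone, UTC is assumed. The timestamp must be in the range 1950-01-01T00:00:00Z to 2049-12-31T23:59:59Z (inclusive).
`messages`	`array<Message>`	Yes	An array containing zero or one messages.
`user_properties`	`map<string, string | number>`	No	
`thread_id`	string	No	An ID that uniquely identifies an email thread. Any hexadecimal string of at most 1024 characters is valid (matching `/[0-9a-f]{1,1024}/`).
`uid`	string	Set by Communications Mining™	The combined source ID and comment ID, in the format `source_id.comment_id`. Do not set this field directly; Communications Mining generates it automatically for uploaded comments.
`created_at`	string	Set by Communications Mining	An ISO-8601 timestamp with the same constraints as the `timestamp` field. Do not set this field directly; Communications Mining generates it automatically when the comment is created.
`updated_at`	string	Set by Communications Mining	An ISO-8601 timestamp with the same constraints as the `timestamp` field. Do not set this field directly; Communications Mining generates it automatically when the comment is updated.
`attachments`	`array<Attachment>`	No	An array of zero or more attachments. An attachment represents a file attached to the comment.

Where Attachment has the following format:

Name	Type	Required	Description
`name`	string	Yes	The file name of the attachment.
`size`	number	Yes	The size of the attachment's file content, in bytes.
`content_type`	string	Yes	The media type of the attachment. For a list of possible values, see the IANA media types list.
`attachment_reference`	string	No	Used to retrieve the binary file content from the attachments API.

Where Message has the following format:

Name	Type	Required	Description
`body`	Content	Yes	An object containing the main body text of the message.
`subject`	Content	No	An object containing the subject of the message.
`signature`	Content	No	An object containing the signature of the message.
`from`	string	No	The sender of the message.
`to`	`array<string>`	No	An array of primary recipients.
`cc`	`array<string>`	No	An array of carbon-copy recipients.
`bcc`	`array<string>`	No	An array of blind-carbon-copy recipients.
`sent_at`	string	No	An ISO-8601 timestamp indicating when the message was created. If the timestamp does not specify a time zone, UTC is assumed.
`language`	string	No	The original language of the message. If provided, both `text` and `translated_from` should be provided for the Content fields.

Where Content has the following format:

Name	Type	Required	Description
`text`	string	Yes	If `language` was provided (and is not the source's `language`), this should be the translated text of the content. Otherwise, it should be in the original language in which it was collected; if the content is not in the source's `language` and the source has `should_translate` set to `true`, it will be translated. At most 65536 characters.
`translated_from`	string	No	If `language` was provided (and is not the source's `language`), this should be the original text of the content. Providing this field without providing `language` results in an error. At most 65536 characters.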
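The `id` and `timestamp` constraints listed in the comment table can be checked before upload with a sketch like the one below. All function names are hypothetical. Hex-encoding an existing identifier is one possible way to satisfy the `id` pattern: the `id` in the processed email example on this page equals the hex encoding of that email's Message-ID, although this page does not state that IDs are always derived this way.

```python
import re
from datetime import datetime, timezone

ID_PATTERN = re.compile(r"[0-9a-f]{1,1024}")
MIN_TS = datetime(1950, 1, 1, tzinfo=timezone.utc)
MAX_TS = datetime(2049, 12, 31, 23, 59, 59, tzinfo=timezone.utc)


def to_hex_id(identifier):
    """Hex-encode an arbitrary string so it only uses the characters 0-9 and a-f."""
    return identifier.encode("utf-8").hex()


def is_valid_id(value):
    """Check the whole string against the documented id pattern."""
    return ID_PATTERN.fullmatch(value) is not None


def parse_timestamp(value):
    """Parse an ISO-8601 timestamp, assuming UTC when no time zone is given."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    if not MIN_TS <= parsed <= MAX_TS:
        raise ValueError(f"timestamp out of range: {value}")
    return parsed


comment_id = to_hex_id("<e7784b5b@mail.example.com>")
print(comment_id, is_valid_id(comment_id))
print(parse_timestamp("2021-08-03T10:57:42+01:00").astimezone(timezone.utc))
```

A raw identifier such as `<e7784b5b@mail.example.com>` fails `is_valid_id` because of characters like `<` and `@`, which is why the sketch encodes it first.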

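Raw emails

The following table lists the available raw email fields.

Name	Type	Required	Description
`headers`	Headers	Yes	An object containing the email headers.
`body`	Body	Yes	An object containing the email body.

Where Headers has the following format:

Name	Type	Required	Description
`raw`	string	No	One of `raw` and `parsed` is required. The raw email headers, given as a single string with each header on its own line.
`parsed`	`map<string, string | array<string>>`	No	

Where Body has the following format:

Name	Type	Required	Description
`plain`	string	No	At least one of `plain` and `html` is required. The plain-text content of the email. At most 65536 characters.
`html`	string	No	At least one of `plain` and `html` is required. The HTML content of the email.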
