Automation Suite Installation Guide

Last updated: November 21, 2024

Document Understanding configuration file

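documentunderstanding is a property in the Automation Suite configuration file, cluster_config.json. It contains the configurable values that control the behavior of the Document Understanding service. The installer generates default values, and you can make further changes to configure the service. To change any Document Understanding setting, edit the documentunderstanding section of cluster_config.json and rerun the installer.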

Alternatively, you can make the same changes in the UiPath application in ArgoCD.
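The edit described above can be sketched in Python. This is a minimal illustration, not part of the installer: the helper name update_du_section is hypothetical, and writing max_cpu_per_pod as a string is an assumption based on the configuration example on this page. The function works on a configuration that is already loaded into a dictionary and returns a modified copy.

```python
import copy
import json


def update_du_section(cluster_config, *, max_cpu_per_pod=None, sql_connection_str=None):
    """Return a copy of cluster_config with documentunderstanding values changed.

    Hypothetical helper for illustration; only the keys documented on this
    page are touched, and the input dictionary is left unmodified.
    """
    updated = copy.deepcopy(cluster_config)
    du = updated.setdefault("documentunderstanding", {})
    if max_cpu_per_pod is not None:
        # Assumption: the value is stored as a string, as in the example config.
        du.setdefault("handwriting", {})["max_cpu_per_pod"] = str(max_cpu_per_pod)
    if sql_connection_str is not None:
        du.setdefault("datamanager", {})["sql_connection_str"] = sql_connection_str
    return updated


if __name__ == "__main__":
    # Stand-in for the parsed contents of cluster_config.json.
    config = json.loads('{"documentunderstanding": {"enabled": true}}')
    print(json.dumps(update_du_section(config, max_cpu_per_pod=4), indent=2))
```

In practice you would read cluster_config.json with json.load, apply the change, write the result back with json.dump, and then rerun the installer as described above.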

cluster_config.json

Document Understanding configuration

"documentunderstanding": {
    "enabled": Boolean,
    "datamanager": {
      "sql_connection_str": "String"
    },
    "handwriting": {
      "enabled": Boolean,
      "max_cpu_per_pod": "Number"
    }
  }

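Note:

The Data Manager SQL connection string is optional. Set it only if you want to override the default database with your own.

Handwriting is always enabled for online installations.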
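As a rough sanity check before rerunning the installer, the structure above can be validated with a few lines of Python. This is a sketch under stated assumptions: the function name validate_du_section is hypothetical, every key is treated as optional, and max_cpu_per_pod is assumed to be any value that parses as a positive number. The installer's own validation rules may differ.

```python
def validate_du_section(section):
    """Return a list of problems found in a documentunderstanding section.

    Hypothetical checker for illustration; an empty list means no problem
    was detected by these basic type checks.
    """
    problems = []

    if "enabled" in section and not isinstance(section["enabled"], bool):
        problems.append("enabled must be a boolean")

    datamanager = section.get("datamanager", {})
    if not isinstance(datamanager, dict):
        problems.append("datamanager must be an object")
    else:
        conn = datamanager.get("sql_connection_str")
        if conn is not None and not isinstance(conn, str):
            problems.append("datamanager.sql_connection_str must be a string")

    handwriting = section.get("handwriting", {})
    if not isinstance(handwriting, dict):
        problems.append("handwriting must be an object")
    else:
        if "enabled" in handwriting and not isinstance(handwriting["enabled"], bool):
            problems.append("handwriting.enabled must be a boolean")
        if "max_cpu_per_pod" in handwriting:
            try:
                cpus = float(handwriting["max_cpu_per_pod"])
            except (TypeError, ValueError):
                problems.append("handwriting.max_cpu_per_pod must be numeric")
            else:
                if not cpus > 0:
                    problems.append("handwriting.max_cpu_per_pod must be positive")

    return problems
```

Pass it the value of the documentunderstanding key from the parsed cluster_config.json and review any messages it returns.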

Full configuration example

"documentunderstanding": {
    "enabled": true,
    "datamanager": {
      "sql_connection_str": "mssql+pyodbc://testadmin:myPassword@mydev-sql.database.windows.net:1433/datamanager?driver=ODBC+Driver+17+for+SQL+Server"
    },
    "handwriting": {
      "enabled": true,
      "max_cpu_per_pod": "2"
    }
  }

Note: The default value of max_cpu_per_pod is 2, but you can adjust it to suit your needs. For details, see the (Optional) Maximum CPU per pod parameter section.

Configurable values

datamanager.sql_connection_str

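The connection string for Data Manager.
Required: False.
The installer generates and populates this property. You do not need to set it unless you want to override the default connection string. For more details about connecting to SQL, see the database configuration page.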

handwriting

Settings for the handwriting recognition feature (part of Intelligent Form Extractor).
Required: False.

handwriting.enabled

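Set this to True to create the resources required for handwriting recognition. It must be True to use Intelligent Form Extractor.
Required: False.
For online installations, this property is always enabled. For offline (air-gapped) installations, it is disabled, and you must install the Document Understanding offline bundle before you can enable handwriting.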

handwriting.max_cpu_per_pod

The maximum number of CPUs that each container is allowed to use. The recommended value is 2.
Required: False.
Default: 2.

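If you plan to use Intelligent Form Extractor with handwriting detection, you may need to adjust the handwriting.max_cpu_per_pod parameter to increase processing capacity.

To calculate the right size, take the following factors into account: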

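Total volume of documents per year = V
Expected handwritten snippets per document = S
Days per year on which the workflow processes documents (weekdays, all days, weekends, etc.) = d
Hours per day during which the workflow processes documents = h
Number of CPUs = (V x S / (d x h)) / 1500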

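For example, suppose you expect to process one million documents per year with Intelligent Form Extractor, with an average of 50 handwritten snippets per document, and the workflow runs on weekdays (about 250 days per year) from 00:00 to 08:00 (8 hours). The calculation is then: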

Number of CPUs = (1,000,000 x 50 / (250 x 8)) / 1500
               = 25,000 / 1500
               ≈ 17 CPUs
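The sizing formula above can be expressed as a small Python function. This is a minimal sketch: the function name required_cpus and the parameter names are illustrative, the constant 1500 is taken directly from the formula, and rounding up to a whole CPU is an assumption that matches the result of 17 in the example.

```python
import math

# Divisor taken from the sizing formula on this page.
FORMULA_DIVISOR = 1500


def required_cpus(docs_per_year, snippets_per_doc, days_per_year, hours_per_day):
    """Estimate the number of CPUs as (V x S / (d x h)) / 1500, rounded up."""
    if days_per_year <= 0 or hours_per_day <= 0:
        raise ValueError("days_per_year and hours_per_day must be positive")
    # Snippets that must be processed per working hour: V x S / (d x h).
    snippets_per_hour = docs_per_year * snippets_per_doc / (days_per_year * hours_per_day)
    # Assumption: fractional results are rounded up to the next whole CPU.
    return math.ceil(snippets_per_hour / FORMULA_DIVISOR)


if __name__ == "__main__":
    # Inputs from the worked example: 1,000,000 documents, 50 snippets each,
    # 250 working days, 8 hours per day.
    print(required_cpus(1_000_000, 50, 250, 8))
```

Changing the processing window has a direct effect on the result: doubling the hours per day roughly halves the estimated CPU count.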