Document Understanding User Guide
ML packages offline installation
Getting started
Depending on the models you want to use, you need to meet the following requirements:
- For models version 2022.10 and later:
  - Download the needed Document Understanding™ bundle. Here are the links for all the available bundles. The du bundle contains information about all models included in a specific version. For example, dusemistructured-2024.10.0.tar.gz contains information about all out-of-the-box pre-trained ML Packages included in the 2024.10.0 version.
- For models version 2022.4 and earlier (python37duv3 and python37duv4):
  - All ML Packages are delivered as .zip files and uploaded directly to AI Center as custom packages. To download the models, contact your account manager, CSM, or the support team to get the download link for each package.
  - Download the needed Document Understanding bundle. Here are the links for all the available bundles.
Updated versions for the dulv bundles are no longer released. The latest version for the dulv bundles is 2023.10.4.
Installing the offline bundle
The offline installation requires the downloaded DU bundle to be named du-ondemand.tar.gz on the command line. For example, if you downloaded the DU bundle named dusemistructured-2024.10.0.tar.gz, you need to rename it to du-ondemand.tar.gz for the installation.
- For Windows machines, download the bundle directly from the bundle link and rename the file to du-ondemand.tar.gz.
- For Linux machines, download the needed bundle on a machine with internet access, using the following command:
  wget -O ~/<bundle-name.tar.gz> 'bundle-link'
  The following example shows how to download the needed bundle on Linux:
  wget -O ~/du-ondemand.tar.gz 'https://download.uipath.com/automation-suite/2024.10.0/dusemistructured-2024.10.0.tar.gz'
- Copy the bundle to the /uipath/tmp folder on the primary machine of the cluster (where the installation takes place):
  scp ~/<bundle-name.tar.gz> <username>@<node dns>:/uipath/tmp/
- Connect to this primary machine and load the bundle:
  ./configureUiPathAS.sh registry upload --optional-offline-bundle "/uipath/tmp/du-ondemand.tar.gz" --offline-tmp-folder "/uipath/tmp"
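The download, copy, and upload steps can be previewed together before running anything. The following is a minimal dry-run sketch that only prints the commands. The helper name plan_offline_install, the user admin, and the host node1.example.com are hypothetical, and the sketch assumes the bundle keeps the name du-ondemand.tar.gz on the primary machine.

```shell
#!/bin/sh
# Dry-run sketch: prints the offline bundle commands, it does not execute them.
# plan_offline_install is a hypothetical helper, not part of any UiPath tooling.

plan_offline_install() {
    bundle_url=$1   # download link of the DU bundle
    ssh_user=$2     # user on the primary machine of the cluster
    node_dns=$3     # DNS name of the primary machine
    local_name="du-ondemand.tar.gz"   # file name required at installation time
    remote_dir="/uipath/tmp"

    # Step 1: download the bundle and save it under the required name.
    printf "wget -O ~/%s '%s'\n" "$local_name" "$bundle_url"
    # Step 2: copy the bundle to the primary machine.
    printf "scp ~/%s %s@%s:%s/\n" "$local_name" "$ssh_user" "$node_dns" "$remote_dir"
    # Step 3: load the bundle (to be run on the primary machine).
    # Assumption: the uploaded path uses the same du-ondemand.tar.gz name.
    printf './configureUiPathAS.sh registry upload --optional-offline-bundle "%s/%s" --offline-tmp-folder "%s"\n' \
        "$remote_dir" "$local_name" "$remote_dir"
}

# Example with the bundle link used in this section; user and host are placeholders.
plan_offline_install \
    'https://download.uipath.com/automation-suite/2024.10.0/dusemistructured-2024.10.0.tar.gz' \
    'admin' 'node1.example.com'
```

Printing the commands first makes it easy to review the file name and target folder before touching the cluster.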
Upload the model to AI Center
After downloading and installing the models, follow the steps described in the ML packages offline installation page from the AI Center User Guide to upload them to AI Center. Both ML package zip and metadata json files are needed for this procedure.
Form Extractor and Intelligent Keyword Classifier
Access Form Extractor and Intelligent Keyword Classifier with the following public URLs:
- <FQDN>/du_/svc/formextractor
- <FQDN>/du_/svc/intelligentkeywords
When using a public URL, replace the <FQDN> placeholder with your actual environment information. For example, when used in a workflow, <FQDN>/du_/svc/formextractor becomes https://servicefabricserver.domain.com/du_/svc/formextractor.
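The placeholder substitution can be sketched as a small shell function. The helper name du_service_url is hypothetical, and the sketch assumes the endpoint is always served over https, as in the example above.

```shell
#!/bin/sh
# du_service_url is a hypothetical helper that fills in the <FQDN> placeholder.
# Assumption: the services are reached over https, as in the documented example.

du_service_url() {
    fqdn=$1      # for example servicefabricserver.domain.com
    service=$2   # formextractor or intelligentkeywords

    # Only the two services named in this section are accepted.
    case $service in
        formextractor|intelligentkeywords) ;;
        *) echo "unknown service: $service" >&2; return 1 ;;
    esac

    # Accept "host", "https://host", and "https://host/" alike.
    fqdn=${fqdn#https://}
    fqdn=${fqdn#http://}
    fqdn=${fqdn%/}

    printf 'https://%s/du_/svc/%s\n' "$fqdn" "$service"
}

du_service_url servicefabricserver.domain.com formextractor
du_service_url https://servicefabricserver.domain.com/ intelligentkeywords
```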
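Uploading Document Understanding™ bundles to an external Docker registry
Follow these steps to upload a Document Understanding bundle to an external Docker registry:
- Pull the desired DU image from the UiPath® registry hosted on registry.uipath.com.
- Rename the image host to match your Docker registry name.
- Push the image to the external Docker registry.
Pulling the desired Document Understanding image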
Pull the images from the UiPath® registry by running this command:
docker pull <uipath_registry_server>/<image_name>
The following example shows how to pull the image of the UiPath Document OCR bundle from the registry named registry.uipath.com:
docker pull registry.uipath.com/aicenter/du-doc-ocr:v24.10-10.03-rc02
Renaming the image host
Rename the image host by running the following command:
docker tag <uipath_registry_server>/<image_name> <your_registry_server>/<image_name>
The following example shows how to retag the image of the UiPath Document OCR bundle from the registry named registry.uipath.com to the registry named registry.mycompany.com:
docker tag registry.uipath.com/aicenter/du-doc-ocr:v24.10-10.03-rc02 registry.mycompany.com/aicenter/du-doc-ocr:v24.10.0
Pushing the image to the external Docker registry
Push the image to the external Docker registry by running the following command:
docker push <your_registry_server>/<image_name>
The following example shows how to push the image of the UiPath Document OCR bundle to the external Docker registry:
docker push registry.mycompany.com/aicenter/du-doc-ocr:v24.10.0
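The pull, tag, and push steps follow the same pattern for every image, so they can be generated from three inputs. The sketch below only prints the commands. The helper name mirror_image_cmds is hypothetical, and unlike the retagging example above, it keeps the original image tag on the target registry, which is a design choice of this sketch.

```shell
#!/bin/sh
# mirror_image_cmds is a hypothetical helper that prints the three docker
# commands needed to copy one image to another registry. It runs nothing.

mirror_image_cmds() {
    src_registry=$1   # for example registry.uipath.com
    dst_registry=$2   # for example registry.mycompany.com
    image=$3          # repository and tag, for example aicenter/du-doc-ocr:v24.10-10.03-rc02

    printf 'docker pull %s/%s\n' "$src_registry" "$image"
    # Assumption of this sketch: the tag is kept unchanged on the target registry.
    printf 'docker tag %s/%s %s/%s\n' "$src_registry" "$image" "$dst_registry" "$image"
    printf 'docker push %s/%s\n' "$dst_registry" "$image"
}

mirror_image_cmds registry.uipath.com registry.mycompany.com \
    aicenter/du-doc-ocr:v24.10-10.03-rc02
```

Once the printed commands look right, the output can be piped to sh on a machine that is logged in to both registries.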
Images for each Document Understanding bundle
2024.10.8
| Document Understanding bundle | Images |
|---|---|
| UiPath Document OCR | aicenter/du-doc-ocr:v24.10-2.24-rc07 |
| UiPathDocumentOCR_CPU | aicenter/du-doc-ocr-cpu:v24.10-2.24-rc07 |
| Extended Languages OCR | du/du-extended-ocr-proxy:v24.10-3.10-rc03 du/uipath-ocr-extended:v24.10-3.10-rc03 du/du-extended-ocr-reporting:v24.10-3.10-rc03 |
| Document Classifier | aicenter/du-ml-document-type-text-classifier:v24.10-2.24-rc07 |
| Out-of-the-box Pre-trained ML Packages | aicenter/du-semistructured:v24.10-2.24-rc07 |
2024.10.7
| Document Understanding bundle | Images |
|---|---|
| UiPath Document OCR | aicenter/du-doc-ocr:v24.10-11.18-rc02 |
| UiPathDocumentOCR_CPU | aicenter/du-doc-ocr-cpu:v24.10-11.18-rc02 |
| Extended Languages OCR | du/du-extended-ocr-proxy:v24.10-12.10-rc02 du/uipath-ocr-extended:v24.10-12.10-rc02 du/du-extended-ocr-reporting:v24.10-12.10-rc02 |
| Document Classifier | aicenter/du-ml-document-type-text-classifier:v24.10-11.18-rc02 |
| Out-of-the-box Pre-trained ML Packages | aicenter/du-semistructured:v24.10-11.18-rc02 |
2024.10.6
| Document Understanding bundle | Images |
|---|---|
| Extended Languages OCR | du/du-extended-ocr-proxy:v24.10-10.31-rc02 du/uipath-ocr-extended:v24.10-10.31-rc02 du/du-extended-ocr-reporting:v24.10-10.31-rc02 |
2024.10.5
| Document Understanding bundle | Images |
|---|---|
| UiPath Document OCR | aicenter/du-doc-ocr:v24.10-8.25-rc02 |
| UiPathDocumentOCR_CPU | aicenter/du-doc-ocr-cpu:v24.10-8.25-rc02 |
| Extended Languages OCR | du/du-extended-ocr-proxy:v24.10-9.11-rc02 du/uipath-ocr-extended:v24.10-9.11-rc02 du/du-extended-ocr-reporting:v24.10-9.11-rc02 |
| Document Classifier | aicenter/du-ml-document-type-text-classifier:v24.10-8.25-rc02 |
| Out-of-the-box Pre-trained ML Packages | aicenter/du-semistructured:v24.10-8.25-rc02 |
2024.10.4
| Document Understanding bundle | Images |
|---|---|
| UiPath Document OCR | aicenter/du-doc-ocr:v24.10-5.15-rc03 |
| UiPathDocumentOCR_CPU | aicenter/du-doc-ocr-cpu:v24.10-5.15-rc03 |
| Extended Languages OCR | du/du-extended-ocr-proxy:v24.10-6.05-rc02 du/uipath-ocr-extended:v24.10-6.05-rc02 du/du-extended-ocr-reporting:v24.10-6.05-rc02 |
| Document Classifier | aicenter/du-ml-document-type-text-classifier:v24.10-5.15-rc03 |
| Out-of-the-box Pre-trained ML Packages | aicenter/du-semistructured:v24.10-5.15-rc03 |
2024.10.3
| Document Understanding bundle | Images |
|---|---|
| UiPath Document OCR | aicenter/du-doc-ocr:v24.10-3.11-rc01 |
| UiPathDocumentOCR_CPU | aicenter/du-doc-ocr-cpu:v24.10-3.11-rc01 |
| Extended Languages OCR | du/du-extended-ocr-proxy:v24.10-4.20-rc02 du/uipath-ocr-extended:v24.10-4.20-rc02 du/du-extended-ocr-reporting:v24.10-4.20-rc02 |
| Document Classifier | aicenter/du-ml-document-type-text-classifier:v24.10-3.11-rc01 |
| Out-of-the-box Pre-trained ML Packages | aicenter/du-semistructured:v24.10-3.11-rc01 |
2024.10.2
| Document Understanding bundle | Images |
|---|---|
| UiPath Document OCR | aicenter/du-doc-ocr:v24.10-1.27-rc02 |
| UiPathDocumentOCR_CPU | aicenter/du-doc-ocr-cpu:v24.10-1.27-rc02 |
| Extended Languages OCR | du/du-extended-ocr-proxy:v24.10-2.10-rc01 du/uipath-ocr-extended:v24.10-2.10-rc01 du/du-extended-ocr-reporting:v24.10-2.10-rc01 |
| Document Classifier | aicenter/du-ml-document-type-text-classifier:v24.10-1.27-rc02 |
| Out-of-the-box Pre-trained ML Packages | aicenter/du-semistructured:v24.10-1.27-rc02 |
2024.10.1
| Document Understanding bundle | Images |
|---|---|
| UiPath Document OCR | aicenter/du-doc-ocr:v24.10-11.21-rc12 |
| UiPathDocumentOCR_CPU | aicenter/du-doc-ocr-cpu:v24.10-11.21-rc12 |
| Extended Languages OCR | du/du-extended-ocr-proxy:v24.10-12.03-rc04 du/uipath-ocr-extended:v24.10-12.03-rc04 du/du-extended-ocr-reporting:v24.10-12.03-rc04 |
| Document Classifier | aicenter/du-ml-document-type-text-classifier:v24.10-11.21-rc12 |
| Out-of-the-box Pre-trained ML Packages | aicenter/du-semistructured:v24.10-11.21-rc12 |
2024.10.0
| Document Understanding bundle | Images |
|---|---|
| UiPath Document OCR | aicenter/du-doc-ocr:v24.10-10.03-rc02 |
| UiPathDocumentOCR_CPU | aicenter/du-doc-ocr-cpu:v24.10-10.03-rc02 |
| Extended Languages OCR | du/du-extended-ocr-proxy:v24.10-10.26-rc01 du/uipath-ocr-extended:v24.10-10.26-rc01 du/du-extended-ocr-reporting:v24.10-10.26-rc01 |
| Document Classifier | aicenter/du-ml-document-type-text-classifier:v24.10-10.03-rc02 |
| Out-of-the-box Pre-trained ML Packages | aicenter/du-semistructured:v24.10-10.17-rc02 |
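Because one bundle can map to several images, it can help to flatten a release table into one image reference per line before pulling. The following is a rough sketch that reads table rows in the layout used above. The helper name images_from_rows is hypothetical, and the sample input is the 2024.10.8 table.

```shell
#!/bin/sh
# images_from_rows is a hypothetical helper: it reads pipe-delimited table rows
# on standard input and prints every image reference on its own line.

images_from_rows() {
    # Field 3 is the image column. Header and separator rows contain no ":"
    # in that column, so they are skipped.
    awk -F'|' 'NF >= 4 && $3 ~ /:/ {
        n = split($3, parts, " ")
        for (i = 1; i <= n; i++) print parts[i]
    }'
}

# Sample input: the 2024.10.8 table from this page.
images_from_rows <<'EOF'
| Document Understanding bundle | Images |
|---|---|
| UiPath Document OCR | aicenter/du-doc-ocr:v24.10-2.24-rc07 |
| UiPathDocumentOCR_CPU | aicenter/du-doc-ocr-cpu:v24.10-2.24-rc07 |
| Extended Languages OCR | du/du-extended-ocr-proxy:v24.10-3.10-rc03 du/uipath-ocr-extended:v24.10-3.10-rc03 du/du-extended-ocr-reporting:v24.10-3.10-rc03 |
| Document Classifier | aicenter/du-ml-document-type-text-classifier:v24.10-2.24-rc07 |
| Out-of-the-box Pre-trained ML Packages | aicenter/du-semistructured:v24.10-2.24-rc07 |
EOF
```

Each printed line can then be used as the image name in the docker pull, docker tag, and docker push commands described earlier.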