Document Understanding Activities

Manually Validating Digitized Documents

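The example below explains how to manually extract data from an image and present the output in a separate file. It uses activities such as Digitize Document and Present Validation Station. You can find these activities in the UiPath.IntelligentOCR.Activities package.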

Note:

This workflow uses an older version of the UiPath.IntelligentOCR.Activities package.

Steps:

Open Studio and create a new Process, named Main by default.

Note:
Make sure to add all the needed files (the .json files and all the images) to the project folder.

Add a Sequence container in the Workflow Designer and create the variables shown in the table below:

Table 1. Variables to create

Variable name	Variable type	Default value
`Text`	String
`DOM`	UiPath.DocumentProcessing.Contracts.Dom.Document
`Data`	UiPath.DocumentProcessing.Contracts.Taxonomy.DocumentTaxonomy
`DocumentTaxonomy`	UiPath.DocumentProcessing.Contracts.Taxonomy.DocumentTaxonomy
`TaxonomyJSON`	String
`HumanValidated`	UiPath.DocumentProcessing.Contracts.Results.ExtractionResult

Add a Read Text File activity inside the sequence.
- In the Properties panel, add the name of the file, in this case "taxonomy.json", in the FileName field.
- Add the variable TaxonomyJSON in the Content field.
Drag an Assign activity after the Read Text File activity.
- Add the variable Data in the To field and the expression DocumentTaxonomy.Deserialize(TaxonomyJSON) in the Value field. This activity builds the taxonomy for extraction.
Add a Digitize Document activity after the Assign activity.
- In the Properties panel, add the value 1 in the DegreeOfParallelism field.
- Add the expression "Input\Invoice01.tif" in the DocumentPath field.
- Add the variable DOM in the DocumentObjectModel field.
- Add the variable Text in the DocumentText field.
Add the Google OCR engine inside the Digitize Document activity.
- In the Properties panel, add the variable Image in the Image field.
- Select the check box for the ExtractWords option. This option extracts the on-screen position of all detected words.
- Add the expression "eng" in the Language field.
- Select the option Legacy from the Profile drop-down list.
- Add the value 2 in the Scale field.
Add a Present Validation Station activity after the Digitize Document activity.
- In the Properties panel, add the variable DOM in the DocumentObjectModel field.
- Add the expression "Input\Invoice01.tif" in the DocumentPath field.
- Add the variable Text in the DocumentText field.
- Add the variable Data in the Taxonomy field.
- Add the variable HumanValidated in the ValidatedExtractionResults field.
Add a For Each activity below the Present Validation Station activity.
- In the Properties panel, select the option UiPath.DocumentProcessing.Contracts.Results.ResultsDataPoint from the TypeArgument drop-down list.
- Add the expression HumanValidated.ResultsDocument.Fields in the Values field.
Add a Log Message activity inside the Body of the For Each activity.
- Select the option Info from the Level drop-down list.
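- Add the expression item.FieldName in the Message field.
Drag another Log Message activity below the first Log Message activity.
- Select the option Info from the Level drop-down list.
- Add the expression item.Values(0).Value.ToString in the Message field.
Add a Write Line activity below the Log Message activity.
- Add the value "" in the Text field.
Run the process. The robot uses the Intelligent OCR activities to process the data manually and displays the results.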
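
The output of the For Each loop in the steps above can be sketched outside Studio. The following is a minimal Python sketch, not the UiPath API: the classes `ResultsValue` and `ResultsDataPoint` are hypothetical stand-ins that only mirror the property names used in the expressions (`FieldName`, `Values`, `Value`), the sample field names and values are made up, and it assumes the Write Line activity sits inside the loop body.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResultsValue:
    # Hypothetical stand-in: mirrors only the `Value` property.
    value: str


@dataclass
class ResultsDataPoint:
    # Hypothetical stand-in: mirrors only `FieldName` and `Values`.
    field_name: str
    values: List[ResultsValue] = field(default_factory=list)


def log_fields(fields: List[ResultsDataPoint]) -> List[str]:
    """Return the lines the loop body would emit, in order."""
    lines = []
    for item in fields:
        # First Log Message: item.FieldName
        lines.append(item.field_name)
        # Second Log Message: item.Values(0).Value.ToString
        # The guard is an addition of this sketch for fields with no values.
        if item.values:
            lines.append(str(item.values[0].value))
        else:
            lines.append("<no value>")
        # Write Line with "" (assumed to be inside the loop body)
        lines.append("")
    return lines


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        ResultsDataPoint("InvoiceNumber", [ResultsValue("12345")]),
        ResultsDataPoint("Total", []),
    ]
    for line in log_fields(sample):
        print(line)
```

The guard for an empty list is worth noting: an expression that indexes element 0 fails on an empty collection, so a field left without a value in the Validation Station may need similar handling in the workflow.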

To download the example as a ZIP file, follow this link: Example.
