Document Understanding classic user guide

Last updated April 23, 2026

Extract data from forms

Important:

The aim of this page is to help first-time users get familiar with Document Understanding™. For scalable production deployments, we strongly recommend using the Document Understanding Process available in UiPath® Studio under the Templates section.

This quickstart guides you through the steps required to extract information from W-9 forms using the Intelligent Form Extractor. The W-9 forms are used as an example, but the procedure is similar for other types of documents where the data is structured.

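Starting from scratch, these are the steps to follow:

  1. Create a blank process
  2. Install the required activities packages
  3. Create a taxonomy
  4. Digitize the document
  5. Extract the data using the Intelligent Form Extractor
  6. Validate the results using Validation Station
  7. Export the extracted results

1. Create a blank process

Launch UiPath Studio.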

In the HOME backstage view, select Process to create a new project.

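The New Blank Process window is displayed. In this window, enter a name for the new project. If you want, you can also add a description to make your projects easier to sort through.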

Select Create. The new project is opened in Studio.

2. Install the required activities packages

From the Manage Packages button in the ribbon, in addition to the core activities packages (UiPath.Excel.Activities, UiPath.Mail.Activities, UiPath.System.Activities, UiPath.UIAutomation.Activities) that are added to the project by default, install the following activities packages:

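3. Create a taxonomy

After the packages are installed, list the required fields. We will extract data for the following fields:

  • 1_Name - Text
  • 2_BusinessName - Text
  • 3a_Individual - Boolean
  • 3b_Ccorp - Boolean
  • 3c_Scorp - Boolean
  • 3d_Partnership - Boolean
  • 3e_TrustEstate - Boolean
  • 3f_LLC - Boolean
  • 3f_LLC Tax Classification - Boolean
  • 3g_Other - Boolean
  • 3g_Other Details - Boolean
  • 5_Address - Text
  • 6_Zipcode - Text
  • 7_Account Numbers - Text
  • TIN_SSN - Text
  • TIN_ETN - Text
  • Certification Signature - Boolean
  • Certification Signature Date - Date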

Open Taxonomy Manager and create a group named Structured Documents, a category named Lending Forms, and a document type named W-9. Create the listed fields with user-friendly names and their respective data types.
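To make the shape of this taxonomy easier to picture, here is a minimal Python sketch of the group, category, document type, and field hierarchy described above. It is only an illustration: the class names (`FieldType`, `TaxonomyField`, `DocumentType`) and the helper `fields_of_type` are hypothetical, the field names are English renderings of the list above, and none of this reflects the file format that Taxonomy Manager actually saves.

```python
from dataclasses import dataclass, field
from enum import Enum


class FieldType(Enum):
    """Data types used by the W-9 fields in this quickstart."""
    TEXT = "Text"
    BOOLEAN = "Boolean"
    DATE = "Date"


@dataclass(frozen=True)
class TaxonomyField:
    """One field of a document type (hypothetical model, not a UiPath class)."""
    name: str
    type: FieldType


@dataclass
class DocumentType:
    """A document type placed under a group and a category."""
    group: str
    category: str
    name: str
    fields: list = field(default_factory=list)


# (field name, data type) pairs taken from the field list above.
W9_FIELDS = [
    ("1_Name", FieldType.TEXT),
    ("2_BusinessName", FieldType.TEXT),
    ("3a_Individual", FieldType.BOOLEAN),
    ("3b_Ccorp", FieldType.BOOLEAN),
    ("3c_Scorp", FieldType.BOOLEAN),
    ("3d_Partnership", FieldType.BOOLEAN),
    ("3e_TrustEstate", FieldType.BOOLEAN),
    ("3f_LLC", FieldType.BOOLEAN),
    ("3f_LLC Tax Classification", FieldType.BOOLEAN),
    ("3g_Other", FieldType.BOOLEAN),
    ("3g_Other Details", FieldType.BOOLEAN),
    ("5_Address", FieldType.TEXT),
    ("6_Zipcode", FieldType.TEXT),
    ("7_Account Numbers", FieldType.TEXT),
    ("TIN_SSN", FieldType.TEXT),
    ("TIN_ETN", FieldType.TEXT),
    ("Certification Signature", FieldType.BOOLEAN),
    ("Certification Signature Date", FieldType.DATE),
]

w9 = DocumentType(
    group="Structured Documents",
    category="Lending Forms",
    name="W-9",
    fields=[TaxonomyField(n, t) for n, t in W9_FIELDS],
)


def fields_of_type(doc_type, field_type):
    """Return the names of all fields with the given data type."""
    return [f.name for f in doc_type.fields if f.type is field_type]


if __name__ == "__main__":
    print(len(w9.fields), "fields in", w9.name)
    print("Date fields:", fields_of_type(w9, FieldType.DATE))
```

Writing the list down this way can serve as a checklist while entering the fields in Taxonomy Manager, since a wrong data type there affects how a field is later extracted and validated.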

Screenshot of the Taxonomy Manager.

4. Digitize the document

In the Main.xaml file, add a Load Taxonomy activity and create a variable for the taxonomy output.

Add a Digitize Document activity with UiPath Document OCR. Provide the input property Document Path and create output variables for Document Text and Document Object Model.

Remember to add the Document Understanding API key in the UiPath Document OCR activity.

5. Extract the data using the Intelligent Form Extractor

Add a Data Extraction Scope activity and fill in the properties.

Drag and drop the Intelligent Form Extractor within it. The endpoint should be auto-populated with the Intelligent Form Extractor endpoint, namely https://du.uipath.com/svc/intelligentforms. Provide the Document Understanding API key.

Once that is done, to create a new template, select Manage Templates > Create Template. A pop-up window opens.

Under Document type, select the W-9 document type created earlier.

Under Document name, enter a name for the template.

Under Template document (native PDF if possible), attach a template document where you are going to map the field positions.

Under OCR Engine, select UiPath Document OCR again. As before, the endpoint should be auto-populated, namely https://du.uipath.com/ocr, and you only need to provide the API key.

Select Configure to move to the next step. The Template Manager pop-up window opens.

Here, we will need to select the areas where we want Intelligent Form Extractor to search for our fields. Configure them by following the steps detailed here. You also have the option of using anchors for your fields. More information on anchors here.

You should end up with something like this:

Screenshot of a W-9 form in Template Manager.

Select Save. In this screen, you can define the handwritten or signature fields, where applicable. You can also define synonyms for Boolean fields. Close the window after you are done.

Screenshot of the Template Manager.
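As a rough mental model of what a template does, the Python sketch below treats a template as a mapping from field names to rectangular page areas, collects the digitized words that fall inside an area, and uses a set of synonyms to decide whether a Boolean field is checked. This is a conceptual illustration only and not how the Intelligent Form Extractor is implemented. The names `Box`, `Word`, `text_in_area`, `to_boolean`, and `TRUE_SYNONYMS`, as well as the coordinates and sample words, are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A rectangular area on the page, in arbitrary page units."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class Word:
    """A digitized word with the coordinates of its centre point."""
    text: str
    x: float
    y: float


def text_in_area(words, area):
    """Join the words whose centre lies inside the area.

    Reading order is approximated by sorting on (y, x), which is only
    adequate when words on one line share the same y value.
    """
    inside = [w for w in words if area.contains(w.x, w.y)]
    inside.sort(key=lambda w: (w.y, w.x))
    return " ".join(w.text for w in inside)


# Hypothetical synonyms that count as "checked" for Boolean fields.
TRUE_SYNONYMS = frozenset({"x", "yes", "checked"})


def to_boolean(raw, true_synonyms=TRUE_SYNONYMS):
    """Interpret extracted text as a Boolean using the synonym set."""
    return raw.strip().lower() in true_synonyms


# Made-up template areas and digitized words for illustration.
template = {
    "1_Name": Box(80, 90, 300, 110),
    "3a_Individual": Box(50, 190, 70, 210),
}
words = [
    Word("Doe", 140, 100),
    Word("Jane", 100, 100),
    Word("Acme", 100, 130),  # outside both areas
    Word("X", 60, 200),
]

if __name__ == "__main__":
    print(text_in_area(words, template["1_Name"]))
    print(to_boolean(text_in_area(words, template["3a_Individual"])))
```

The sketch also hints at why a native PDF makes a good template document: the cleaner the word positions, the more reliably an area picks up exactly the intended text.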

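The next step is to configure the extractors, which means setting the Intelligent Form Extractor to process all documents of type W-9.

Screenshot of the Configure Extractors wizard.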

6. Validate the results using Validation Station

To check the results through Validation Station, drag and drop the Present Validation Station activity and provide the input details.

Screenshot of the Present Validation Station activity.

7. Export the extracted results

To export the extraction results, validated or not, drag and drop an Export Extraction Results activity to the end of your workflow. This outputs the results into a DataSet that contains multiple tables, which could then be written to an Excel file or be used directly in a downstream process.
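For readers who hand the exported data to tooling outside Studio, the following Python sketch shows one way to think about a set of named tables and how each could be serialized. It is an analogy under stated assumptions rather than the activity's actual output: the table name "Simple Fields", the column names, and the sample rows are invented, and CSV stands in for Excel so that the example depends only on the standard library.

```python
import csv
import io


def table_to_csv(rows):
    """Serialize one table (a list of dicts with identical keys) to CSV text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def export_tables(tables):
    """Serialize every table in a dict of tables, keyed by table name."""
    return {name: table_to_csv(rows) for name, rows in tables.items()}


# Made-up extraction results; the table and column names are hypothetical.
results = {
    "Simple Fields": [
        {"Field": "1_Name", "Value": "Jane Doe", "Validated": True},
        {"Field": "6_Zipcode", "Value": "12345", "Validated": False},
    ],
}

if __name__ == "__main__":
    for name, text in export_tables(results).items():
        print(f"--- {name} ---")
        print(text, end="")
```

Keeping one serialized output per table mirrors the multi-table structure described above, so a downstream process can pick only the tables it needs.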

Screenshot of the Export Extraction Results activity.

Download example

Download this sample project to execute the W-9 with Intelligent Form Extractor workflow using this link.
