Document Understanding User Guide

Last updated April 23, 2026

Measure

From the Measure section, you can check the overall status of your project and identify the areas with potential for improvement.

Project measure

The main measurement on the page is the overall Project score.

This measurement factors in the classifier and extractor scores for all document types. The score of each factor corresponds to the model rating and can be viewed in Classification Measure and Extraction Measure respectively.

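The model rating is a functionality meant to help you visualize the performance of a classification model. It is expressed as a model score from 0 to 100, as follows:

  • Poor (0-49)
  • Average (50-69)
  • Good (70-89)
  • Excellent (90-100)

Regardless of the model score, you decide when to stop training, depending on your project needs. Even if a model is rated as Excellent, that does not mean it meets all business requirements.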
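
The rating bands above can be expressed as a simple lookup. The following is a minimal sketch, not UiPath code: the names `RATING_BANDS` and `rating_for_score` are hypothetical, and treating a fractional score such as 89.5 as belonging to the lower band is an assumption, since the guide only lists whole-number ranges.

```python
# Rating bands as listed in the guide: (lowest score in the band, label).
# Ordered from the highest band to the lowest so the first match wins.
RATING_BANDS = [
    (90, "Excellent"),
    (70, "Good"),
    (50, "Average"),
    (0, "Poor"),
]


def rating_for_score(score: float) -> str:
    """Return the rating label for a model score between 0 and 100.

    Assumption: a fractional score falls into the band whose lower bound
    it has reached (for example, 89.5 is still "Good").
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for lower_bound, label in RATING_BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: the bands cover 0 to 100")


if __name__ == "__main__":
    for example_score in (35, 50, 89, 90):
        print(example_score, rating_for_score(example_score))
```

A lookup like this is only useful for reporting. As noted above, the rating alone does not tell you whether the model meets your business requirements.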

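Classification measure

The Classification score factors in the performance of the model as well as the size and quality of the dataset.

Note: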

The Classification score is only available if you have more than one document type created.

If you select Classification, two tabs are displayed on the right side:

  • Factors: Provides recommendations on how to improve the performance of your model. You can get recommendations on dataset size or trained model performance for each document type.

  • Metrics: Provides useful metrics, such as the number of train and test documents, precision, accuracy, recall, and F1 score for each document type.

    Screenshot of the Classification Measure interface.

Extraction measure

The Extraction score factors in the overall performance of the model as well as the size and quality of the dataset. This view is split into document types. You can also go straight to the Annotate view of each document type by selecting Annotate.

If you select any of the available document types from the Extraction view, three tabs are displayed on the right side:

  • Factors: Provides recommendations on how to improve the performance of your model. You can get recommendations on dataset size (number of uploaded documents, number of annotated documents) or trained model performance (fields accuracy) for the selected document type.

  • Dataset: Provides information about the documents used for training the model, the total number of imported pages, and the total number of labelled pages.

  • Metrics: Provides useful information and metrics, such as the field name, the training status, and the accuracy for the selected document type. You can also access advanced metrics for your extraction models using the Download advanced metrics button, which downloads an Excel file with detailed metrics and model results per batch.

    Screenshot of the Extraction Measure interface.

Dataset diagnostics

The Dataset tab helps you build effective datasets by providing feedback and recommendations on the steps needed to achieve good accuracy for the trained model.

Screenshot of the Dataset Measure interface.

The Management bar exposes the following dataset status levels:

  • Red - More labelled training data is required.
  • Orange - More labelled training data is recommended.
  • Light green - Labelled training data is within recommendations.
  • Dark green - Labelled training data is within recommendations. However, more data might be needed for underperforming fields.

If no fields are created in the session, the dataset status level is grey.

Compare model

You can compare the performance of two versions of a classification or extraction model from the Measure section.

Classification model comparison

To compare the performance of two versions of a classification model, first navigate to the Measure section. Then, select Compare model for the classification model you are interested in.

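You can choose the versions you want to compare from the drop-down list at the top of each column. By default, the current version (the latest available version) is selected on the left, and the latest published version on the right.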

Figure 1. Classification model comparison

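Screenshot of the Classification model comparison interface.

Comparing classification models relies on four key metrics:

  • Precision: the ratio of correctly predicted positive instances to the total number of instances predicted as positive. A model with high precision has a low false positive rate.
  • Accuracy: the ratio of correct predictions (true positives and true negatives) to the total number of samples.
  • Recall: the proportion of actual positive cases that were correctly identified.
  • F1 score: the harmonic mean of precision and recall, intended to balance the two metrics. It trades off false positives against false negatives.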
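
The four metrics follow the standard definitions based on confusion counts for a single document type. The sketch below illustrates those textbook formulas only. It is not the product's implementation, and the `ConfusionCounts` container, the function names, and the choice to return 0.0 when a denominator is zero are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for one document type (hypothetical container, not a UiPath type)."""
    tp: int  # true positives: documents of this type predicted as this type
    fp: int  # false positives: other documents predicted as this type
    fn: int  # false negatives: documents of this type predicted as another type
    tn: int  # true negatives: other documents predicted as another type


def _safe_div(numerator: float, denominator: float) -> float:
    # Assumption: report 0.0 instead of failing when there is nothing to divide by.
    return numerator / denominator if denominator else 0.0


def precision(c: ConfusionCounts) -> float:
    return _safe_div(c.tp, c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    return _safe_div(c.tp, c.tp + c.fn)


def accuracy(c: ConfusionCounts) -> float:
    return _safe_div(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)


def f1_score(c: ConfusionCounts) -> float:
    # Harmonic mean of precision and recall.
    p, r = precision(c), recall(c)
    return _safe_div(2 * p * r, p + r)


if __name__ == "__main__":
    counts = ConfusionCounts(tp=8, fp=2, fn=4, tn=86)
    print(f"precision={precision(counts):.3f}")
    print(f"recall={recall(counts):.3f}")
    print(f"accuracy={accuracy(counts):.3f}")
    print(f"f1={f1_score(counts):.3f}")
```

In this example, accuracy is high mostly because of the large number of true negatives, while recall shows that a third of the documents of this type were missed. This is why it helps to read the four metrics together.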

The order of document types displayed is the one used in the latest version from the comparison. If a document type is not available in one of the compared versions, the values for each measure are replaced with N/A.
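
The ordering and N/A behavior described above can be sketched as follows. This is an illustration only: the function names and the dictionary layout are hypothetical, and appending document types that exist only in the older version at the end of the list is an assumption, because the guide does not say where they appear.

```python
from typing import Dict, List, Optional, Tuple

METRIC_NAMES = ("precision", "accuracy", "recall", "f1")
NOT_AVAILABLE = "N/A"

Metrics = Dict[str, float]            # metric name -> value
VersionMetrics = Dict[str, Metrics]   # document type -> metrics


def format_metrics(metrics: Optional[Metrics]) -> List[str]:
    """Format one version's metrics, or N/A for every metric if the type is missing."""
    if metrics is None:
        return [NOT_AVAILABLE] * len(METRIC_NAMES)
    return [f"{metrics[name]:.2f}" for name in METRIC_NAMES]


def comparison_rows(
    latest: VersionMetrics, other: VersionMetrics
) -> List[Tuple[str, List[str], List[str]]]:
    """Build (document type, latest values, other values) rows.

    Rows follow the document type order of the latest version.
    Assumption: types found only in the other version are appended at the end.
    """
    ordered = list(latest)  # dicts preserve insertion order
    ordered += [doc_type for doc_type in other if doc_type not in latest]
    return [
        (doc_type, format_metrics(latest.get(doc_type)), format_metrics(other.get(doc_type)))
        for doc_type in ordered
    ]


if __name__ == "__main__":
    latest_version = {
        "Invoice": {"precision": 0.95, "accuracy": 0.97, "recall": 0.93, "f1": 0.94},
        "Receipt": {"precision": 0.90, "accuracy": 0.96, "recall": 0.88, "f1": 0.89},
    }
    older_version = {
        "Receipt": {"precision": 0.85, "accuracy": 0.93, "recall": 0.80, "f1": 0.82},
    }
    for row in comparison_rows(latest_version, older_version):
        print(row)
```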

Note:

If a field was removed in the current version but it was available in the older version before the Compare model feature was available, the name is replaced with Unknown.

Extraction model comparison

To compare the performance of two versions of an extraction model, first navigate to the Measure section. Then, select Compare model for the extraction model you are interested in.

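You can choose the versions you want to compare from the drop-down list at the top of each column. By default, the current version (the latest available version) is selected on the left, and the latest published version on the right.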

Figure 2. Extraction model comparison

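Screenshot of the Extraction model comparison interface.

Comparing extraction models relies on the following key metrics:

  • Field name: the name of the annotated field.
  • Content type: the content type of the field:
    • String
    • Number
    • Date
    • Phone
    • ID number
  • Rating: the model score, meant to help you visualize the performance of the extracted field.
  • Accuracy: the proportion of correct predictions out of the total number of predictions made by the model.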

The order of field names displayed is the one used in the latest version from the comparison. If a field name is not available in one of the compared versions, the values for each measure are replaced with N/A.

Note:

If a field was removed in the current version but it was available in the older version before the Compare model feature was available, the name is replaced with Unknown.

You can also compare the field score for tables from the Table section.

On the comparison page, you can download the advanced metrics file for each version using the Download advanced metrics button.
