Document Understanding User Guide
Measure
The Measure section shows the overall status of your project and highlights the areas with potential for improvement.
The main measurement on the page is the overall Project score.
This score factors in the classifier and extractor scores for all document types. You can view the individual scores in Classification Measure and Extraction Measure respectively. Each score corresponds to one of the following model ratings:
- Poor (0-49)
- Average (50-69)
- Good (70-89)
- Excellent (90-100)
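The rating bands above can be read as a simple lookup from score to rating. The following is a minimal sketch for illustration only, not the product's implementation: the function name `rating_for_score` is hypothetical, and it assumes scores are whole numbers from 0 to 100.

```python
def rating_for_score(score: int) -> str:
    """Map a 0-100 model score to its rating band (hypothetical helper)."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Poor"


# Example: look up the rating for a few scores at the band edges.
for s in (49, 50, 89, 90):
    print(s, rating_for_score(s))
```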
Regardless of the model score, it is up to you to decide when to stop training, depending on your project needs. Even if a model is rated as Excellent, that doesn't mean that it will meet all business requirements.
The Classification score factors in the performance of the model as well as the size and quality of the dataset.
- Factors: Provides recommendations on how to improve the performance of your model. You can get recommendations on dataset size or trained model performance for each document type.
- Metrics: Provides useful metrics, such as the number of training and test documents, precision, accuracy, recall, and F1 score for each document type.
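As a reminder of how the metrics named above relate to each other, the sketch below computes them from per-document-type prediction counts using the standard textbook definitions. It is an assumption that the product follows these definitions exactly; the function name `classification_metrics` and its parameters are hypothetical.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard classification metrics for one document type.

    tp: documents of this type classified as this type
    fp: documents of other types classified as this type
    fn: documents of this type classified as another type
    tn: documents of other types classified as another type
    """
    total = tp + fp + fn + tn
    # Guard each ratio against a zero denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Example with made-up counts for a single document type.
print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

With these made-up counts, all four metrics come out at roughly 0.8. A low precision suggests other document types are being mistaken for this one, while a low recall suggests documents of this type are being missed.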
The Extraction score factors in the overall performance of the model as well as the size and quality of the dataset. This view is split into document types. You can also go straight to the Annotate view of each document type by clicking Annotate.
- Factors: Provides recommendations on how to improve the performance of your model. You can get recommendations on dataset size (number of uploaded documents, number of annotated documents) or trained model performance (fields accuracy) for the selected document type.
- Dataset: Provides information about the documents used for training the model, the total number of imported pages, and the total number of labelled pages.
- Metrics: Provides useful information and metrics, such as the field name, training status, and accuracy for the selected document type. To access advanced metrics for your extraction models, select Download advanced metrics. This downloads an Excel file with detailed metrics and model results per batch.
The Dataset tab helps you build effective datasets by providing feedback and recommendations on the steps needed to achieve good accuracy for the trained model.
There are three dataset status levels exposed in the Management bar:
- Red - More labelled training data is required.
- Orange - More labelled training data is recommended.
- Green - The needed level of labelled training data is achieved.
If no fields are created in the session, the dataset status level is grey.