Data Extraction Validation Overview
After automatic data extraction, an optional (but highly recommended) step is validating the extracted data.
This refers to a human review step, in which knowledge workers can review the automatically extracted results and correct them when necessary.
Using Data Extraction Validation ensures that the structured data now available is 100% correct.
It is strongly recommended to use the Data Extraction Validation components when:
- you need 100% accuracy on the data,
- you have no other way to double-check the automatically extracted information against other sources of truth,
  - e.g., checking that a certain Name or Address equals a Name or Address already confirmed and existing in a database, etc.
- you do not have sufficient synthetic checks you can use on data consistency,
  - e.g., checking that line items add up to a total, or that an ID number checksum is correct, etc.
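The two synthetic checks mentioned above can be sketched as follows. This is only an illustration, not part of any UiPath package: the function names are hypothetical, the rounding tolerance is an assumed value, and the Luhn algorithm is used as one common example of an ID checksum (the actual scheme depends on the kind of ID number you extract).

```python
from decimal import Decimal


def line_items_match_total(line_amounts, total, tolerance=Decimal("0.01")):
    """Return True when the extracted line amounts add up to the extracted total.

    The tolerance is an assumed allowance for rounding differences.
    """
    computed = sum((Decimal(a) for a in line_amounts), Decimal("0"))
    return abs(computed - Decimal(total)) <= tolerance


def luhn_checksum_ok(number):
    """Return True when a digit string passes the Luhn check (assumed scheme)."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A document that fails either check is a natural candidate for human validation, even when the extraction itself reported high confidence.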
Note: Deciding whether to add Validation or not?
Our strong recommendation is to add the Validation step whenever possible if you need 100% accuracy.
If this is not an option for all documents, then:
- try to double-check as much of the information as possible
- try to decide on specific confidence thresholds that the business use case can accept for certain fields
- make sure to always check both the Extraction Confidence and the OCR Confidence for a given value before making your decision.
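As a rough sketch of the threshold approach described in this note, the snippet below flags fields for human review when either confidence falls below a per-field limit. Everything here is an assumption made for illustration: the `ExtractedField` class, the field names, the threshold values, and the 0.0 to 1.0 confidence scale are hypothetical and do not represent the Document Understanding API.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    extraction_confidence: float  # assumed scale: 0.0 to 1.0
    ocr_confidence: float         # assumed scale: 0.0 to 1.0


# Hypothetical per-field thresholds agreed with the business.
THRESHOLDS = {"InvoiceNumber": 0.95, "Total": 0.98}
DEFAULT_THRESHOLD = 0.90


def needs_human_validation(fields, thresholds=THRESHOLDS, default=DEFAULT_THRESHOLD):
    """Return the names of fields whose extraction or OCR confidence is too low."""
    flagged = []
    for field in fields:
        limit = thresholds.get(field.name, default)
        # Both confidences must clear the limit, so compare the lower of the two.
        if min(field.extraction_confidence, field.ocr_confidence) < limit:
            flagged.append(field.name)
    return flagged
```

Comparing the lower of the two values reflects the advice above: a field extracted with high confidence can still be wrong if the underlying OCR text was read with low confidence.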
The automatically extracted data can be validated by a human through the use of Validation Station.
Validation Station is available in either of two ways:
- as an attended activity, through the use of the Present Validation Station activity, or
- as Action Center tasks, through the use of the Create Document Validation Action activity and the Wait for Document Validation Action and Resume activity.