Document Understanding User Guide
One Click Extraction
Use the One Click Extraction feature to train document extractors directly from the Document Understanding™ interface. A new user experience within Document Understanding removes the need to manually create Datasets, Pipelines, and ML Skills in AI Center.
Make sure that your Document Understanding project is linked to AI Center before using this functionality.
To create a new extractor based on an existing semi-structured AI document type, click the New Extractor button.
The New Extractor button opens a drop-down with two options: Automated Training and Manual Training.
Use the Automated Training option to train an extractor directly in Document Understanding. After you choose this option, enter an Extractor Name, select the preferred Document Type, select the Model that you want to use and its version, and enable or disable the Use GPU option. When finished, click the Train button.
Before you start training an extractor, make sure that at least ten documents are labelled in the session that you plan to use.
This functionality automatically creates a new Dataset in AI Center, named after the value you entered in the Extractor Name field of the Train extraction dataset pop-up window.
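The form fields and the ten-document requirement described above can be sketched as a small pre-check. This is only an illustration: `AutomatedTrainingForm` and `readiness_problems` are hypothetical names and are not part of any Document Understanding API. The only facts taken from this guide are the list of form fields and the minimum of ten labelled documents.

```python
from __future__ import annotations

from dataclasses import dataclass

# From the guide: at least ten labelled documents are needed before training.
MIN_LABELLED_DOCUMENTS = 10


@dataclass
class AutomatedTrainingForm:
    """Hypothetical mirror of the fields in the Automated Training form."""
    extractor_name: str
    document_type: str
    model: str
    model_version: str
    use_gpu: bool = False


def readiness_problems(form: AutomatedTrainingForm, labelled_documents: int) -> list[str]:
    """Return the reasons why training should not be started yet (empty if ready)."""
    problems = []
    # Every text field in the form must be filled in.
    for field_name in ("extractor_name", "document_type", "model", "model_version"):
        if not getattr(form, field_name).strip():
            problems.append(f"{field_name} is empty")
    # The session must contain enough labelled documents.
    if labelled_documents < MIN_LABELLED_DOCUMENTS:
        problems.append(
            f"only {labelled_documents} labelled documents, "
            f"need at least {MIN_LABELLED_DOCUMENTS}"
        )
    return problems
```

An empty list means that, under these assumptions, the form is complete and the session has enough labelled documents.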
Details
To see more details about an Automated Training action, click the name of the extractor on the Extractors page, or open the actions menu and select the Details option.
The Details option provides the following information:
- Training set - Specifies the number of documents and number of pages processed.
- Pages Extracted - Specifies the number of extracted pages.
- F1 Score - Provides an accuracy score percentage for the dataset.
- Status - Provides the status of the extraction action.
- Document types - Provides the list of Document types used for the action.
- Package Name - Provides the name of the used ML Package.
- Package Version - Provides the version of the used ML Package model.
- ML Skill details - Provides the URL of the ML Skill created for the dataset. You can copy it and use it in your workflow.
- Dataset link - Provides the public endpoint URL of the created (public) dataset.
- Pipeline details - Provides the URL of the pipeline created for the dataset.
- View/Hide Logs - Provides a list with all the logs of the created dataset. You can copy it and use it when needed.
Use the Manual Training option to export a dataset to AI Center and then train it there. After you choose this option, enter a Dataset Name and select the preferred Document Type. When finished, click the Export button.
Details
To see more details about a Manual Training action, click the name of the extractor on the Extractors page, or open the actions menu and select the Details option.
The Details option provides the following information:
- Training set - Specifies the number of documents and number of pages processed.
- Pages Extracted - Specifies the number of extracted pages.
- F1 Score - Provides an accuracy score percentage for the dataset.
- Status - Provides the status of the extraction action.
- Document types - Provides the list of Document types used for the action.
- Package Name - Provides the name of the used ML Package.
- Package Version - Provides the version of the used ML Package model.
- ML Skill details - Provides the URL of the ML Skill created for the dataset. You can copy it and use it in your workflow.
- Dataset link - Provides the public endpoint URL of the created (public) dataset.
- Pipeline details - Provides the URL of the pipeline created for the dataset.
- View/Hide Logs - Provides a list with all the logs of the created dataset. You can copy it and use it when needed.
You can check the status of all your extraction actions by using the Extractors tab from your project page.
Once the Extractors tab is selected, you can see five columns, each presenting information about the created extraction actions, along with a Refresh option. You can sort each column in ascending or descending alphabetical order, or leave the list in its default state, organized by creation date with the latest on top:
- Name - Displays the name of the extraction action.
- Type - Displays the type of extraction action (export or train).
- Document Type - Displays the Document type used.
- Status - Displays the status of the action. There are multiple available statuses for each action. Check the table below for more details.
- Creation date - Displays the creation date.
- Refresh - Refreshes the statuses for all actions, displaying the most recent ones.
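The ordering behaviour described above (default order by creation date with the latest on top, or alphabetical sorting by a single column) can be sketched as follows. `ExtractorRow`, `default_order`, and `sort_by_column` are hypothetical names chosen for illustration. The sketch models the behaviour only and does not reflect how the Extractors tab is implemented.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ExtractorRow:
    """Hypothetical record holding the five columns of the Extractors tab."""
    name: str
    action_type: str      # "export" or "train"
    document_type: str
    status: str
    created: datetime


def default_order(rows):
    """Default state: organized by creation date, latest on top."""
    return sorted(rows, key=lambda row: row.created, reverse=True)


def sort_by_column(rows, column, descending=False):
    """Sort by one column in ascending or descending alphabetical order."""
    return sorted(
        rows,
        key=lambda row: str(getattr(row, column)).lower(),
        reverse=descending,
    )
```

Sorting is case-insensitive here, which is an assumption. The guide does not say how the interface treats upper and lower case.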
| Status | Description | Training Option |
|---|---|---|
| Available | The action was successfully executed. | Automated Training |
| InProgress | The action is still being executed. | Automated Training |
| ExportCompleted | The action was successfully executed. | Manual Training |
| ExportInProgress | The action is still being executed. | Manual Training |
| NotStarted | The execution of the action has not started yet. | Automated Training, Manual Training |
| OutOfSync | The status in Document Understanding is not synchronized with the one in AI Center. Navigate to AI Center and check the status of the ML Skill corresponding to the extractor you have created. If the ML Skill has become undeployed, deploy it again. | Automated Training, Manual Training |
| Suspended | The action was paused. | Automated Training, Manual Training |
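If you track these statuses in your own scripts or reports, the status table can be written down as a lookup. The sketch below is a minimal illustration: the status names and their training options come from the table, while `TrainingOption`, `STATUS_OPTIONS`, `is_finished`, and `statuses_for` are hypothetical names and not part of any product API.

```python
from enum import Enum


class TrainingOption(Enum):
    AUTOMATED = "Automated Training"
    MANUAL = "Manual Training"


BOTH = frozenset({TrainingOption.AUTOMATED, TrainingOption.MANUAL})

# Status name -> training options for which the status can appear.
STATUS_OPTIONS = {
    "Available": frozenset({TrainingOption.AUTOMATED}),
    "InProgress": frozenset({TrainingOption.AUTOMATED}),
    "ExportCompleted": frozenset({TrainingOption.MANUAL}),
    "ExportInProgress": frozenset({TrainingOption.MANUAL}),
    "NotStarted": BOTH,
    "OutOfSync": BOTH,
    "Suspended": BOTH,
}


def is_finished(status: str) -> bool:
    """True for the two statuses described as 'successfully executed'."""
    if status not in STATUS_OPTIONS:
        raise ValueError(f"unknown status: {status}")
    return status in {"Available", "ExportCompleted"}


def statuses_for(option: TrainingOption) -> list:
    """All statuses an action of the given training option can show, sorted by name."""
    return sorted(name for name, options in STATUS_OPTIONS.items() if option in options)
```

For example, `statuses_for(TrainingOption.MANUAL)` lists the export statuses together with the three statuses shared by both options.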
The actions menu is available on the right side and, once opened, provides the following options:
- Copy URL - Allows you to copy the URL of the public endpoint created with the Automated Training action.
- Details - Provides information about the created action.
- Delete - Deletes the created action from both Document Understanding and AI Center.
- Stop ML Skill - Stops the ML Skill for the Automated Training action.