Document Understanding User Guide
Overview
There are several ways to consume Document Understanding™ capabilities:
- Using the DocumentUnderstanding.Activities package, which is available in Studio Web, StudioX, and Studio Desktop and is pre-configured for you either when you create a new automation starting from a file or when you continue your journey after publishing a project version.
- Using the IntelligentOCR package, which is designed for Windows and Windows Legacy projects and is pre-configured in the Document Understanding process template.
- Using cloud API calls, consuming Document Understanding as a service via the programming language of your choice.
If you're an RPA developer, you can use DocumentUnderstanding.Activities in your cloud projects. Document Understanding activities let you handle all data about a document within a single input/output object named Document Data. They also don't require you to set up a taxonomy of Document Types, so you can easily leverage out-of-the-box models.
You can set up an automation with these activities through the Extraction Automation Builder, available in Document Understanding, the Marketplace, and Studio Web.
Keep in mind that Document Understanding activities do not yet support the following capabilities: splitting, training (fine-tuning of models), production/developer tenant support, on-premises support, and multiple extraction methods per Document Type.
If you start a new automation that builds on a modern project (one created using the Active Learning experience), you can use DocumentUnderstanding.Activities.
As an RPA developer who wants to try the IntelligentOCR package, you can use different extraction and classification models based on your needs. If one model doesn't suit your needs, you can use other extractors or classifiers as a backup option. You can also modify the taxonomy, Document Object Model (DOM), and extraction results using RPA code during runtime.
However, IntelligentOCR has a steeper learning curve: its flexibility comes with the complexity of working with multiple activities and data types.
With IntelligentOCR, you can integrate your own classifier, extractor, or OCR engine. Visit Document Processing Code Samples to check implementation examples.
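The backup-option idea described above (try one classifier, fall back to another when it does not give a usable answer) can be illustrated with a small, language-neutral sketch. This is not the IntelligentOCR API: the `Classifier` shape, the function names, and the 0.7 confidence threshold are all hypothetical and chosen only to show the pattern.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical shape: a classifier is any callable that returns
# (document_type, confidence), or None when it cannot decide.
Classifier = Callable[[str], Optional[Tuple[str, float]]]


def classify_with_fallback(
    text: str,
    classifiers: Sequence[Classifier],
    min_confidence: float = 0.7,  # assumed threshold, for illustration
) -> Optional[Tuple[str, float]]:
    """Return the first result that meets the confidence threshold."""
    for classifier in classifiers:
        result = classifier(text)
        if result is not None and result[1] >= min_confidence:
            return result
    # No classifier produced a confident answer.
    return None


def keyword_classifier(text: str) -> Optional[Tuple[str, float]]:
    """Toy primary classifier: recognizes invoices by keyword only."""
    if "invoice" in text.lower():
        return ("Invoice", 0.9)
    return None


def default_classifier(text: str) -> Optional[Tuple[str, float]]:
    """Toy backup classifier: always answers with a fixed label."""
    return ("Unknown", 0.75)
```

Ordering the classifiers from most to least specialized keeps the backup from masking a better answer; the first confident result wins.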
You can use API calls as an alternative to the robotic process automation (RPA) approach. API calls allow you to retrieve detailed information about your project (including the extractors and classifiers it uses), digitize documents, classify and extract data from documents using both specialized and generative models, and validate previously digitized, classified, and extracted information.
Because the calls are made over HTTP, you can consume the APIs from any programming or scripting language, as well as from RPA workflows.
You can access the APIs via Swagger: in the toolbar of the Document Understanding service, open the REST API dropdown list and select Framework.
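Since the APIs are consumed over plain HTTP, a call from code comes down to building an authenticated request. The following is a minimal Python sketch using only the standard library. The base URL, the `/projects` path, and the bearer-token header are placeholders and assumptions, not documented values; take the real routes, parameters, and authentication details from the Swagger page.

```python
import json
import urllib.request


def build_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request."""
    # Join base URL and path with exactly one slash between them.
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        headers={
            # Assumption: the service accepts an OAuth-style bearer token.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_json(request: urllib.request.Request, timeout: float = 30.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder host and path; replace with the values shown in Swagger.
    req = build_request("https://example.invalid/du-api", "/projects", "<access-token>")
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers before any network traffic happens.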