This pack provides the infrastructure for building document processing flows using a complete, open, and extensible approach.
It allows you to:
- Digitize documents, using the Digitize Document activity. This retrieves the text from any PDF or image, calling the OCR engine of your choice only when necessary.
- Classify documents, using the Classify Document Scope activity. This identifies what type of document a file is by using any classification algorithm. The Keyword Based Classifier activity is the first such classifier, targeting classification for titled documents. The FlexiCapture Classifier, which embeds the ABBYY FlexiCapture technology, is also incorporated into our product.
- Train classifiers, using the Train Classifiers Scope activity. This closes the feedback loop for any classification algorithm capable of learning (for example, the Keyword Based Classifier).
- Extract data from documents, using the Data Extraction Scope activity. This lets you use any data extraction algorithm to identify the fields in a classified document. The FlexiCapture Extractor is one such example, incorporating the ABBYY FlexiCapture technology into our product. The Regex Based Extractor is another example: a basic data extractor that applies regular expression matching to identify the best candidates for a required value.
- Train extractors, using the Train Extractors Scope activity. This closes the feedback loop for any data extraction algorithm capable of learning.
- Validate automatic classification and data extraction, using the Present Validation Station attended activity, which presents a user interface designed specifically for validating and correcting document processing results.
- Export extracted information, using the Export Extraction Results activity. This allows you to export the complex structure of extracted data to a simple DataSet (collection of DataTables).
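To make the classify-then-extract idea from the list above more concrete, here is a minimal Python sketch written outside UiPath. It is not the implementation of the Keyword Based Classifier or the Regex Based Extractor: the keyword sets, field names, patterns, and the "first match is the best candidate" rule are all assumptions chosen for illustration.

```python
import re

# Hypothetical keyword sets per document type (assumption, not UiPath data).
KEYWORDS = {
    "Invoice": ["invoice"],
    "Receipt": ["receipt"],
}

# Hypothetical field patterns, each with one capture group for the value.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number)\s*[:#]?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    # \b keeps "Subtotal" from being matched as "Total".
    "total": re.compile(r"\bTotal\s*:?\s*\$?\s*([0-9]+(?:\.[0-9]{2})?)", re.IGNORECASE),
}


def classify(text, title_lines=3):
    """Return the first document type whose keyword appears in the title lines."""
    title = " ".join(text.splitlines()[:title_lines]).lower()
    for doc_type, words in KEYWORDS.items():
        if any(word in title for word in words):
            return doc_type
    return None


def extract_fields(text):
    """For each field, collect all regex matches and keep the first as the candidate."""
    results = {}
    for name, pattern in FIELD_PATTERNS.items():
        candidates = pattern.findall(text)
        results[name] = candidates[0] if candidates else None
    return results


sample = "INVOICE\nInvoice No: INV-1042\nSubtotal: 90.00\nTotal: 99.50"
doc_type = classify(sample)
fields = extract_fields(sample) if doc_type == "Invoice" else {}
```

In a real workflow the digitized text would come from the Digitize Document activity, and a validation step would let a person correct any field the patterns missed.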
If you want to use the UiPath.IntelligentOCR.Activities package in the same project as the UiPath.PDF.Activities package, you need to use either version 2.x of both or version 3.x of both.
UiPath.IntelligentOCR.Activities version 3.0 or higher is incompatible with a UiPath.PDF.Activities version lower than 3.0, and UiPath.PDF.Activities version 3.0 or higher is incompatible with a UiPath.IntelligentOCR.Activities version lower than 3.0.
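The compatibility rule above can be expressed as a small check. This Python sketch encodes only the rule as stated here (both packages below 3.0, or both at 3.0 or higher); the function names are hypothetical, and it does not inspect a project or query any package feed.

```python
def major(version):
    """Return the major component of a dotted version string such as '3.0.1'."""
    return int(version.split(".")[0])


def packages_compatible(intelligent_ocr_version, pdf_version):
    """True when both versions sit on the same side of the 3.0 boundary."""
    return (major(intelligent_ocr_version) >= 3) == (major(pdf_version) >= 3)
```

For example, pairing IntelligentOCR 3.1.0 with PDF 2.9.0 fails the check, while 2.5.0 with 2.9.0 passes.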
The IntelligentOCR package is compatible with any custom classification or data extraction activity built on the public UiPath.DocumentProcessing.Contracts package. This gives you full flexibility to build an algorithm specific to your use case and to integrate it with any third-party solution for document classification and data extraction.
The ABBYY FlexiCapture Engine SDK is required if you want to use the FlexiCapture Classifier or FlexiCapture Extractor activities. The engine works only with a license distributed by the UiPath sales department. To request a license, open the Contact us page, go to Technical Support & Activations, fill in the form, and choose Service Request after providing a Name and Email.