UiPath Release Notes

UiPath.IntelligentOCR.Activities

v3.1.0

Release Date: 8th November 2019

New Features and Improvements

New exciting features and improvements are brought to you with this release.

A new activity meant to help you better organize and manage your trainable classifiers is available: Keyword Based Classifier Trainer. This activity can be used only together with the Train Classifiers Scope activity.

The Validation Station wizard received an important upgrade. The wizard is available only when the Present Validation Station activity is used in a workflow. The upgraded version offers a new user-friendly interface, keyboard shortcuts for navigating through the document, and the ability to select one or multiple words or a custom area. You can easily mark a field as missing, extract new data, edit a table, or extract a new table. All of this can be done in the Validation Station wizard with a dark theme.

One of the improvements included in this release is that the Keyword Based Classifier activity received a new parameter named LearningData. Besides specifying where the learning file data are located, you can now also use the string containing the serialized classifier data. This activity was enhanced with a wizard named Manage Keyword Based Classifier Learning that can be used for configuring and managing the keywords used for identifying specific document types.

Both the Keyword Based Classifier and Keyword Based Classifier Trainer activities are now able to manage multiple keywords. After the keyword sets are selected, the extraction is based on a full match of the selected words.

Another great improvement is that the DocumentObjectModel output, included in the Digitize Document activity, can now support word polygons, besides word horizontal boxes.
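As an illustration of how a word polygon relates to a horizontal word box, the sketch below reduces a polygon to its axis-aligned bounding box. The point format and the `bounding_box` helper are assumptions made for this example; they do not describe the actual DocumentObjectModel schema.

```python
Point = tuple[float, float]

def bounding_box(polygon: list[Point]) -> tuple[float, float, float, float]:
    """Return (left, top, width, height) of the smallest horizontal box around the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    left, top = min(xs), min(ys)
    return (left, top, max(xs) - left, max(ys) - top)

# Hypothetical corners of a slightly rotated word.
rotated_word = [(10, 20), (50, 10), (52, 18), (12, 28)]
print(bounding_box(rotated_word))  # (10, 10, 42, 18)
```

A polygon keeps the rotation of the word, while the horizontal box only records its extent, which is why polygons are more precise for rotated or skewed text.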

The Taxonomy Manager wizard received a new scroll bar that covers all UI elements, providing a better user experience.

The Data Extraction Scope, Train Extractors Scope, Train Classifiers Scope, and Classify Document Scope activities now arrange their extractors and classifiers horizontally, replacing the previous vertical order.

The Regex Based Extractor activity has been improved and can now process and return multiple values. The output is visible only when the activity is used together with the Validation Station.
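To show what returning multiple values from a single expression means in general, here is a minimal Python sketch. It is not the activity's implementation; the `extract_all` helper, the pattern, and the sample text are all hypothetical.

```python
import re

def extract_all(pattern: str, text: str) -> list[str]:
    """Return every non-overlapping match of pattern in text, in document order."""
    return [match.group(0) for match in re.finditer(pattern, text)]

# Hypothetical document text containing two invoice numbers.
sample = "Invoices INV-1001 and INV-1002 are due on 2019-11-08."
print(extract_all(r"INV-\d+", sample))  # ['INV-1001', 'INV-1002']
```

A single-value extractor would stop at the first match; a multi-value one reports all of them for the same field.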

Four new languages, Turkish (TR), Portuguese (PT), Spanish (ES), and Spanish-Mexico (ES-MX), are available for the UiPath.IntelligentOCR.Activities package.

Known Issues

  • Taxonomy Manager can be accessed only if you previously opened a .xaml file. If no files are opened when you access the Taxonomy Manager, a recording window is shown and Taxonomy Manager is displayed only after closing the recording window.

Bug Fixes

  • An exception was thrown when using the Data Extraction Scope activity together with a Try Catch activity. The issue was fixed and now the activity is executed as expected.
  • When a Boolean field was set to No in Validation Station, the output file should have shown the result as No but instead showed it as missing. The issue was fixed and now the output file shows the correct result.
  • Fixed incorrect number parsing that occurred when the Data Extraction Scope was trying to parse numbers in documents using a different number format than the document's culture.
  • When using multiple Validation Stations, the order of the derived parts was not respected in the validated results. The issue was fixed and now the results are displaying the derived parts in the same order they were introduced.
  • Differences between the boxes with custom selection occurred when the results of a Validation Station were run through a second Validation Station. The issue was fixed and now there are no differences between boxes with custom selection.
  • When the Digitize Document activity was used together with the Microsoft Azure Computer Vision OCR engine, the rotation was not working when the HandwritingRecognition parameter was set to True. The issue was fixed and now the information is processed correctly.
  • When using the Digitize Document activity, an error occurred when trying to process images with a lot of text. The bug was fixed by improving the scaling process.
  • An error was thrown when training the Keyword Based Classifier activity in the training scope if the extraction had been run without a classification reference. The issue was fixed and now the absence of learning information is only logged, not thrown as an error.
  • An error was thrown when using the [FlexiCapture Extractor](https://docs.uipath.com/activities/docs/flexicapture-extractor) activity and the same name was given to both a table column and a field. The issue was fixed and the .fcdot file is now processed as expected.
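The number-parsing fix listed above concerns documents whose number format differs from the document's culture. As an illustration only (this is not the package's implementation, and `parse_number` is a hypothetical helper), the sketch below shows why the separators must be chosen per number format rather than assumed from a single culture:

```python
from decimal import Decimal

def parse_number(text: str, decimal_sep: str, group_sep: str) -> Decimal:
    """Parse text using explicitly supplied separators instead of a culture default."""
    cleaned = text.strip().replace(group_sep, "")  # drop thousands separators
    cleaned = cleaned.replace(decimal_sep, ".")    # normalize the decimal mark
    return Decimal(cleaned)

# The same value written in two different number formats.
print(parse_number("1.234,56", decimal_sep=",", group_sep="."))  # 1234.56
print(parse_number("1,234.56", decimal_sep=".", group_sep=","))  # 1234.56
```

Reading "1.234,56" with the second set of separators would silently give a different value, which is the kind of mismatch the fix addresses.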

Note:

If an error mentioning the Docotic.Pdf library is encountered at runtime, then you should upgrade the UiPath.IntelligentOCR.Activities package to version v3.1.0 or higher.

v3.0.0

Release Date: 26th August 2019

New Features and Improvements

This release introduces a new activity, RegEx Based Extractor, accompanied by a wizard. You can use the wizard to configure your regular expression and extract specific information from documents.

Three new languages, German (DE), South Korean (KO), and Portuguese (PT-BR), are available for the UiPath.IntelligentOCR.Activities package.

Breaking Changes

  • Major updates related to the internal handling of .pdf files occurred for the Digitize Document activity. This has led to a breaking change for the UiPath.IntelligentOCR activities pack, which caused the version to skip ahead from v2.3.0 to v3.0.0.

Bug Fixes

  • Fixed an issue which caused the Digitize Document activity to throw an error when trying to process a high-resolution image inside a .pdf file. Now, the high-resolution images are processed correctly and digitized.
  • Fixed an issue which caused the Present Validation Station activity to fail when loading images with very high resolutions. Now all images are processed correctly, regardless of their resolution.
  • Fixed an issue which caused the Digitize Document activity to receive incorrect character coordinates. The character coordinates are now received correctly.
  • Improved integration of the Digitize Document activity with OCR engines, including support for rotation (where supported by the OCR engine) and improved accuracy of word-building.

Note:

If you want to use the UiPath.IntelligentOCR.Activities package in the same project with the UiPath.PDF.Activities package, you need to use either version 2.x of both, or versions 3.x of both.
UiPath.IntelligentOCR.Activities version 3.0 or higher is incompatible with a UiPath.PDF.Activities version lower than 3.0, and UiPath.PDF.Activities version 3.0 or higher is incompatible with a UiPath.IntelligentOCR.Activities version lower than 3.0.
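The compatibility rule in this note can be expressed as a small check. The sketch below is only an illustration in Python; `major` and `packages_compatible` are hypothetical helpers, not part of any UiPath tooling.

```python
def major(version: str) -> int:
    """Return the major component of a version string such as '3.1.0' or 'v3.1.0'."""
    return int(version.lstrip("v").split(".")[0])

def packages_compatible(intelligent_ocr: str, pdf: str) -> bool:
    """True when both packages are at 3.0 or higher, or both are below 3.0."""
    return (major(intelligent_ocr) >= 3) == (major(pdf) >= 3)

print(packages_compatible("3.1.0", "3.0.0"))  # True
print(packages_compatible("3.0.0", "2.1.0"))  # False
print(packages_compatible("2.3.0", "2.0.0"))  # True
```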

v2.3.0

Release Date: 16th July 2019

New Features and Improvements

We have improved your experience with the Taxonomy Manager wizard. You can now edit the name you gave to any of your Groups or Categories upon creating them.

Bug Fixes

  • Fixed an issue that was populating the headers of tables reported by the Data Extraction Scope activity with the names of the extractors. Now, the headers are populated only with the custom names from the Taxonomy.

v2.0.1

Release Date: 26th June 2019

New Features and Improvements

We want to reach out to the entire world and make automation a language everyone can speak. So, starting with this release, the entire platform is available in Chinese.

Note:

Chinese can only be used in this pack when installed in Studio v2019.4.4 or v2019.7 or above.

v2.2.0

Release Date: 24th June 2019

New Features and Improvements

This release comes with new activities accompanied by settings wizards, such as the Train Extractors Scope and Train Classifiers Scope activities.

The Train Extractors and Classifiers activity has been deprecated starting with v2.2.0 and is now replaced by the two newly added activities, Train Extractors Scope and Train Classifiers Scope.

The Data Extraction Scope now has a new check box, FormatValuesIfPossible, available in the Properties field.

The Taxonomy Manager option has received a new Close button that is visible and accessible throughout the entire settings wizard.

v2.1.0

Release Date: 21st May 2019

New Features and Improvements

We have improved error messages throughout the entire activity pack, so you can solve issues faster and with less hassle.

v2.0.0

Release Date: 25th April 2019

New Features and Improvements

As you're probably used to by now, month after month we draw closer to our final goal of creating the ultimate document processing platform. Alongside the first enterprise release of this year, the IntelligentOCR activities pack has been imbued with some new activities, as follows:

The UiPath.DocumentProcessing.Contracts pack enables you to implement your own extractor and classifier activities by simply referencing it. This assembly contains all the classification and extraction interfaces that underlie the IntelligentOCR activities.

The Taxonomy Manager now displays the Document Type ID of the document type that is being edited.

Breaking Changes

While migrating to the public UiPath.DocumentProcessing.Contracts, the IntelligentOCR v2.0.0 activity pack introduces breaking changes for the Classify Document Scope and Train Classifiers And Extractors activities.

Known Issues

  • Opening the Data Extraction Wizard throws an error when the Data Extraction Scope activity or a parent activity is commented out.

Bug Fixes

  • Fixed an issue which caused the Present Validation Station activity to throw an exception when processing certain .pdf files.
  • Digitize Document was unable to detect check boxes in certain documents.

v1.6.1

Release Date: 22nd February 2019

Bug Fixes

  • Fixed an issue which caused the Process Document activity to throw an error when processing large PDF files.

v1.6.0

Release Date: 20th February 2019

New Features and Improvements

The Taxonomy Manager is the next piece of the document processing puzzle, a wizard created to help you build custom taxonomy files which can then be reused across processes.

We have developed the Load Taxonomy activity, which grants you the ability to load a taxonomy created with the aid of the Taxonomy Manager wizard into a variable which can then be passed on to other activities.

The DegreeOfParallelism property has been added to the Digitize Document activity, enabling you to perform OCR analysis on multiple pages simultaneously. This is not a breaking change, so old workflows still function properly after updating to the latest version of the pack.
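The general idea behind a degree-of-parallelism setting can be sketched as follows. This is an illustration in Python, not the activity's implementation; `ocr_page` is a placeholder standing in for a real OCR call, and `digitize` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor

def ocr_page(page: str) -> str:
    """Placeholder for a real OCR call; here it only upper-cases the input."""
    return page.upper()

def digitize(pages: list[str], degree_of_parallelism: int = 1) -> list[str]:
    """Process pages with up to degree_of_parallelism workers, preserving page order."""
    with ThreadPoolExecutor(max_workers=degree_of_parallelism) as pool:
        return list(pool.map(ocr_page, pages))

print(digitize(["page one", "page two", "page three"], degree_of_parallelism=2))
# ['PAGE ONE', 'PAGE TWO', 'PAGE THREE']
```

With a default of one worker, pages are handled one at a time, which matches the point that workflows built before the property existed keep behaving the same way.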

The IntelligentOCR pack is now upgraded to .NET Framework v4.6.1.

The MatchingDocumentDefinition property of the FCDocument variable has been exposed. Assigning it to a variable generates the same result as a Classify Document activity.

Known Issues

  • The Tesseract OCR engine fails to properly read images with black borders.

v1.5.0

Release Date: 18th February 2019

New Features and Improvements

The IntelligentOCR pack has been upgraded with some new activities that regard document classification. These activities are:

Bug Fixes

  • Fixed an issue which caused the Process Document activity to crash when processing documents that contained check boxes.
  • Certain types of .pdf files caused the Digitize Document activity to throw errors.
  • Certain types of .jpg files caused the Digitize Document activity to throw errors.
  • In certain circumstances, editing a table with confidence below 100% and making no changes to it modified the confidence to 100%.
  • Fixed an issue which caused the Extract manual token as reference for this field button to remain disabled.

v1.4.0

Release Date: 21st January 2019

Improvements

The Digitize Document activity has been improved performance-wise with some backend changes.

Bug Fixes

  • Fixed an issue which caused certain UI elements to flicker in the Validation Station wizard.
  • The OperatorConfirmed flag in the ExtractionResults JSON file remained False regardless of whether a user had confirmed the extraction results or not.
  • In certain cases, the Prepare Validation Station Data activity could not read document information from FCDocument variables.

v1.3.0

Release Date: 10th January 2019

This new year brings two more languages to the entire UiPath Platform: French and Russian. Since we laid the foundations of localization in our previous release, we are continuing our efforts to bring you a more immersive experience and lower the language barrier bit by bit.

v1.2.1

Release Date: 14th December 2018

Bug Fixes

  • Fixed an issue which caused the FlexiCapture engine to always return a confidence score of 100.

v1.2.0

Release Date: 12th December 2018

The IntelligentOCR package has received a major update, as we've developed three new activities that enable you to approach Document Processing in a much simpler manner. The new activities are:

  • Present Validation Station - offers attended users the ability to make real-time CRUD (Create, Read, Update, and Delete) operations on documents for the purpose of classification and human data validation and extraction.
  • Prepare Validation Station Data - creates a bridge between FlexiCapture's Process Document activity and the new Validation Station, ensuring a much more user-friendly data validation experience.
  • Digitize Document - provides a new way of generating text versions from incoming documents, being able to process any PDF and most image formats.

v1.1.6855.16979

Release Date: 8th October 2018

The moment is finally here - the entire UiPath Platform has been localized, so that you can have a truly immersive experience, from install to design and execution. Now, besides English, you can access everything, including our online documentation, in Japanese.

v1.0.6725.18428

Release Date: 4th June 2018

To step up on our OCR game, coming to the aid of your digitization efforts, we have integrated the capabilities of the ABBYY FlexiCapture SDK into the new UiPath.IntelligentOCR.Activities pack, which contains the following:

  • IntelligentOCR Scope - Initializes the ABBYY FlexiCapture engine and provides a scope for all IntelligentOCR activities.
  • Process Document - Processes a document with the FlexiCapture engine and converts it to an FCDocument variable which can be used in other activities.
  • Classify Document - Enables you to classify a given document based on an ABBYY classifier file and one or more templates.
  • Export Document - Exports FlexiCapture documents to one of the .csv, .xml, .xls or .json formats.
  • Get Field - Retrieves a specified field from an FCDocument variable and returns it as an FCField variable.
  • Get Table - Retrieves a specified table from an FCDocument variable and returns it as a DataTable variable.
  • Validate Document - Validates a processed document contained in an FCDocument variable by using the ABBYY SDK and returns it in the same format.
