Document Understanding Release Notes
Last updated May 7, 2024

March 2022

General Release Notes - ML Packages

14 March 2022 | V22.1.6

DocumentUnderstanding + DocumentClassifier + Data Extraction ML Packages Released in AI Center Cloud, package version 22.1.6

Bug Fixes

  • Fixed a bug that was causing a training pipeline or a full pipeline in AI Center to fail due to an ML Package issue in data pre-processing for an empty line.

7 March 2022 | V22.2.3

UiPathDocumentOCR - Released in AI Center Cloud, package version 22.2.3

Superior capability

Integrated HandwritingRecognitionOCR into UiPathDocumentOCR. Many documents contain a mix of printed and handwritten fields. With the handwriting reading capability integrated, the correct recognition is applied to each field: print recognition to printed text, and handwriting recognition to handwritten text.

Although HandwritingRecognitionOCR can detect any handwriting, it is trained and optimized only for English.

Improvements

Increased word count limit from 1600 to 10000 per page.

Added the following scientific symbols: μ, , , <, >.

2 March 2022 | V22.1.4

DocumentUnderstanding + DocumentClassifier + Data Extraction ML Packages Released in AI Center Cloud, package version 22.1.4

What's new

The Utility Bills ML Package is now generally available.

Improvements

Overall improved performance and scalability.

Significantly improved scores when training on the new version of the DocumentUnderstanding ML Package, compared to previous versions.

Dates in column fields are now parsed correctly.

Date parsing now recognizes Turkish month names.
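As an illustration of what recognizing Turkish month names involves, here is a minimal sketch in Python. It is not the ML Package's parser: the function name `parse_turkish_date` and the assumed "day month year" input layout are hypothetical and chosen only for this example.

```python
from datetime import date

# Turkish month names, January through December.
TURKISH_MONTHS = [
    "Ocak", "Şubat", "Mart", "Nisan", "Mayıs", "Haziran",
    "Temmuz", "Ağustos", "Eylül", "Ekim", "Kasım", "Aralık",
]

# Case-folded name -> month number (1..12).
MONTH_NUMBER = {name.casefold(): i for i, name in enumerate(TURKISH_MONTHS, start=1)}


def parse_turkish_date(text):
    """Parse a date written as '<day> <Turkish month name> <year>'.

    Hypothetical helper for illustration; assumes exactly three
    whitespace-separated parts. All-uppercase input containing the
    dotless I (for example 'KASIM') is not handled by casefold().
    """
    day, month_name, year = text.split()
    month = MONTH_NUMBER.get(month_name.casefold())
    if month is None:
        raise ValueError(f"Unknown month name: {month_name!r}")
    return date(int(year), month, int(day))
```

For example, `parse_turkish_date("14 Mart 2022")` yields the same date as 14 March 2022.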

Changes

Changed the behavior for Training Pipelines and Full Pipelines when training on GPU versus on CPU. The 21.10.x models trained on CPUs were smaller, so they trained faster than the previous versions, while having slightly lower accuracy than before.

This behavior has been reversed with this release: the same model is now trained on both GPU and CPU, and training speed has reverted to what it was before 2021.10. As a result, training on CPU is again 10-20X slower than on GPU.

General Release Notes - Document Understanding

29 March 2022

Improvements

Added more descriptive tooltips on Training, Validation, and Evaluation document types.

Bug Fixes

  • Fixed a known issue that caused searching for or downloading a document to fail with an invalid query error when its file name contained characters that require URL encoding (&, ,, +, #, ').
  • Fixed a bug that caused the Predict functionality to fail on documents with very dense text.
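To illustrate the file name issue in the first fix above: characters such as &, ,, +, #, and ' have to be percent-encoded before a file name is placed in a URL query, otherwise the query can be misread. The sketch below uses Python's standard `urllib.parse.quote`; the helper name `encode_file_name` and the sample file name are made up for this example and do not describe how Document Manager builds its requests.

```python
from urllib.parse import quote, unquote


def encode_file_name(name):
    """Percent-encode a file name for use inside a URL query.

    safe="" means no reserved character is left as-is, so
    &, ,, +, # and ' are all encoded.
    """
    return quote(name, safe="")


# Hypothetical file name containing every character listed in the fix.
file_name = "invoice & receipt, v2+final#3 'draft'.pdf"
encoded = encode_file_name(file_name)

# Decoding restores the original name unchanged.
restored = unquote(encoded)
```

With this approach, `&` becomes `%26`, `,` becomes `%2C`, `+` becomes `%2B`, `#` becomes `%23`, and `'` becomes `%27`.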

7 March 2022

Improvements

Implemented in-document search, which allows you to search for instances of text solely within your current document. This is particularly helpful for documents with many pages. The search bar is located at the bottom-left of the screen and can also be accessed using the shortcut Ctrl + Shift + F.

When using the Predict functionality, manually labeled data is deleted and the document is overwritten with the new values from the model.

The split.csv file is no longer used when importing a dataset into another Document Manager session or when running a Training Pipeline. Its data is now stored in the subset field of the JSON files in the dataset's latest folder. As a result, manually modifying the file or deleting it from the dataset has no impact on model training. The file is still kept, however, for document-level export with ML Packages version 21.10 or earlier.
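To check how an exported dataset is split now that this information lives in the JSON files, a small script along these lines could tally the subset values. This is only a sketch under stated assumptions: it assumes the field is a top-level key named `subset` in each JSON file directly inside the `latest` folder, which the notes above do not specify, and `count_subsets` is a hypothetical helper name.

```python
import json
from collections import Counter
from pathlib import Path


def count_subsets(dataset_dir):
    """Tally `subset` values across the JSON files in <dataset_dir>/latest.

    Assumption: `subset` is a top-level key of each JSON document.
    Files without it (or that are not JSON objects) count as "UNKNOWN".
    """
    counts = Counter()
    for path in sorted(Path(dataset_dir, "latest").glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            doc = json.load(fh)
        subset = doc.get("subset", "UNKNOWN") if isinstance(doc, dict) else "UNKNOWN"
        counts[subset] += 1
    return counts
```

Calling `count_subsets("path/to/exported/dataset")` returns a `Counter` keyed by whatever subset values the files contain.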

Added the option to permanently delete individual files. This can be found in the drop-down next to the document name, alongside the download option.

© 2005-2024 UiPath. All rights reserved.