- Release Notes
March 2022
DocumentUnderstanding + DocumentClassifier + Data Extraction ML Packages Released in AI Center Cloud, package version 22.1.6
Bug Fixes
- Fixed a bug that was causing a training pipeline or a full pipeline in AI Center to fail due to an ML Package issue in data pre-processing for an empty line.
UiPathDocumentOCR - Released in AI Center Cloud, package version 22.2.3
Superior capability
Integrated HandwritingRecognitionOCR into UiPathDocumentOCR. Many documents contain a mix of printed and handwritten fields. By integrating the handwriting reading capability, we are able to apply the correct recognition to each field: print recognition to printed text, and handwriting recognition to handwritten text.
Although HandwritingRecognitionOCR can detect any handwriting, please note that it is trained and optimized only for English.
Improvements
Increased word count limit from 1600 to 10000 per page.
μ, ≤, ≥, <, >.
DocumentUnderstanding + DocumentClassifier + Data Extraction ML Packages Released in AI Center Cloud, package version 22.1.4
What's new
The Utility Bills ML Package is now generally available.
Improvements
Overall improved performance and scalability.
Significant improvements on scores when training on the new version of the DocumentUnderstanding ML Package as compared to previous versions.
Dates in column fields are now parsed correctly.
Date parsing now recognizes Turkish month names.
Changes
Changed the behavior for Training Pipelines and Full Pipelines when training on GPU versus on CPU. The 21.10.x models trained on CPUs were smaller, so they trained faster than the previous versions, while having slightly lower accuracy than before.
This behavior has been reversed with this release: the same model is now trained on GPU and on CPU, and the training speed has reverted to what it was before 2021.10, which means training on CPU is again 10-20X slower than on GPU.
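To put the 10-20X figure in perspective, the following is a rough back-of-the-envelope sketch. The function name and the 2-hour GPU baseline are illustrative assumptions, not measured values from AI Center.

```python
def estimate_cpu_training_hours(gpu_hours, slowdown_range=(10, 20)):
    """Return a (low, high) estimate of CPU training time in hours.

    slowdown_range is the 10-20X CPU-vs-GPU slowdown quoted in the notes;
    gpu_hours is a hypothetical baseline supplied by the caller.
    """
    low_factor, high_factor = slowdown_range
    return gpu_hours * low_factor, gpu_hours * high_factor


# Hypothetical example: a pipeline that takes 2 hours on GPU.
low, high = estimate_cpu_training_hours(2.0)
print(f"Estimated CPU training time: {low:.0f} to {high:.0f} hours")
```

Actual durations depend on dataset size and hardware, so treat the result as an order-of-magnitude guide only.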
Improvements
Added more descriptive tooltips on Training, Validation, and Evaluation document types.
Bug Fixes
- Fixed a known issue that was causing the search or the download of a document which contained characters that require URL encoding ("&", ",", "+", "#", "'") in its file name to fail with invalid query.
- Fixed a bug that caused the Predict functionality to fail on documents with very dense text.
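As an illustration of why these characters are sensitive in a query, the sketch below percent-encodes each of them with Python's standard library. It shows URL encoding in general, not how Document Manager handles file names internally; the helper name and the sample file name are made up for the example.

```python
from urllib.parse import quote


def encode_file_name(file_name: str) -> str:
    """Percent-encode a file name so it can be placed safely in a URL query.

    safe="" makes quote() encode every reserved character, including "/".
    """
    return quote(file_name, safe="")


# The characters listed in the bug fix above, and their encoded forms.
for ch in ["&", ",", "+", "#", "'"]:
    print(repr(ch), "->", encode_file_name(ch))

# Hypothetical file name containing all five characters.
print(encode_file_name("invoice&copy,v1+2#3'final.pdf"))
# prints: invoice%26copy%2Cv1%2B2%233%27final.pdf
```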
Improvements
Ctrl + Shift + F.
When using the Predict functionality, manually labeled data is deleted and the document is overwritten with the new values from the model.
split.csv is no longer used when importing a dataset into another Document Manager session, or when running a Training Pipeline. The data from the file is now integrated into the JSON files in the latest folder of the dataset, specifically in the subset field. So, if you manually modify the file or delete it completely from the dataset, this has no impact on the training of the model. Please note, however, that the file is still kept for document-level export in the case of ML Packages version 21.10 or earlier.
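For readers who want to see how documents are distributed across subsets now that this information lives in the JSON files, here is a minimal sketch. It assumes each JSON file in the latest folder is an object with a top-level "subset" key; the exact schema is not described in these notes, so adjust the lookup to match your exported dataset. The function name is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def count_subsets(latest_dir):
    """Count documents per subset value across the JSON files in latest_dir.

    ASSUMPTION: each file holds a JSON object with a top-level "subset" key.
    Files without that key are counted under "<missing>".
    """
    counts = Counter()
    for path in sorted(Path(latest_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            doc = json.load(handle)
        if isinstance(doc, dict):
            subset = doc.get("subset", "<missing>")
        else:
            subset = "<missing>"
        counts[subset] += 1
    return counts
```

Calling `count_subsets("path/to/dataset/latest")` returns a `Counter` keyed by whatever subset values appear in your files.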
Added the option to permanently delete individual files. This can be found in the drop-down next to the document name, alongside the download option.