Document Understanding Release Notes
ML Packages and Public Endpoints Release Notes

Delivery: Automation Cloud, Automation Cloud Public Sector, Automation Suite, Standalone

Last updated Dec 12, 2024

ML packages and public endpoints version history

v24.11.3

UiPath Document Understanding OCR

Release date: 27 November 2024

Released in UiPath Document Understanding OCR and endpoints | v24.11.3

Improvements

In this release, we have enhanced the accuracy and performance for various text types. This includes text printed on very large or low-resolution images, as well as handwritten text.

Recognition for checkboxes, especially those represented by fully blackened squares or rectangles, is significantly improved. We have also fine-tuned signature detection.

v24.9.1

UiPath Document Understanding OCR

Release date: 3 October 2024

Released in UiPath Document Understanding OCR and endpoints | v24.9.1

Improvements

This release brings accuracy and performance improvements for handwriting recognition.

v24.7

UiPathDocumentOCR

Release date: 23 July 2024

Released in UiPath Document Understanding OCR and endpoints (including UiPath Document Understanding OCR_CPU) | v24.7

Improvements

The accuracy for the Azerbaijani language is improved by adding recognition for the əƏ characters.
The recognition and detection for Magnetic Ink Character Recognition (MICR) is improved, bringing enhanced accuracy especially for checks.
Numbers that use a space as a separator, which were previously missed in some instances, are now recognized.

Bug fixes

The confidence score for the UiPath Document Understanding OCR is improved, particularly when used on lower quality images. In workflows where confidence score is used to decide if documents need human validation in Action Center, this improvement may result in an increased number of documents undergoing validation.
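As a minimal sketch of the kind of workflow rule this change can affect, the snippet below routes a document to human validation when any recognized word falls below a confidence threshold. The `OcrWord` type, the `needs_human_validation` function, the 0.0-1.0 score range, and the 0.8 default are all assumptions made for illustration; they are not part of any UiPath API.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    text: str
    confidence: float  # assumed to be in the range 0.0-1.0


def needs_human_validation(words: list[OcrWord], threshold: float = 0.8) -> bool:
    """Return True when any word falls below the (hypothetical) threshold."""
    return any(word.confidence < threshold for word in words)


words = [OcrWord("Invoice", 0.97), OcrWord("Tota1", 0.62)]
print(needs_human_validation(words))       # True
print(needs_human_validation(words, 0.5))  # False
```

With a rule like this, a change in how confidence is scored on low-quality images shifts how many documents cross the threshold, so it may be worth re-checking the threshold after upgrading.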

v24.4.4

Data Extraction

Release date: 3 October 2024

Released in Data Extraction ML packages | v24.4.4

Bug fixes

We've fixed an issue that was causing AI Center training pipelines to report inaccurately high scores for ID Number and Phone Number field types. This ensures that the reported scores match the actual scores.
We've corrected an issue that was related to parsing values on Japanese fields when the Extended Languages OCR was in use.

v24.4.3

DocumentUnderstanding and Data Extraction

Release date: 14 August 2024

Released in endpoints + DocumentUnderstanding + Data Extraction ML packages | v24.4.3

Improvements

Improved field text formatting for Chinese, Japanese, and Korean languages when using the UiPath® Extended Languages OCR in the digitization step.

v24.4.2

InvoicesIndia and endpoints

Release date: 23 July 2024

Released in endpoints and InvoicesIndia ML package | v24.4.2

Bug fixes

We fixed an issue related to number parsing in Indian invoices.

v24.4.1

DocumentUnderstanding, InvoicesJapan, and endpoints

Release date: 20 June 2024

Released in endpoints + DocumentUnderstanding + InvoicesJapan ML package | v24.4.1

Bug fixes

We fixed an issue related to dates in column fields specifically for the Japanese language.

v24.4.0

DocumentClassifier and Data Extraction

Release date: 24 May 2024

Released in:

DocumentUnderstanding + Data Extraction ML packages | v24.4.0
DocumentClassifier ML packages | v24.4.0

What's new

The following new ML packages are now in Public Preview:

Improvements

This release also brings improvements for several other ML packages:

Accuracy for the Invoices Japan ML package is improved. There are also 11 new fields for the Invoices Japan model. For the complete list of extracted fields, check the Out-of-the-box models details file.
The performance for the Payslips model is improved.
New IDs are available for the ID Cards ML package:
- Aadhaar ID cards
- Saudi Arabian ID cards
- PAN cards
New fields are available for the UB04 ML package. For the complete list of extracted fields, check the Out-of-the-box models details file.
New fields are available for the Checks ML package. For the complete list of extracted fields, check the Out-of-the-box models details file.

Erratum - added 20 June 2024: Added information regarding a bug fix related to the parsing of Japanese dates.

Erratum - added 28 May 2024: Added more information on several improvements.

v24.3.2

DocumentUnderstandingOCR endpoints

Release date: 13 March 2024

Released in DocumentUnderstandingOCR Endpoints | v24.3.2

A new version for the Document Understanding OCR is now available for general usage.

This release brings the following improvements:

The accuracy for Turkish (TUR) is improved. The performance for characters with diacritics (such as Ç, ç, Ğ, ğ, I, ı, İ, i, Ş, ş, Ö, ö, Ü, ü) is improved.
The accuracy for Eastern-Arabic numerals (٠, ١, ٢, ٣, ٤, ٥, ٦, ٧, ٨, ٩) is improved.

v24.2.1

DocumentUnderstandingOCR endpoints

Release date: 9 February 2024

Released in DocumentUnderstandingOCR Endpoints | v24.2.1

We are excited to announce that Arabic (ARA) support for UiPath Document Understanding OCR is now in public preview.

v24.2.0

Data Extraction

Release date: 1 April 2024

Released in Data Extraction ML Packages | v24.2.0

This release brings support for the new models available in public preview:

1040 Schedule C
1040 Schedule D
1040 Schedule E
UB04

Document Classifier

Release date: 4 March 2024

Released in DocumentClassifier ML Packages | v24.2.0

This release brings support for the new models available in public preview:

1040 Schedule C
1040 Schedule D
1040 Schedule E
UB04

v23.10.5

UiPath Document Understanding OCR

Release date: 15 October 2024

Released in UiPath Document Understanding OCR and endpoints | v23.10.5

Improvements

This release brings accuracy and performance improvements for handwriting recognition.

Bug fixes

We've fixed an issue where annotation boxes were returned horizontally, even though some documents were slightly skewed, causing misalignment in the annotation.

v23.10.4

Data Extraction

Release date: 28 March 2024

Released in Data Extraction ML packages | v23.10.4

A new version for the out-of-the-box pre-trained ML packages is now available for general usage.

This release brings the following improvements:

The accuracy for Turkish (TUR) is improved. The performance for characters with diacritics (such as Ç, ç, Ğ, ğ, I, ı, İ, i, Ş, ş, Ö, ö, Ü, ü) is improved.
The accuracy for Eastern-Arabic numerals (٠, ١, ٢, ٣, ٤, ٥, ٦, ٧, ٨, ٩) is improved.
The accuracy for datasets smaller than 400 pages is improved.

v23.10.3

DocumentUnderstanding, Data Extraction, and endpoints

Release date: 12 February 2024

Released in Endpoints + DocumentUnderstanding + Data Extraction ML Packages | v23.10.3

A new version for all out-of-the-box pre-trained ML packages part of AI Center is now available for general usage.

This new version brings a bug fix related to the extraction of bidirectional (left-to-right and right-to-left) text values.

Note: Currently, our platform is not localized for right-to-left languages (such as Hebrew or Arabic). As a result, text in those languages that is combined with punctuation marks or special characters is not displayed correctly in the annotation interface (Document Manager) or the validation interface (Validation Station in Action Center). However, if the string values are entered into an application that has right-to-left reading mode enabled, the text should be displayed correctly. A typical example is Notepad with right-to-left reading order enabled.

v23.10.2

DocumentUnderstanding and Data Extraction

Release date: 23 January 2024

Released in DocumentUnderstanding + Data Extraction ML packages | v23.10.2

A new version for all out-of-the-box pre-trained ML packages is now available for general usage.

This release brings a bug fix that occasionally caused training to fail.

v23.10.0

DocumentUnderstanding, Data Extraction, and endpoints

Release date: 26 October 2023

Released in Endpoints + DocumentUnderstanding + Data Extraction ML packages | v23.10.0

A new version for all out-of-the-box pre-trained ML packages is now available for general usage.

We are constantly working to improve your Document Understanding experience. For this release, we made sure to bring minor security and stability improvements to our product.

UiPath Document Understanding OCR

Release date: 2 October 2023

Released in UiPath Document Understanding OCR | v23.10

We are excited to announce that Hebrew (HEB) is now supported by UiPath Document Understanding OCR.

v23.7.0

DocumentUnderstanding and Data Extraction

Release date: 3 August 2023

Released in DocumentUnderstanding + Data Extraction ML packages | v23.7.0

In documents where a table runs across many pages, a table row (a line item) can be split across two pages, and in some cases more. Previous model versions assumed that each page break was also a row break, so they broke such items into multiple pieces. The current model version fixes this issue. To benefit from this fix in a workflow, you need to use the DocumentUnderstanding.ML.Activities package version 1.23.0-preview and the 23.7.0 model version in that workflow.
Models now have a faster prediction time per page, and use RAM more efficiently, allowing processing of larger documents.

v23.6.0

DocumentUnderstanding and endpoints

Release date: 13 June 2023

Released in DocumentUnderstanding + endpoints | v23.6.0

We've improved the accuracy of the UiPathDocumentOCR ML package.

v23.4.1

DocumentUnderstanding, Data Extraction, and endpoints

Release date: 23 May 2023

Released in DocumentUnderstanding + Data Extraction ML packages | v23.4.1

We've fixed an issue that was affecting the model training.

v23.4.5

DocumentUnderstanding

Release date: 21 April 2023

Released in DocumentUnderstanding | v23.4.5

We've improved the general typed text model and enhanced the checkbox recognition functionality.

v23.4.2

DocumentUnderstanding

Release date: 24 March 2023

Released in DocumentUnderstanding | v23.4.2

The UiPath Document OCR public endpoint has been updated and now provides handwriting language support for German and French, and print language support for Danish, Finnish, Norwegian, and Swedish. Here's the complete list of the new supported languages: Danish, Swedish, Norwegian, Finnish, Polish, Hungarian, Czech, Slovakian, Estonian, Latvian, Lithuanian, Slovenian, Croatian, Serbian, Turkish.

v23.4.0

DocumentUnderstanding, Data Extraction, and endpoints

Release date: 10 May 2023

Released in DocumentUnderstanding + Data Extraction ML packages | v23.4.0

The UiPath Document OCR is now available as an out-of-the-box pre-trained package, and it is available for both GPU and CPU usage. This enables customers who prefer to avoid using public endpoints to deploy UiPath Document OCR in their own tenants, in an isolated environment.

Seven new out-of-the-box pre-trained ML packages are now available for general usage:

Certificate of incorporation/Good Standing
Certificate of Origin
Children Product Certificate
CMS1500
EU Declaration of Conformity
Invoices Shipping
Pay slips

DocumentClassifier and endpoints

Release date: 26 April 2023

Released in Endpoints + DocumentClassifier ML packages | v23.4.0

We've added new document types to the DocumentClassifier ML Package, made general improvements, and fixed some small bugs.

v23.2.0

DocumentUnderstanding, Data Extraction, and endpoints

Release date: 23 February 2023

Released in Endpoints + DocumentUnderstanding + Data Extraction ML packages | v23.2.0

What's new & improvements

A new version of the out-of-the-box pre-trained ML packages (23.1.0) and their public endpoints has been released. It now uses a cutting-edge LayoutLM Transformers-based architecture, which is more powerful and increases accuracy overall, especially on column fields (tables).

This improvement has made the out-of-the-box pre-trained ML packages more powerful, meaning that you may experience longer latency for training and for predictions.

For all situations where latency is critical (for example, attended scenarios), we recommend deploying the models as ML Skills using a GPU.

We have improved how the scores are calculated after Training/Evaluation/Full pipelines to provide a separate score for each column field. Before this improvement, F1 scores were calculated as a whole, for all column fields taken together.
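To illustrate why a separate score per column field is more informative than one score for all column fields taken together, here is a small sketch using the standard F1 definition. The field names and counts are made up, and how the pipelines aggregate their scores internally is not described here, so treat the pooled calculation as one plausible reading.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical per-field counts: (true positives, false positives, false negatives)
counts = {
    "description": (90, 5, 5),
    "quantity": (40, 30, 30),
}

# One score per column field.
per_field = {name: f1_score(*c) for name, c in counts.items()}

# Pooled score: sum the counts over all column fields first, then compute one F1.
tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
pooled = f1_score(tp, fp, fn)

for name, score in per_field.items():
    print(f"{name}: {score:.3f}")
print(f"pooled: {pooled:.3f}")
```

In this made-up case the pooled value sits between the two per-field values and hides the fact that one field performs much worse than the other.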

An upcoming removal is announced for the Manual edits feature used in the model evaluation. More information here.

Known issues

The project import from AI Center is currently disabled. We are actively working on this and expect to have it reenabled by the end of March.

Erratum 8 May 2023

Known issue

A Fatal Python error: Segmentation fault is received when running a Full or Training Pipeline. We recommend using the ML Packages with v23.4 until this bug is fixed.

Erratum 20 April 2023

The overall score for all pipelines is now an accuracy score. Previously, it was an F1 score. The evaluation artifacts in AI Center still contain both accuracy and F1 score, for backwards comparability.

v23.1.0

DocumentClassifier and endpoints

Release date: 11 January 2023

Released in Endpoints and DocumentClassifier | v23.1.0

We have improved the F1 scores and they are now also displayed for Training pipelines.

The Artifacts folder has an updated list of artifacts.

The DocumentClassifier model now predicts 25 classes, instead of 26, due to the removal of the Delivery Notes class.

v22.12.2

Endpoints

Release date: 16 December 2022

Released in endpoints | v22.12.2

The UiPath Document OCR public endpoint has been updated and now provides handwriting language support for German and French, and print language support for Danish, Finnish, Norwegian, and Swedish.

v22.11.0

Document Understanding, Data Extraction, and endpoints

Release date: 13 December 2022

Released in endpoints + DocumentUnderstanding + Data Extraction ML packages | v22.11.0

This release brings significant improvements to the public endpoints of the out-of-the-box pre-trained ML packages, meaning that we are now using the latest LayoutLM based Deep Learning architecture.

This improvement provides better accuracy on all document types, especially for the Invoices model, and it also improves the accuracy on column fields and tables.

We added new extracted fields to the Invoices model: Shipping Date, Vendor email address, Bank name, Bank account number, IBAN, SWIFT Code, Bank Address, Bank Routing number, and Tax rate. You can check the list of extracted fields by accessing this page and clicking on the link available for each model.

Model scores are now returned by Training pipelines too, not only by Full or Evaluation pipelines.

F1 scores are now available for each column field. Until now, F1 scores were available only for all column fields taken together.

v22.10.2

Endpoints

Release date: 3 February 2023

Released in endpoints | v22.10.2

We've updated the public endpoints of the out-of-the-box pre-trained ML packages, which now use a cutting-edge LayoutLM Transformers-based architecture.

v22.10.0

DocumentUnderstanding, Data Extraction, and endpoints

Release date: 7 October 2022

Released in endpoints + DocumentUnderstanding + Data Extraction ML packages | v22.10.0

What's new & improvements

The following pretrained models are now listed as official, without the -Preview tag: InvoicesAustralia, InvoicesIndia, PurchaseOrders.

The DeliveryNotes model has been renamed as BillsOfLading.

Ten new pretrained models are now available: Acord25, 1040, Checks, Bank Statements, Financial statements, Packing Lists, Acord131, Acord126, Acord140, Vehicle Titles.

Bug fixes

Several bug fixes have been made to the above mentioned packages.

UiPath DocumentOCR

Release date: 4 October 2022

Released in UiPathDocumentOCR | v22.10.0 Cloud

A new feature is now available for barcodes and QR codes detection.

Accuracy improvements have been made on long strings like email addresses and URLs, on fixed width fonts, and on handwriting and signatures detection.

Page rotation detection has also been improved.

v22.6.1-preview

DocumentUnderstanding, Data Extraction, and endpoints

Release date: 10 October 2022

Released in endpoints + DocumentUnderstanding + Data Extraction ML packages | v22.6.1-preview

This release brings several bug fixes to the DocumentUnderstanding and Data Extraction packages and endpoints.

v22.6.0-preview

DocumentUnderstanding and Data Extraction

Release date: 6 September 2022

Released in DocumentUnderstanding + Data Extraction ML packages | v22.6.0-preview

There are 18 new Preview ML packages available with a more advanced model architecture for our DU ML Packages in AI Center. You can identify them by the Preview suffix attached to the end of the package name, for example InvoicesPreview, PurchaseOrderPreview, Acord125Preview, and so on.

We've updated the public endpoints list with all the new Preview ML packages. The list can be consulted on the Public Endpoints page.

Note that these preview models don't consume DU/AI units from your licensing entitlement.

Fixed a bug in private skill usage: a private skill can now be used only with an API key that belongs to the same organization that is using the AI Center instance.

v22.5.2

DocumentUnderstanding and Data Extraction

Release date: 22 July 2022

Released in DocumentUnderstanding + Data Extraction ML packages | v22.5.2

Bug fixes

This hotfix stabilizes the items splitting by combining the eol classifier and line_detection methods into a single method.

Known issue

There is a known issue for the Invoices package that occasionally leads to an error when trying to run an auto-fine-tuning loop in AI Center.

v22.5.1

DocumentUnderstanding, DocumentClassifier, and Data Extraction

Release date: 18 July 2022

Released in DocumentUnderstanding + DocumentClassifier + Data Extraction ML packages | v22.5.1

Bug fixes

Fixed a bug that was causing the extracted fields to be shown on the wrong page in Validation Station.
Fixed a bug that was causing the last line of text on some pages to not be digitized in Document Manager.
Fixed a bug that was preventing displaying some F1 score items from the evaluation_F1_invoices.txt file in Full/Evaluation pipelines in AI Center.
Fixed a bug that was causing the wrong overall F1 score to be calculated in evaluation_F1_invoices.txt file in Full/Evaluation pipelines in AI Center whenever a model had only column fields.

v22.5.0

AI Center cloud, Data Extraction

Release date: 16 June 2022

Released in AI Center Cloud, Data Extraction ML packages | v22.5.0

Improvements

Performance has been improved for all Data Extraction ML packages.

v22.4.3

DocumentUnderstanding and Data Extraction

Release date: 21 July 2022

Released in DocumentUnderstanding + Data Extraction ML packages | v22.4.3

This hotfix stabilizes the items splitting by combining the eol classifier and line_detection methods into a single method.

v22.4.2

DocumentUnderstanding, DocumentClassifier, and Data Extraction

Release date: 14 July 2022

Released in DocumentUnderstanding + DocumentClassifier + Data Extraction ML packages | v22.4.2

Bug fixes

Fixed a bug that was causing the extracted fields to be shown on the wrong page in Validation Station.
Fixed a bug that was causing the last line of text on some pages to not be digitized in Document Manager.
Fixed a bug that was preventing displaying some F1 score items from the evaluation_F1_invoices.txt file in Full/Evaluation pipelines in AI Center.
Fixed a bug that was causing the wrong overall F1 score to be calculated in evaluation_F1_invoices.txt file in Full/Evaluation pipelines in AI Center whenever a model had only column fields.

v22.4.1

AI Center cloud, Data Extraction

Release date: 3 June 2022

Released in AI Center Cloud, Data Extraction ML packages | v22.4.1

Bug fixes

Fixed a bug occurring when running an evaluation pipeline on a model trained with the special line_detection mode, causing predictions to be different than when called from the ML Skill.

v22.4.0

DocumentUnderstanding, DocumentClassifier, and Data Extraction

Release date: 10 May 2022

Released in DocumentUnderstanding + DocumentClassifier + Data Extraction ML packages | v22.4.0

What's new

Handwriting capabilities are now available for the UiPathDocumentOCR and the UiPathDocumentOCR_CPU packages, by integrating the HandwritingRecognitionOCR. The same capabilities can be found in the UiPath.OCR.LocalServer Studio package.

New architecture on extraction ML packages, with major benefits, especially to models trained using the DocumentUnderstanding ML package.

Utility Bills, W9, and Passports ML Packages are now available as GA. Five new out-of-the-box pre-trained ML packages are now available in -Preview to ease your work.


Document Search is a new feature available in Document Manager facilitating labelling documents with a high number of pages.

Improvements

Improvements have been made to the ML packages for document extraction in AI Center. The Evaluation Excel spreadsheet has received new sheets, allowing you to better organize and interpret the evaluated data.

ML Packages in Automation Suite offline installation have received a new offline bundle.

Accuracy and performance have been improved for the UiPathDocumentOCR.

Bug fixes

Multiple fixes on parsing date fields, including dates in Column fields, dates in Turkish documents, and dates far into the future.

v22.2.3

UiPathDocumentUnderstandingOCR

Release date: 7 March 2022

Released in UiPathDocumentOCR | v22.2.3

Superior capability

Integrated HandwritingRecognitionOCR into UiPathDocumentOCR. In many cases, there is a mix of fields. By integrating the handwriting reading capability, we are able to apply the correct recognition to each field: print recognition to print text, and handwriting recognition to handwritten text.

Although HandwritingRecognitionOCR can detect any handwriting, please know that it is trained and optimized only for English.

v22.1.6

DocumentUnderstanding, DocumentClassifier, and Data Extraction

Release date: 14 March 2022

Released in DocumentUnderstanding + DocumentClassifier + Data Extraction ML packages | v22.1.6

Bug fixes

Fixed a bug that was causing a training pipeline or a full pipeline in AI Center to fail due to an ML package issue in data pre-processing for an empty line.

v22.1.4

DocumentUnderstanding, DocumentClassifier, and Data Extraction

Release date: 2 March 2022

Released in DocumentUnderstanding + DocumentClassifier + Data Extraction ML packages | v22.1.4

What's new

The Utility Bills ML package is now generally available.

Improvements

Overall improved performance and scalability.

Significant improvements on scores when training on the new version of the DocumentUnderstanding ML package as compared to previous versions.

Dates in column fields are now parsed correctly.

Date parsing now recognizes Turkish month names.

Changes

Changed the behavior for Training Pipelines and Full Pipelines when training on GPU versus on CPU. The 21.10.x models trained on CPUs were smaller, so they trained faster than the previous versions, while having slightly lower accuracy than before.

This behavior has been reversed with this release, so the model being trained on GPU and on CPU is the exact same model, and the training speed has reverted to what it was before 2021.10, which means training on CPU is again 10-20X slower than on GPU.

v21.10.11

Data Extraction

Release date: 23 November 2021

Released in Data Extraction ML packages | v21.10.11

Fixed a bug that was causing the Training and Evaluation Pipelines to fail due to date post-processing logic.

v21.10.9

Data Extraction

Release date: 24 November 2021

Released in Data Extraction ML packages | v21.10.9

Fixed a bug that was throwing a prediction error at runtime.

Data Extraction and endpoints

Release date: 22 October 2021

Released in Data Extraction ML packages and endpoints | v21.10.9

What's new

The PurchaseOrders ML package is now Generally Available and it is ready to be used in your production scenarios.

InvoicesChina, DeliveryNotes, RemittanceAdvices, W2, and W9 ML packages are now in Public Preview. We recommend you check out these packages and start using them for the type of documents you need to process.

Improvements

Implemented document-level evaluation. This is representative of the runtime performance in your RPA workflow.

Evaluation can also be done on datasets with fewer fields than the ML Package being evaluated. This facilitates evaluation on out-of-the-box pre-trained ML Packages.

To assess the impact OCR has on extraction accuracy, you can now rerun it when running an Evaluation Pipeline. This requires OCR to be configured when creating an ML Package and the Environment Variable eval.redo_ocr needs to be set to true in the AI Center Evaluation Pipeline.

Training on CPU now uses a smaller model to obtain a 5x-7x speedup. However, you should expect a lower accuracy by 0-5% on CPU.

Added Minimum Confidence and Straight Through Processing Rate columns to the Evaluation.xlsx files produced by Evaluation Pipelines.

The UtilityBills ML Package has been substantially improved.

Address parsing improvement for addresses which skip 1-2 lines of text.

Improvement on extracting negative values, very large values (11 digits or more), or dates far into the future.

Added support for rotated boxes on receipts.

Concatenated spans enhancement.

Bug fixes

Fixed a bug that was not returning special characters in String type fields.
Fixed a bug for the Passports ML Package where the date written as an ordinal number (1st, 2nd, 3rd, 4th, etc.) was not parsed correctly.
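For readers who post-process date strings themselves, the ordinal case mentioned above can be handled with a small normalization step before parsing. This is an illustrative sketch only and not the ML package's own logic; `parse_ordinal_date` is a hypothetical helper, and `%B` month names follow the active locale, so English month names are assumed.

```python
import re
from datetime import date, datetime

# Matches a day number followed by an English ordinal suffix, e.g. "1st", "22nd".
_ORDINAL = re.compile(r"\b(\d{1,2})(st|nd|rd|th)\b", flags=re.IGNORECASE)


def parse_ordinal_date(text: str, fmt: str = "%d %B %Y") -> date:
    """Strip ordinal suffixes from the day, then parse with the given format."""
    cleaned = _ORDINAL.sub(r"\1", text)
    return datetime.strptime(cleaned, fmt).date()


print(parse_ordinal_date("3rd March 2021"))                 # 2021-03-03
print(parse_ordinal_date("March 21st, 2021", "%B %d, %Y"))  # 2021-03-21
```

Because the pattern requires digits directly before the suffix, month names such as "August" are left untouched.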

Known issues

Retraining InvoicesJapan and InvoicesChina ML Packages using data from Validation Station is currently not supported. As a workaround, please use Google Cloud Vision OCR.

Upcoming deprecations

All public endpoints, except for UiPathDocumentOCR, FormExtractor, IntelligentFormExtractor, and IntelligentKeywordClassifier, are going to be deprecated for non-West Europe regions starting with December 1, 2021.

v21.10.5

UiPathDocumentOCR endpoints

Release date: 13 December 2021

Released in UiPathDocumentOCR endpoints | v21.10.5

Improvements

UiPathDocumentOCR is now also available in the Singapore region.

Public Endpoints

v21.10.1

Data Extraction and endpoints for UiPathDocumentOCR

Release date: 24 September 2021

Released in Data Extraction and endpoints for UiPathDocumentOCR | v21.10.1

Improvements

Added support for rotated text, even if the rotation is at different angles for each word.

Added support for vertical text. This improvement is available at the moment only for UiPath.IntelligentOCR.Activities, including Validation Station. Data Manager and Machine Learning Extractor do not support vertical text yet.

Accuracy improvement on noisy images or photos: for example, Receipts, ID Cards, or Passports.

v21.10

FormExtractor, IntelligentFormExtractor, and IntelligentKeywordClassifier endpoints

Release date: 13 December 2021

Released FormExtractor + IntelligentFormExtractor + IntelligentKeywordClassifier in Endpoints | v21.10

Improvements

Form Extractor, Intelligent Form Extractor, and Intelligent Keyword Classifier are now also available in the Singapore region.

Public Endpoints

v21.7

Data Extraction and endpoints for Handwriting Recognition

Release date: 11 August 2021

Released in Data Extraction and endpoints for Handwriting Recognition | v21.7

Improvements

Ability to deal with multiple shreds in a single call to the model.

Model retraining and a few other changes for better model accuracy.

Bug fixes

Fixed a bug that caused the pod to restart when there was no memory left.

v21.6.3

UiPathDocumentOCR in endpoints

Release date: 9 June 2021

Released in endpoints for UiPathDocumentOCR | v21.6.3

Improvements

Improved single-digit detection.

Improved accuracy on 1, I and l characters.

Improved detection of text when close together.

v21.5.5

Data Extraction and endpoints

Release date: 18 June 2021

Released in endpoints and Data Extraction ML packages | v21.5.5

Fixed a bug that caused prediction differences between Data Manager and the Digitize Document activity.

v21.5.3

Data Extraction and endpoints

Release date: 8 June 2021

Released in endpoints and Data Extraction ML packages | v21.5.3

What's new

For hard-to-read images, as in the case of ID Cards and Passports, two new corresponding pre-trained out-of-the-box packages have been released.

Improvements

Incorporated retrainable classification fields in our pre-trained Out Of the Box Packages.

v21.4.7

Data Extraction and endpoints

Release date: 20 April 2021

Released in endpoints and Data Extraction ML packages | v21.4.7

Improved date parsing for Data Extraction ML packages.

v21.4.5

Data Extraction and endpoints

Release date: 15 April 2021

Released in endpoints and Data Extraction ML packages | v21.4.5

What's new

Deployed all public endpoints in United States Region.

Deployed public endpoints for Form Extractor, Intelligent Form Extractor, and Intelligent Keyword Classifier in Canada and Japan Regions.

v21.4

Data Extraction and endpoints for HandwritingRecognition and DocumentClassifier

Release date: 9 March 2021

Released in Data Extraction ML packages & endpoints for HandwritingRecognition, DocumentClassifier, + Standalone Docker for UiPathDocumentOCR | v21.4

What's new

HandwritingRecognition reaches general availability, with improved recognition through spelling corrections and the ability to read machine-printed text.

DocumentClassifier reaches general availability as well.

Improvements on UiPathDocumentOCR for:

Radio buttons/checkbox detection
Accuracy on bubble forms
General accuracy

v21.1.8

Data Extraction and endpoints

Release date: 17 February 2021

Released in endpoints and Data Extraction ML packages | v21.1.8

Improvements

Improved accuracy.

InvoicesIndia and InvoicesAustralia are now generally available.

Deployed public endpoints in Australia Region.

The edition argument is no longer necessary in endpoint URLs. For example, https://du.uipath.com/ie/invoices will work for both enterprise and community traffic.

v20.11.3

Data Extraction

Release date: 18 December 2020

Released in Data Extraction ML packages | v20.11.3

Improvements

Improvements to CPU training to be faster and require less memory.

Date parsing improvements for non-US documents.

Checkbox recognition for UiPathDocumentOCR, including printed or handwritten checkboxes.

v20.10.4

Data Extraction and endpoints

Release date: 10 November 2020

Released in endpoints and Data Extraction ML packages | v20.10.4

New features and improvements

A new model for Japanese Invoices.

Evaluation pipelines now return metrics for Classification fields too.

Support for Microsoft Read OCR version 3.

Improvements to date formatting/parsing for detecting day/month/year versus month/day/year formats.

Improvements to decimal point and thousands separators detections for correct number parsing.

Training on CPU is supported in all versions of AI Fabric.

Improved parsing of fields with content-type id-no.

Support for training Classification fields only (no Regular or Column fields).

Increased the maximum number of allowed fields from 32 to 40.

Report confidence levels for Column fields.
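The decimal point and thousands separator detection mentioned in the list above addresses a problem that can be sketched with a simple heuristic. The code below is not the ML package's implementation; `parse_amount` and its rules are assumptions chosen for illustration. In particular, a single separator followed by exactly three digits is treated as a thousands separator, which is a guess that will be wrong for some inputs.

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Parse a number whose decimal and thousands separators are unknown."""
    # Spaces (including non-breaking spaces) are treated as grouping only.
    s = text.strip().replace(" ", "").replace("\u00a0", "")
    last_dot, last_comma = s.rfind("."), s.rfind(",")
    if last_dot != -1 and last_comma != -1:
        # Both present: the rightmost one is taken to be the decimal separator.
        dec, thou = (".", ",") if last_dot > last_comma else (",", ".")
        s = s.replace(thou, "").replace(dec, ".")
    elif last_dot != -1 or last_comma != -1:
        sep = "." if last_dot != -1 else ","
        digits_after = len(s) - s.rfind(sep) - 1
        if s.count(sep) > 1 or digits_after == 3:
            # Repeated, or followed by exactly three digits: assume thousands.
            s = s.replace(sep, "")
        else:
            s = s.replace(sep, ".")
    return Decimal(s)


for raw in ["1,234.56", "1.234,56", "1 234,56", "1,234,567", "12,5"]:
    print(raw, "->", parse_amount(raw))
```

A production parser would normally also use context, such as the document language or the other amounts on the same page, rather than deciding from a single string.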

Known issues

When creating a UiPath.DocumentUnderstanding.ML.Activities package in AI Center, the package name should not be a Python reserved keyword, such as class, break, from, finally, global, None, etc. Note that this list is not exhaustive, since the package name is used for class <pkg-name> and import <pkg-name>.
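A candidate package name can be checked ahead of time with Python's standard `keyword` module. The sketch below assumes that being a valid, non-reserved Python identifier is sufficient; `is_safe_package_name` is a hypothetical helper, and AI Center may apply additional naming rules that are not covered here.

```python
import keyword


def is_safe_package_name(name: str) -> bool:
    """True when `name` can be used in `class <name>` and `import <name>`."""
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["invoices_v2", "class", "None", "2021_model"]:
    print(candidate, is_safe_package_name(candidate))
```

Here `class` and `None` are rejected as reserved keywords, and `2021_model` is rejected because an identifier cannot start with a digit.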
