UiPath Document Understanding

DocumentUnderstanding + Data Extraction ML Packages

Release date in AI Center Cloud & Endpoints: 22 October 2021

What's New

The PurchaseOrders ML Package is now Generally Available and ready for use in production scenarios.

InvoicesChina, DeliveryNotes, RemittanceAdvices, W2, and W9 ML Packages are now in Public Preview. We recommend that you try these packages and start using them for the document types you need to process.

Improvements

Implemented document-level evaluation, which is representative of the runtime performance in your RPA workflow.

Evaluation can also be done on datasets with fewer fields than the ML Package being evaluated. This facilitates evaluation on out-of-the-box pre-trained ML Packages.

To assess the impact OCR has on extraction accuracy, you can now rerun OCR when running an Evaluation Pipeline. This has two requirements: OCR must be configured when creating the ML Package, and the Environment Variable eval.redo_ocr must be set to true in the AI Center Evaluation Pipeline.
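As a minimal sketch of the setting described above, the snippet below builds the name/value pair you would enter as an Environment Variable for the Evaluation Pipeline. Only the variable name eval.redo_ocr and the value true come from these release notes; the helper function evaluation_pipeline_env is hypothetical and is not part of any UiPath SDK.

```python
# Hypothetical helper (not a UiPath API): collects the environment variables
# to enter when creating an Evaluation Pipeline in AI Center.
def evaluation_pipeline_env(redo_ocr: bool = False) -> dict:
    env = {}
    if redo_ocr:
        # Variable name and value are taken from the release notes.
        env["eval.redo_ocr"] = "true"
    return env


print(evaluation_pipeline_env(redo_ocr=True))
```

The variable is only added when OCR should be rerun, since the release notes do not say which other values are accepted.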

Training on CPU now uses a smaller model to obtain a 5x-7x speedup. However, you should expect accuracy to be 0-5% lower on CPU.

Added Minimum Confidence and Straight Through Processing Rate columns to the Evaluation.xlsx files produced by Evaluation Pipelines.
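These release notes do not define how the two columns are computed. As a generic illustration of the concepts only, the sketch below assumes that a document's minimum confidence is the lowest confidence among its extracted fields, and that the straight-through processing rate is the share of documents whose minimum confidence meets a chosen threshold. The function names and sample values are hypothetical.

```python
# Assumption: a document is a dict mapping field name -> extraction confidence.
def min_confidence(field_confidences):
    """Lowest confidence among a document's extracted fields."""
    return min(field_confidences.values())


def stp_rate(documents, threshold):
    """Fraction of documents whose lowest field confidence meets the threshold."""
    if not documents:
        return 0.0
    passed = sum(1 for doc in documents if min_confidence(doc) >= threshold)
    return passed / len(documents)


# Made-up sample data for illustration.
docs = [
    {"total": 0.98, "date": 0.95},
    {"total": 0.99, "date": 0.60},
    {"total": 0.91, "date": 0.93},
    {"total": 0.97, "date": 0.96},
]
print(stp_rate(docs, 0.90))  # 3 of the 4 documents pass, so 0.75
```

Under these assumptions, raising the threshold lowers the rate: fewer documents would skip manual validation, but those that do carry higher confidence.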

The UtilityBills ML Package has been substantially improved.

Improved address parsing for addresses that skip 1-2 lines of text.

Improved extraction of negative values, very large values (11 digits or more), and dates far in the future.

Added support for rotated boxes on receipts.

Enhanced handling of concatenated spans.

Bug Fixes

  • Fixed a bug where special characters were not returned in String type fields.
  • Fixed a bug in the Passports ML Package where dates written with an ordinal day number (1st, 2nd, 3rd, 4th, etc.) were not parsed correctly.
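To illustrate the kind of input involved in the ordinal date fix, here is a minimal sketch of one common way to parse such dates: strip the ordinal suffix, then parse the remainder. This is not the Passports ML Package's implementation. The function name, the "day month year" layout, and English month names are all assumptions.

```python
import re
from datetime import date, datetime

# Matches "st", "nd", "rd" or "th" only when it directly follows a digit,
# so the "st" in "August" is left alone.
_ORDINAL_SUFFIX = re.compile(r"(?<=\d)(st|nd|rd|th)\b", re.IGNORECASE)


def parse_ordinal_date(text: str) -> date:
    """Parse dates such as '3rd March 2021' (assumes an English locale)."""
    cleaned = _ORDINAL_SUFFIX.sub("", text)
    return datetime.strptime(cleaned, "%d %B %Y").date()


print(parse_ordinal_date("3rd March 2021"))
print(parse_ordinal_date("21st August 2021"))
```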

Known Issues

Retraining InvoicesJapan and InvoicesChina ML Packages using data from Validation Station is currently not supported. As a workaround, please use Google Cloud Vision OCR.

Upcoming deprecations

All public endpoints, except for UiPathDocumentOCR, FormExtractor, IntelligentFormExtractor, and IntelligentKeywordClassifier, will be deprecated in non-West Europe regions starting December 1, 2021.

Public Endpoints

DocumentClassifier

Release date in AI Center Cloud: 24 November 2021

Bug Fixes

  • Fixed a bug that caused a prediction error at runtime.

Updated ML Packages

  • DocumentUnderstanding
  • DocumentClassifier
  • Invoices
  • InvoicesAustralia
  • InvoicesIndia
  • InvoicesJapan
  • InvoicesChina (Public Preview)
  • Receipts
  • PurchaseOrders (Generally Available)
  • UtilityBills
  • IDCards
  • Passports
  • RemittanceAdvices (Public Preview)
  • DeliveryNotes (Public Preview)
  • W2 (Public Preview)
  • W9 (Public Preview)
