Document Understanding
2022.10
Document Understanding User Guide
Last updated Mar 13, 2024

Install and Use

This page describes how to deploy and configure Document Understanding, as well as special instructions on how to use Document Understanding on Automation Suite.

Dependencies

Document Understanding has a dependency on AI Center, meaning that AI Center always needs to be installed if Document Understanding is installed.

Also, Orchestrator must be activated before using Document Understanding.

Hardware Requirements

Before you start the Document Understanding installation, make sure that your environment meets all Automation Suite hardware requirements for both single-node and multi-node deployments.

A GPU is strongly recommended for Document Understanding in one of the following scenarios:

  • If you retrain the Document Understanding models (DocumentUnderstanding - the general model, Invoices, Receipts, etc.) on AI Center.

    Training on CPU is 5-7 times slower and model performance degrades compared to training on GPU.

  • If you run UiPathDocumentOCR (non-edge version) on AI Center to process more than 2 million pages a year.

    If you do not use a GPU, slow performance may impact the product experience.

    For more details about how to provision a GPU, see Adding a dedicated agent node with GPU support.
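The two scenarios above can be expressed as a small decision helper. This is only an illustrative sketch: the function name and parameters are hypothetical, and the 2 million pages threshold is the only value taken from this page.

```python
# Threshold taken from this guide: more than 2 million pages a year
# processed by UiPathDocumentOCR (non-edge version) on AI Center.
OCR_PAGES_PER_YEAR_THRESHOLD = 2_000_000


def gpu_recommended(retrains_models: bool, ocr_pages_per_year: int) -> bool:
    """Return True if either GPU scenario described in the guide applies.

    retrains_models: whether Document Understanding models are retrained on AI Center.
    ocr_pages_per_year: expected yearly page volume for UiPathDocumentOCR on AI Center.
    """
    return retrains_models or ocr_pages_per_year > OCR_PAGES_PER_YEAR_THRESHOLD


if __name__ == "__main__":
    print(gpu_recommended(retrains_models=False, ocr_pages_per_year=500_000))
    print(gpu_recommended(retrains_models=False, ocr_pages_per_year=2_500_000))
    print(gpu_recommended(retrains_models=True, ocr_pages_per_year=0))
```

A result of False only means that neither scenario applies; it does not replace the Automation Suite hardware requirements.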

SQL Server Requirements

Document Understanding requires the FullTextSearch feature to be enabled on the SQL server. Otherwise, the installation fails without an explicit error message.
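Because the installation can fail without a clear error, it may help to verify the feature before installing. The sketch below assumes that the built-in SQL Server function FULLTEXTSERVICEPROPERTY is an adequate check for this requirement, which this page does not confirm. The helper name is hypothetical, and running the query is left to whichever SQL client you already use.

```python
from typing import Optional

# Built-in SQL Server function: returns 1 if the full-text component is
# installed on the current instance, 0 if it is not.
FULLTEXT_CHECK_QUERY = "SELECT FULLTEXTSERVICEPROPERTY('IsFullTextInstalled');"


def fulltext_installed(query_result: Optional[int]) -> bool:
    """Interpret the scalar returned by FULLTEXT_CHECK_QUERY.

    Anything other than 1 (including None for an errored or empty result)
    is treated as "not installed" so that the check fails safe.
    """
    return query_result == 1


if __name__ == "__main__":
    print("Run this query against the target SQL Server instance:")
    print(FULLTEXT_CHECK_QUERY)
```

Pass the single value returned by the query to fulltext_installed and stop the installation if it returns False.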

Online Installation

For more information on installing Document Understanding in an online environment, see the following guides:

The process is mostly the same as for other services. The only additional requirement is that both AI Center and Document Understanding are enabled.

  • If you are using the interactive installer, select both products when prompted.
  • If you are not using the interactive installer, enable AI Center and Document Understanding in the config file before installation, or in ArgoCD after installation.

    A sample configuration file is included in the Document Understanding configuration file page.
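Before running a non-interactive installation, a quick check of the config file can catch a service that was left disabled. The following is a minimal sketch: the key names aicenter and documentunderstanding, each with an enabled flag, are assumptions about the config file layout, so compare them with the sample configuration file before relying on this.

```python
import json


def disabled_services(config: dict,
                      required=("aicenter", "documentunderstanding")) -> list:
    """Return the required services that are missing or not enabled.

    The key names are assumed for illustration, not taken from this page.
    """
    return [
        name for name in required
        if not config.get(name, {}).get("enabled", False)
    ]


if __name__ == "__main__":
    # Hypothetical config fragment with Document Understanding left disabled.
    sample = json.loads(
        '{"aicenter": {"enabled": true},'
        ' "documentunderstanding": {"enabled": false}}'
    )
    print(disabled_services(sample))  # ['documentunderstanding']
```

An empty list means that both services are enabled in the parsed file.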

Offline Installation

For more information on installing Document Understanding in an offline environment, see the following guides:

The changes below are required for a successful installation of Document Understanding in an offline environment:

  1. Make sure that AI Center and Document Understanding are enabled in the config file before installation, or in ArgoCD after installation. If you want to use handwriting recognition, make sure that handwriting is also enabled in the config file.

    A sample configuration file is included in the Document Understanding configuration file page.

  2. Make sure that the Document Understanding bundle is downloaded and installed.

    For more information on downloading and installing the Document Understanding bundle, check the ML Packages Offline Installation page.

Resources

Configuration file

Check the Document Understanding configuration file here.

Access to the models

Access Form Extractor and Intelligent Keyword Classifier through the following public URLs:

  • <FQDN>/du_/svc/formextractor
  • <FQDN>/du_/svc/intelligentkeywords
Note: When using a public URL, replace the <FQDN> placeholder with the actual environment information. For example, <FQDN>/du_/svc/formextractor becomes https://servicefabricserver.domain.com/du_/svc/formextractor when used in a workflow.
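The placeholder substitution described in the note can be sketched as follows. The function and dictionary names are hypothetical, and defaulting to https when no scheme is given is an assumption based on the example in the note.

```python
# Paths taken from this page; the dictionary keys are illustrative names.
SERVICE_PATHS = {
    "form_extractor": "/du_/svc/formextractor",
    "intelligent_keyword_classifier": "/du_/svc/intelligentkeywords",
}


def service_url(fqdn: str, service: str) -> str:
    """Build the public URL of a service from the environment FQDN."""
    host = fqdn.strip().rstrip("/")
    # Assumption: the environment is served over HTTPS, as in the guide's example.
    if not host.startswith(("http://", "https://")):
        host = "https://" + host
    return host + SERVICE_PATHS[service]


if __name__ == "__main__":
    print(service_url("servicefabricserver.domain.com", "form_extractor"))
    # https://servicefabricserver.domain.com/du_/svc/formextractor
```

The resulting string is the value to use for the service URL in a workflow.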

Enable or disable Document Understanding

As a post-installation operation, you can enable or disable Document Understanding. More details can be found here.

Enable or disable OCR for Chinese, Japanese, Korean

To use the OCR for Chinese, Japanese, Korean endpoint in an offline environment, first install the offline bundle. Once the bundle is installed, enable the OCR in ArgoCD.

Note:
  • When OCR for Chinese, Japanese, Korean is used in Document Understanding, make sure that you've configured the activity with the public endpoint of the OCR, and the Document Understanding API Key.
  • OCR for Chinese, Japanese, Korean is only supported in Document Understanding deployed in Automation Suite. This is not supported in Document Understanding deployed in AI Center connected to an external Orchestrator.

To enable the OCR in ArgoCD, follow these steps:

  • Access ArgoCD.
  • Open the Document Understanding framework.
  • Click the Parameters tab and go to du-cjk-ocr.enabled.
  • Click the Edit button, set the value to TRUE, and click the Save button.
Note: The endpoint for OCR for Chinese, Japanese, Korean in an Automation Suite installation is constructed as {Cluster_FQDN}/du_/cjk-ocr/.
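As an illustration of the pattern in the note, the endpoint can be assembled from the cluster FQDN like this. The function name is hypothetical, and the https default is an assumption. The trailing slash is kept because the pattern above includes it.

```python
def cjk_ocr_endpoint(cluster_fqdn: str) -> str:
    """Return {Cluster_FQDN}/du_/cjk-ocr/ for the given cluster FQDN."""
    base = cluster_fqdn.strip().rstrip("/")
    # Assumption: default to HTTPS when the caller passes a bare host name.
    if "://" not in base:
        base = "https://" + base
    return f"{base}/du_/cjk-ocr/"


if __name__ == "__main__":
    # "cluster.example.com" is a placeholder host, not a real environment.
    print(cjk_ocr_endpoint("cluster.example.com"))
```

Use the returned value as the public OCR endpoint in the activity, together with the Document Understanding API Key.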

Troubleshooting

Check the Document Understanding-related issues here.

© 2005-2024 UiPath. All rights reserved.