2022.4
Automation Suite Installation Guide
Last updated Nov 21, 2024

Document Understanding configuration file

documentunderstanding is a property in the Automation Suite configuration file, cluster_config.json. It contains configurable values that control the behavior of the Document Understanding service. The installer generates the default values. To change any setting related to Document Understanding, edit the documentunderstanding section in cluster_config.json and re-run the installer.

Alternatively, the same changes can be made in the UiPath app in ArgoCD.

cluster_config.json

Document Understanding config

"documentunderstanding": {
  "enabled": Boolean,
  "datamanager": {
    "sql_connection_str": "String"
  },
  "handwriting": {
    "enabled": Boolean,
    "max_cpu_per_pod": "Number"
  }
}
Note:

The data manager SQL connection string is optional. Set it only if you want to override the default database with your own.

Handwriting is always enabled for online installation.

The full config example

"documentunderstanding": {
  "enabled": true,
  "datamanager": {
    "sql_connection_str": "mssql+pyodbc://testadmin:myPassword@mydev-sql.database.windows.net:1433/datamanager?driver=ODBC+Driver+17+for+SQL+Server"
  },
  "handwriting": {
    "enabled": true,
    "max_cpu_per_pod": "2"
  }
}
Note: max_cpu_per_pod defaults to 2, but you can adjust it according to your needs. For more information on how to do this, see the (optional) max CPU per pod parameter section.
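
As an illustration of editing the documentunderstanding section before re-running the installer, the following is a minimal Python sketch. It assumes that cluster_config.json is a JSON object with documentunderstanding as a top-level property; the function name set_document_understanding and its parameters are hypothetical and not part of the installer.

```python
import json
from pathlib import Path


def set_document_understanding(config_path, enabled=True, handwriting_enabled=True,
                               max_cpu_per_pod="2", sql_connection_str=None):
    """Update the documentunderstanding section of a cluster_config.json file.

    Hypothetical helper: assumes documentunderstanding sits at the top level
    of the JSON object. All other properties in the file are left untouched.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    du = config.setdefault("documentunderstanding", {})
    du["enabled"] = enabled

    handwriting = du.setdefault("handwriting", {})
    handwriting["enabled"] = handwriting_enabled
    # The examples on this page show max_cpu_per_pod as a quoted string.
    handwriting["max_cpu_per_pod"] = str(max_cpu_per_pod)

    # Only override the connection string when one is explicitly supplied;
    # otherwise keep whatever the installer generated.
    if sql_connection_str is not None:
        du.setdefault("datamanager", {})["sql_connection_str"] = sql_connection_str

    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return du
```

Calling set_document_understanding("cluster_config.json", max_cpu_per_pod=4) would rewrite only the handwriting and enabled values. The change takes effect only after the installer is re-run, as described at the top of this page.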

Configurable values

datamanager.sql_connection_str

  • Connection string for datamanager
  • Required: False.
  • The installer generates and populates this property. You do not need to set it unless you want to override the default connection string. For more details about connecting to SQL, refer to the Database configuration page.
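
If you do override the connection string, a small helper can assemble it in the same shape as the example on this page. The sketch below is an assumption-laden illustration: the function name build_datamanager_connection_str is hypothetical, and percent-encoding the user name and password follows the general convention for URL-style connection strings rather than anything this page states.

```python
from urllib.parse import quote_plus


def build_datamanager_connection_str(user, password, host, database,
                                     port=1433,
                                     driver="ODBC Driver 17 for SQL Server"):
    """Build a URL-style connection string shaped like the example on this page.

    Hypothetical helper. Encoding the credentials is an assumption based on
    common URL conventions, so characters such as '@' or '/' in a password
    do not get mistaken for URL delimiters.
    """
    return (
        f"mssql+pyodbc://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}?driver={quote_plus(driver)}"
    )
```

With the values from the full config example (testadmin, myPassword, mydev-sql.database.windows.net, datamanager), this produces the same string shown there.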

handwriting

  • Settings for the handwriting recognition functionality (part of IntelligentFormExtractor)
  • Required: False.

handwriting.enabled

  • Setting this to true creates the resources necessary for performing handwriting recognition. This needs to be true to use IntelligentFormExtractor.
  • Required: False.
  • This property is always enabled for online installation, and disabled for offline (air-gapped) installation. For air-gapped installation, you need to install the Document Understanding offline bundle before enabling handwriting.

handwriting.max_cpu_per_pod

  • The maximum number of CPUs each container is allowed to use. The recommended value is 2.
  • Required: False.
  • Default: 2.

If you plan to use Intelligent Form Extractor with the handwriting detection feature, you may need to adjust the handwriting.max_cpu_per_pod parameter for more processing power.

The following factors are required to calculate the right sizing:

  • total volume of documents per year = V
  • expected number of handwriting shreds per document = S
  • days per year in which the workflow processes documents (workdays, all days, weekends, etc.) = d
  • hours per day in which the workflow processes documents = h

The number of CPUs is then calculated as follows:

Number of CPUs = (V x S / (d x h)) / 1500

As an example, if you expect to process 1 million documents per year using Intelligent Form Extractor for handwriting detection, with 50 shreds per document on average, running on weekdays (250 days per year) from 00:00 to 08:00 (8 hours per day), the calculation would be:

Number of CPUs = (1,000,000 x 50 / (250 x 8)) / 1500
               = 25,000 / 1500
               = 17 CPUs (16.67, rounded up)
For the single-node evaluation mode, set the max_cpu_per_pod parameter to 17.
For the multi-node HA-ready production mode (3 nodes), set the max_cpu_per_pod parameter to 5 or 6.
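
The sizing formula above can be sketched in Python as follows. The function names, the constant name SIZING_DIVISOR, and the choice to round up to a whole CPU are assumptions made for illustration. Only the formula itself and the divisor of 1500 come from this page.

```python
import math

# Divisor taken from the sizing formula on this page; the name is hypothetical.
SIZING_DIVISOR = 1500


def required_cpus(docs_per_year, shreds_per_doc, days_per_year, hours_per_day):
    """Number of CPUs = (V x S / (d x h)) / 1500, rounded up to a whole CPU."""
    shreds_per_hour = docs_per_year * shreds_per_doc / (days_per_year * hours_per_day)
    return math.ceil(shreds_per_hour / SIZING_DIVISOR)


def cpus_per_node(total_cpus, node_count):
    """Split the total evenly across nodes (unrounded, so you can pick a value)."""
    return total_cpus / node_count


if __name__ == "__main__":
    total = required_cpus(1_000_000, 50, 250, 8)
    print(total)                    # single-node value for max_cpu_per_pod
    print(cpus_per_node(total, 3))  # per-node value for a 3-node cluster
```

For the example on this page, required_cpus(1_000_000, 50, 250, 8) returns 17, and dividing that across 3 nodes gives a value between 5 and 6, which matches the multi-node recommendation.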
