Document Understanding configuration file

documentunderstanding is a property in the Automation Suite configuration file, cluster_config.json. It contains the values that control the behavior of the Document Understanding service. The installer generates the default values. To change any Document Understanding setting, edit the documentunderstanding section in cluster_config.json and re-run the installer.

Alternatively, the same changes can be made in the UiPath app in ArgoCD.

cluster_config.json

Document Understanding config

"documentunderstanding": {
    "enabled": Boolean,
    "datamanager": { 
      "sql_connection_str" : "String"
    },
    "handwriting": {
      "enabled": Boolean,
      "max_cpu_per_pod": "Number"
    }
  }

Note:

The Data Manager SQL connection string is optional. Set it only if you want to override the default database with your own.

Handwriting is always enabled for online installation.
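
The shape described above can be checked before re-running the installer. The following is a minimal sketch, not part of the installer: the function name validate_document_understanding is hypothetical, and the rule that max_cpu_per_pod must parse as a positive number is an assumption based on the "Number" placeholder in the schema.

```python
from typing import Any


def validate_document_understanding(section: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a documentunderstanding section.

    Hypothetical helper: it only mirrors the schema shown in this page.
    """
    problems: list[str] = []

    if "enabled" in section and not isinstance(section["enabled"], bool):
        problems.append("enabled must be a Boolean")

    # datamanager.sql_connection_str is optional; check it only when present.
    datamanager = section.get("datamanager", {})
    if "sql_connection_str" in datamanager and not isinstance(
        datamanager["sql_connection_str"], str
    ):
        problems.append("datamanager.sql_connection_str must be a string")

    handwriting = section.get("handwriting", {})
    if "enabled" in handwriting and not isinstance(handwriting["enabled"], bool):
        problems.append("handwriting.enabled must be a Boolean")

    if "max_cpu_per_pod" in handwriting:
        # Assumption: the value is a number, possibly quoted as in the example.
        try:
            cpus = float(handwriting["max_cpu_per_pod"])
        except (TypeError, ValueError):
            cpus = 0.0
        if cpus <= 0:
            problems.append("handwriting.max_cpu_per_pod must be a positive number")

    return problems
```

An empty list means the section matches the documented shape. It does not guarantee that the installer accepts the values.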

The full config example

"documentunderstanding": {
    "enabled": true,
    "datamanager": {
      "sql_connection_str": "mssql+pyodbc://testadmin:myPassword@mydev-sql.database.windows.net:1433/datamanager?driver=ODBC+Driver+17+for+SQL+Server"
    },
    "handwriting": {
      "enabled": true,
      "max_cpu_per_pod": "2"
    }
  }

Note: The default value of max_cpu_per_pod is 2, but you can adjust it according to your needs. For more information on how to do this, see the (optional) max CPU per pod Parameter section.
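
As an illustration of editing the section programmatically, the sketch below builds a documentunderstanding section and merges it into an already loaded cluster_config.json dictionary. The function names are hypothetical. Storing max_cpu_per_pod as a string follows the full config example, and re-running the installer afterwards is still required.

```python
import json
from typing import Any, Optional


def build_document_understanding(
    handwriting_enabled: bool = True,
    max_cpu_per_pod: int = 2,
    sql_connection_str: Optional[str] = None,
) -> dict[str, Any]:
    """Build a documentunderstanding section shaped like the example config."""
    section: dict[str, Any] = {
        "enabled": True,
        "handwriting": {
            "enabled": handwriting_enabled,
            # The example config quotes the CPU count, so keep it a string.
            "max_cpu_per_pod": str(max_cpu_per_pod),
        },
    }
    # Only override the installer-generated connection string when asked to.
    if sql_connection_str is not None:
        section["datamanager"] = {"sql_connection_str": sql_connection_str}
    return section


def apply_to_cluster_config(
    cluster_config: dict[str, Any], section: dict[str, Any]
) -> dict[str, Any]:
    """Return a copy of the config with the documentunderstanding section replaced."""
    updated = dict(cluster_config)
    updated["documentunderstanding"] = section
    return updated


if __name__ == "__main__":
    # Stand-in for the parsed contents of cluster_config.json.
    config = {"other_property": "unchanged"}
    config = apply_to_cluster_config(
        config, build_document_understanding(max_cpu_per_pod=4)
    )
    print(json.dumps(config, indent=2))
```

Leaving sql_connection_str unset keeps the datamanager key out of the section, so the default database generated by the installer is not overridden.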

Configurable values

datamanager.sql_connection_str

Connection string for datamanager
Required: False.
This property is generated and populated by the installer. You do not need to set it unless you want to override the default connection string. For more details about connecting to SQL, see the Using the configuration file page.
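
If you do override the connection string, the format used in the full config example can be assembled as in the sketch below. The helper name build_sql_connection_str is hypothetical. Percent-encoding the user name and password is an assumption that follows the usual convention for URL-style connection strings. It is not something this page states.

```python
from urllib.parse import quote_plus


def build_sql_connection_str(
    user: str,
    password: str,
    host: str,
    database: str,
    port: int = 1433,
    driver: str = "ODBC Driver 17 for SQL Server",
) -> str:
    """Assemble a connection string in the mssql+pyodbc form of the example."""
    # quote_plus escapes characters such as '@', ':' and '/' that would
    # otherwise be read as URL delimiters, and turns spaces into '+'.
    return (
        f"mssql+pyodbc://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}?driver={quote_plus(driver)}"
    )


if __name__ == "__main__":
    print(
        build_sql_connection_str(
            "testadmin", "myPassword", "mydev-sql.database.windows.net", "datamanager"
        )
    )
```

With the values from the full config example, the function reproduces the same sql_connection_str value.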

handwriting

Settings for the handwriting recognition functionality (part of IntelligentFormExtractor)
Required: False.

handwriting.enabled

Setting this to true creates the resources necessary for performing handwriting recognition. This needs to be true to use IntelligentFormExtractor.
Required: False.
This property is always enabled for online installation, and disabled for offline (air-gapped) installation. For air-gapped installation, you need to install the Document Understanding offline bundle before enabling handwriting.

handwriting.max_cpu_per_pod

The maximum number of CPUs that each container is allowed to use. The recommended value is 2.
Required: False.
Default: 2.

If you plan to use Intelligent Form Extractor with the handwriting detection feature, you may need to increase the handwriting.max_cpu_per_pod parameter to provide more processing power.

The following factors are required to calculate the right sizing:

total volume of documents per year = V
expected number of handwriting shreds per document = S
number of days per year on which the workflow processes documents (workdays, all days, weekends, etc.) = d
number of hours per day during which the workflow processes documents = h

The number of CPUs is then calculated as follows:

Number of CPUs = (V x S / (d x h)) / 1500

As an example, suppose you expect to process 1 million documents per year using Intelligent Form Extractor for handwriting detection, with 50 shreds per document on average, running on weekdays (250 days per year in this example) from 00:00 to 08:00 (8 hours per day). The calculation would be:

Number of CPUs = (1,000,000 x 50 / (250 x 8)) / 1500
               = 25,000 / 1500
               = 16.67, rounded up to 17 CPUs
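
The sizing formula can also be written as a small function. This is a sketch under stated assumptions: the function and constant names are hypothetical, and rounding up to a whole CPU is inferred from the worked example, where 16.67 becomes 17.

```python
import math

# Divisor taken as-is from the sizing formula; the constant name is hypothetical.
FORMULA_DIVISOR = 1500


def handwriting_cpus(
    docs_per_year: float,
    shreds_per_doc: float,
    days_per_year: float,
    hours_per_day: float,
) -> int:
    """Number of CPUs = (V x S / (d x h)) / 1500, rounded up to a whole CPU."""
    if days_per_year <= 0 or hours_per_day <= 0:
        raise ValueError("days_per_year and hours_per_day must be positive")
    shreds_per_hour = docs_per_year * shreds_per_doc / (days_per_year * hours_per_day)
    return math.ceil(shreds_per_hour / FORMULA_DIVISOR)


if __name__ == "__main__":
    # Values from the worked example: V = 1,000,000, S = 50, d = 250, h = 8.
    print(handwriting_cpus(1_000_000, 50, 250, 8))  # prints 17
```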