Document Understanding Activities

Last updated Apr 16, 2025

Generative Extractor

UiPath.DocumentUnderstanding.ML.Activities.GenerativeExtractor

Important:

This feature is currently part of an audit process and is not to be considered part of the FedRAMP Authorization until the review is finalized. For the full list of features currently under review, see here.

Description

Allows you to extract data from documents using generative models. The Generative Extractor cannot extract table fields as they are defined in the Taxonomy Manager, but it can extract text and tables from documents.

Tip: For good practices on how to use generative prompts, check the Generative extractor - Good practices page.

Note:

The generative models support the same languages as the OCR engine in use. For more information, check the OCR Supported languages page.

Project compatibility

Windows - Legacy | Windows

Configuration

Designer panel

Manage Field details - Select this to open the Generative Extractor Prompt wizard.

Properties panel

Authentication

The Authentication properties of this activity allow you to execute it via on-premises robots. Before configuring these properties, ensure you have fulfilled the prerequisites mentioned in the Configuring Authentication page. Once these steps are completed, you can fill in the Authentication properties of the activity.

Runtime Credentials Asset - Use this field when you need to access Document Understanding generative extraction features while the robot is connected to a local Orchestrator or to a different tenant. You can provide a Credential Asset for authentication in one of the following ways:
- From the dropdown list, select the desired Credential Asset from the Orchestrator to which the UiPath® Robot is connected.
- Manually enter the path to the Orchestrator Credential Asset where you store the external application credentials for accessing the generative features.
  The format of the path should be: <OrchestratorFolderName>/<AssetName>.
Runtime Tenant Url - Use this field together with the Runtime Credentials Asset field. Enter the URL of the tenant that the robot connects to in order to execute the generative extraction. The URL should be in the following format: https://<baseURL>/<OrganizationName>/<TenantName>.
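
As a rough illustration of the two formats above, the following Python sketch checks whether a string has the shape of an asset path or a tenant URL. The helper names (`is_asset_path`, `is_tenant_url`) and the example values are hypothetical and are not part of the activity. The sketch also assumes that the asset name is the last path segment; the activity itself may validate these values differently.

```python
from urllib.parse import urlparse


def is_asset_path(path: str) -> bool:
    """Check the shape <OrchestratorFolderName>/<AssetName>.

    Assumption: the asset name is the last segment and the folder name is
    everything before the final slash; both must be non-empty.
    """
    folder, sep, asset = path.rpartition("/")
    return bool(sep) and bool(folder.strip()) and bool(asset.strip())


def is_tenant_url(url: str) -> bool:
    """Check the shape https://<baseURL>/<OrganizationName>/<TenantName>."""
    parsed = urlparse(url)
    # Keep only non-empty path segments: organization name and tenant name.
    segments = [s for s in parsed.path.split("/") if s]
    return parsed.scheme == "https" and bool(parsed.netloc) and len(segments) == 2


# Hypothetical example values, for illustration only.
print(is_asset_path("Shared/GenerativeCredentials"))
print(is_tenant_url("https://cloud.example.com/MyOrganization/MyTenant"))
```

A check like this can catch a missing folder name or tenant segment before the workflow runs, but it does not confirm that the asset or tenant actually exists.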

Common

DisplayName - The display name of the activity.

Misc

Private - If selected, the values of variables and arguments are no longer logged at Verbose level.

Server

RetryOnFailure - If selected, the activity automatically retries the machine learning model execution to recover from transient network errors.
Timeout (milliseconds) - Specifies the amount of time (in milliseconds) to wait for a response from the server before an error is thrown. The default value is 100000 milliseconds (100 seconds).
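
To show how a timeout and a retry option typically interact, here is a minimal Python sketch. It is not the activity's implementation: the `call_with_retry` function, the `request` callable, and the `max_attempts` value are all assumptions made for illustration, since this page does not state how many retries are performed. Only the default timeout of 100000 milliseconds comes from the description above.

```python
DEFAULT_TIMEOUT_MS = 100_000  # default stated above: 100000 ms = 100 s


def call_with_retry(request, timeout_ms=DEFAULT_TIMEOUT_MS,
                    retry_on_failure=True, max_attempts=3):
    """Call `request(timeout_seconds)` and retry on transient errors.

    `request` is a hypothetical callable standing in for the server call.
    `max_attempts` is an assumed value, not a documented setting.
    """
    attempts = max_attempts if retry_on_failure else 1
    for attempt in range(1, attempts + 1):
        try:
            # Convert milliseconds to seconds for the underlying call.
            return request(timeout_ms / 1000)
        except (TimeoutError, ConnectionError):
            # Re-raise once the last allowed attempt has failed.
            if attempt == attempts:
                raise
```

With `retry_on_failure=False`, the first timeout or connection error is raised immediately, which mirrors leaving the RetryOnFailure option unselected.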

Using the Generative Extractor wizard

The Generative Extractor Prompt wizard allows you to select a specific document type and a field. You can also select an optional value to further specify the corresponding field details. Moreover, the wizard enables you to assign a different generative extractor type for each document type, allowing customization to accommodate the varying sizes and layouts of your documents.

The prompt identifies the fields to be extracted. It is provided as key-value pairs, where the key is the name of the field and the value is a description that helps the extractor identify the corresponding value. The same field details cannot be used for different fields within the same document type.
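
The key-value structure of the prompt can be pictured as a simple mapping. The Python sketch below is only an illustration: the field names, the descriptions, and the `find_duplicate_details` helper are hypothetical, and the wizard may compare field details differently (this sketch uses exact string comparison).

```python
from collections import defaultdict


def find_duplicate_details(prompt: dict[str, str]) -> dict[str, list[str]]:
    """Return field details that are used by more than one field."""
    fields_by_details = defaultdict(list)
    for field_name, details in prompt.items():
        fields_by_details[details].append(field_name)
    return {d: names for d, names in fields_by_details.items() if len(names) > 1}


# Hypothetical prompt for one document type: field name -> field details.
invoice_prompt = {
    "invoice_number": "The unique identifier printed at the top of the invoice",
    "due_date": "The date by which the payment must be made",
}

# An empty result means that no two fields share the same details.
print(find_duplicate_details(invoice_prompt))
```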

Figure 1. The Generative Extractor Prompt

Select a Document Type and Fields from the list of defined document types. The field selection is done in the Configure Extractors wizard and the prompt is defined in the Generative Extractor Prompt wizard.
Optionally, you can choose from three types of generative extractors per document type. The generative extractor options are:
- Long Document Simple Layout Extractor
- Long Document Complex Layout Extractor
- Short Document Complex Layout Extractor
Add an optional value to define the field details. This can be a short description of the document type. The maximum number of characters allowed is 1000.
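
The constraints listed above (three extractor types, both settings optional, and a 1000-character limit on the field details) can be summarized in a short Python sketch. The `validate_field_config` function and its error messages are hypothetical; they only restate the rules from this page and do not reflect how the wizard enforces them.

```python
EXTRACTOR_TYPES = (
    "Long Document Simple Layout Extractor",
    "Long Document Complex Layout Extractor",
    "Short Document Complex Layout Extractor",
)
MAX_DETAILS_LENGTH = 1000  # maximum number of characters stated above


def validate_field_config(extractor_type=None, details=None):
    """Return a list of problems; an empty list means the values look valid.

    Both arguments are optional, matching the wizard description.
    """
    errors = []
    if extractor_type is not None and extractor_type not in EXTRACTOR_TYPES:
        errors.append(f"Unknown extractor type: {extractor_type}")
    if details is not None and len(details) > MAX_DETAILS_LENGTH:
        errors.append(
            f"Field details have {len(details)} characters; "
            f"the maximum is {MAX_DETAILS_LENGTH}"
        )
    return errors


print(validate_field_config("Long Document Simple Layout Extractor",
                            "The total amount due, including tax"))
```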