Document Understanding User Guide

Last updated Apr 10, 2025

Generative features

Generative AI is a form of AI technology that uses machine learning (ML) models to generate new content, data, or information.

Large language models (LLMs) are the key to most generative AI tasks. These are ML models trained on vast amounts of text data and designed to generate human-like text. LLMs can also understand and respond to prompts by completing sentences or paragraphs in a human-like manner.

Generative annotation

Generative annotation is primarily applied during the automatic annotation of documents in the Build step. The generative models involved accelerate taxonomy design and help you train models efficiently.

Pre-annotation in Document Understanding is done using a combination of generative and specialized models, based on the document type's schema. The schema clearly defines the fields you want to extract from a particular document type.

To learn how generative annotation works and how to use it efficiently in your projects, check the Annotate documents page.

Generative extraction

Generative extraction is a feature of Document Understanding™ that uses generative AI models. These models are configured using activities and are primarily used at runtime for data extraction.

Generative extraction is capable of deciphering and extracting specific information from unstructured or semi-structured documents. For instance, it can scan through an invoice and accurately retrieve details such as the date, billed amount, and company name. This enables fast, efficient, and highly accurate information gathering from various types of documents.

Related activities

Tip: For more information on how to use generative extraction activities more efficiently, check the Generative extractor - Good practices page.
There are several activities in place to help you benefit from generative extraction features:

You can also use Document Understanding APIs to leverage generative extraction features.

Supported models

The generative extractors available under the Generative Predefined project can be used for the documents described in the following table:
Note: The Long Document Complex Layout and Short Document Complex Layout extractors are not currently available in Automation Cloud™ for Public Sector environments (FedRAMP).
Table 1. Supported scenarios for generative extractors
Long Document Simple Layout Extractor
  • Recommended scenario: Recommended for long-form documents with mostly text and headings. For example, you can use the Long Document Simple Layout Extractor on documents such as lease agreements, master service agreements, or other similar documents.
  • Provider: Azure OpenAI
  • Region availability: United Kingdom, Australia, India, Canada
  • Multi-modal support (1): not available

Long Document Complex Layout Extractor
  • Recommended scenario: Recommended for long-form documents with complex layouts, such as images, handwritten text, form elements, or distinctive layouts such as floating callout boxes. You can use this extractor on long-form documents like insurance policies, which usually have complex layouts.
  • Provider: Azure OpenAI
  • Region availability: United States, European Union, Japan, Singapore
  • Multi-modal support (1): available

Short Document Complex Layout Extractor
  • Recommended scenario: Recommended for shorter documents (of maximum 20 pages) featuring images, handwritten text, form elements, or complex layouts, such as floating callout boxes. You can use this extractor on documents like government IDs or healthcare intake forms that typically have shorter but more complex layouts.
  • Provider: Azure OpenAI
  • Region availability: United States, European Union, Japan, Singapore
  • Multi-modal support (1): available

1 Multi-modal support refers to the ability to extract data from different types of input, such as text, images, and handwritten text.
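
The recommendations in Table 1 can be read as a simple decision rule. The sketch below is only an illustration of that rule, not part of any UiPath package: the function name recommend_extractor and its parameters are hypothetical, and falling back to the Long Document Simple Layout Extractor for any document without a complex layout is an assumption, since the table does not cover short, text-only documents. Region availability still has to be checked separately.

```python
# Hypothetical helper: maps the Table 1 recommendations to a decision rule.
# Only the extractor names and the 20-page limit come from the table.

SHORT_DOC_MAX_PAGES = 20  # "shorter documents (of maximum 20 pages)"


def recommend_extractor(page_count: int, complex_layout: bool) -> str:
    """Return the name of the extractor Table 1 recommends.

    complex_layout is True when the document contains images, handwritten
    text, form elements, or layouts such as floating callout boxes.
    """
    if page_count < 1:
        raise ValueError("page_count must be at least 1")

    if not complex_layout:
        # Assumption: text-and-headings documents of any length go here.
        return "Long Document Simple Layout Extractor"

    if page_count <= SHORT_DOC_MAX_PAGES:
        return "Short Document Complex Layout Extractor"

    return "Long Document Complex Layout Extractor"


if __name__ == "__main__":
    # A 2-page government ID with photos and form fields.
    print(recommend_extractor(page_count=2, complex_layout=True))
    # A 60-page lease agreement that is mostly text and headings.
    print(recommend_extractor(page_count=60, complex_layout=False))
```

In practice, the choice also depends on whether the extractor is offered in your region and environment, as described in the note above the table.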

Generative classification

Generative classification uses AI models to automatically classify documents immediately after they are uploaded.

This automatic classification process leverages ML models to 'read' the content of a document, understand its context, and consequently classify it into predefined categories. This way, the system can handle and organize multiple types of documents efficiently.

By accurately classifying unstructured or semi-structured documents, generative classification improves the document processing workflow, saves time, and makes overall document management easier.

Related activities

Tip: For more information on how to use generative classification activities more efficiently, check the Generative classifier - Good practices page.
There are several activities in place to help you benefit from generative classification features:

You can also use Document Understanding APIs to leverage generative classification features.

Generative validation

Generative validation is a Document Understanding feature that plays an important role during the validation process. It is primarily used after the extraction step to validate extractions made using specialized models, based on their confidence score.

When an ML model's confidence score for a document extraction is low, generative validation is used to cross-check the output. This validation process involves both the specialized and generative ML models working together to ensure accuracy.

If both models yield the same output, human validation can be bypassed, which significantly shortens the document validation step. Using a secondary generative model to cross-verify the output also provides a higher level of accuracy.
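
The cross-check described above can be sketched roughly as follows. This is a minimal illustration of the decision logic, not UiPath's implementation: FieldResult, normalize, needs_human_review, and the 0.8 threshold are hypothetical names and values chosen for the example, and treating a high-confidence result as not needing a cross-check is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldResult:
    """Hypothetical container for one field extracted by a specialized model."""
    name: str
    value: str
    confidence: float  # 0.0 to 1.0


def normalize(value: str) -> str:
    """Collapse whitespace and ignore case so trivial differences still match."""
    return " ".join(value.split()).casefold()


def needs_human_review(
    specialized: FieldResult,
    generative_value: Optional[str],
    threshold: float = 0.8,  # assumed cutoff for a "low" confidence score
) -> bool:
    """Decide whether a field should go to human validation."""
    # Assumption: confident extractions are accepted without a cross-check.
    if specialized.confidence >= threshold:
        return False

    # Low confidence and no generative answer to compare against.
    if generative_value is None:
        return True

    # Low confidence: bypass human validation only if both models agree.
    return normalize(specialized.value) != normalize(generative_value)


if __name__ == "__main__":
    total = FieldResult(name="total", value="1,250.00 USD", confidence=0.42)
    print(needs_human_review(total, "1,250.00 usd"))  # models agree
    print(needs_human_review(total, "1,520.00 USD"))  # models disagree
```

The comparison here is a plain normalized string match; a real workflow would likely need field-specific comparison for dates, amounts, and similar values.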

Related activities

There are several activities in place to help you benefit from generative validation features:
  • Document Understanding activities package:
  • IntelligentOCR activities package:

You can also use Document Understanding APIs to leverage generative validation features.
