Document Understanding User Guide

Last updated Dec 18, 2024

Document Understanding activities

With DocumentUnderstanding.Activities, you can manage documents using a unified approach: all information from the Document Understanding™ process is stored in a single Document Data object. DocumentUnderstanding.Activities is also integrated with Modern projects, which enables reusability.

Known limitations

The DocumentUnderstanding.Activities package currently has some limitations, which we plan to resolve soon. The following features are not yet available:

  • Support for splitting documents.
  • Business rules.
  • Training models.
  • Support for models from tenants other than where the automation is deployed.
  • Support for Automation Suite.

The sections below describe each phase of the document understanding process using Document Understanding activities.

1. Processing documents

Processing documents involves preparing the PDF files for extraction. With the Document Understanding activities, you can:

  • Extract text, images, or specific pages, and merge multiple PDFs.
  • Change the password of encrypted PDF documents.

To process PDF files with Document Understanding activities, use the following activities:

Activity | Description
Set PDF Password | Changes the password of a specified PDF file.
Merge PDFs | Joins a collection of file objects.
Get PDF Page Count | Provides the total number of pages in a PDF file.
Extract PDF Text | Extracts the text from a PDF document.
Extract PDF Images | Extracts all the images found in a PDF file.
Extract PDF Page Range | Extracts a specified range of pages from a PDF document.

2. Extracting data

Use the Extract Document Data activity to:

  • Extract data from an input file saved as a Document Data object.
  • Store the extraction results into the same Document Data object.

Document Data is a resource that serves as both an input and an output variable within your Document Understanding workflows. The Document Data object holds all the necessary information about a single document. If you classify a document, the object includes the Document Type. If you extract data, the object contains the corresponding extracted fields. Regardless of the activity, Document Data always contains the document's text and DOM (Document Object Model).
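The shape described above can be pictured with a small Python sketch. This is only a conceptual model, not the UiPath type: the class name `DocumentDataSketch`, its attribute names, and the `classify` and `extract` helpers are all hypothetical, chosen to mirror the description (text and DOM always present, Document Type added by classification, fields added by extraction).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DocumentDataSketch:
    """Hypothetical stand-in for a Document Data object (not the UiPath type)."""

    text: str                                  # always present: the document's text
    dom: Dict[str, Any]                        # always present: Document Object Model
    document_type: Optional[str] = None        # filled in by classification
    fields: Dict[str, Any] = field(default_factory=dict)  # filled in by extraction


def classify(doc: DocumentDataSketch, document_type: str) -> DocumentDataSketch:
    """Simulate classification: record the Document Type on the same object."""
    doc.document_type = document_type
    return doc


def extract(doc: DocumentDataSketch, values: Dict[str, Any]) -> DocumentDataSketch:
    """Simulate extraction: store the extracted fields on the same object."""
    doc.fields.update(values)
    return doc


# One object travels through the workflow and accumulates results.
doc = DocumentDataSketch(text="Invoice 1001 Total 42.00", dom={"pages": 1})
classify(doc, "Invoice")
extract(doc, {"invoice_number": "1001", "total": "42.00"})
print(doc.document_type, doc.fields["total"])
```

The point of the sketch is that the same object is both the input and the output of each step, so later steps can read what earlier steps produced.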

Provide the file as input only the first time you use Extract Document Data. The output, known as Document Data, should be reused throughout the workflow to prevent re-digitizing the same file, which costs 1 AI Unit per page.
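To make the cost of re-digitizing concrete, here is a rough arithmetic sketch in Python. It assumes only what is stated above (digitization costs 1 AI Unit per page) and models nothing else; the function name `digitization_cost` and the 12-page, three-step scenario are hypothetical, and any other charges are out of scope.

```python
AI_UNITS_PER_PAGE = 1  # from the guide: digitizing a file costs 1 AI Unit per page


def digitization_cost(page_count: int, digitizations: int) -> int:
    """Return the AI Units spent digitizing one file `digitizations` times."""
    if page_count < 0 or digitizations < 0:
        raise ValueError("page_count and digitizations must be non-negative")
    return page_count * digitizations * AI_UNITS_PER_PAGE


# Hypothetical scenario: a 12-page file used by three steps in a workflow.
pages = 12
steps = 3

# Passing the file itself to every step digitizes it each time.
cost_passing_file = digitization_cost(pages, steps)

# Passing the file once and reusing the Document Data digitizes it once.
cost_reusing_document_data = digitization_cost(pages, 1)

print(cost_passing_file)           # 36
print(cost_reusing_document_data)  # 12
```

Under these assumptions, reusing the Document Data object keeps the digitization cost constant no matter how many later steps consume the document.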

Visit Document Data for more details.

3. Classifying data

Use the Classify Document activity to:

  • Choose from various classification models.
  • Output the classified data into a Document Data object.

4. Validating data

In the validation step, you send the processed documents to members of your team for review in Action Center. You can configure document validation in Action Center using the following activities:

Activity | Description
Create Validation Task | Creates a validation action to suspend the workflow until it is completed.
Wait for Validation Task and Resume | Pauses the workflow until validation is complete and then resumes it automatically.
Create Validation Task and Wait | Creates an action in Action Center for visualizing and modifying extraction results, and pauses the workflow until the action completes.
Create Classification Validation Task | Creates an action to verify classified Document Data without waiting for its completion.
Create Classification Validation Task and Wait | Creates an action to verify classified data and waits for its completion before resuming the workflow.
Wait for Classification Validation Task and Resume | Waits for a Classification Validation action to complete before resuming the workflow.
