Document Understanding User Guide

Last updated Dec 12, 2024

Forms AI

Forms AI is part of Document Understanding™ and can be used for uploading and processing structured forms with standard layouts and fields.

Create Forms AI

Forms AI is the first extraction method available in Document Understanding. Read more information about how to create a new project in Document Understanding.

Once a project is created, you need to follow the next steps for creating a document type using Forms AI within the project.

  1. Open your project.
  2. Select the New Document Type button.
  3. Add a name for your document type.

If you want to train your document classifiers straight from Document Understanding, you can use the One Click Classification functionality.

Note: Fixed layout forms used with Forms AI can each have a maximum length of five pages.

Convert a Forms AI to a semi-structured document type

You can convert a Forms AI Document Type into a Semi-Structured Document Type.

When you convert a Forms AI document type to a Semi-Structured (Document Manager) document type, you can use all the functionality available in Document Manager.

Conversion is best suited to complex scenarios in which you need to train a more powerful deep learning model.

How to convert a Forms AI session

There are two ways to convert a Forms AI session into a Document Manager session.

From the project's Document types list

You can convert a Document Type straight from the project's Document Types list.

Open the access menu of the document type you want to convert and click the Convert to Semi-Structured option. A popup window is displayed, asking you to confirm the action.

Attention: Once a Document Type has been converted, you cannot reverse the action.

From an open Forms AI session

Open an already created Forms AI session in order to convert it to a Semi-Structured session.

From the opened session, click the Access menu, then click the Convert to Semi-Structured option.

Attention: Once a Document Type has been converted, you cannot reverse the action.

The Convert to Semi-Structured button is not displayed if the project does not have an AI Center link.

Import documents

Once the new Forms AI is created, a new window opens, requiring you to import data. You can import a minimum of two documents and a maximum of twenty documents, each with a maximum of five pages. Drag and drop or browse for the files to upload them.
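If you prepare files in an automated step before uploading them, the limits above can be checked in advance. The following is a minimal sketch, not part of the product: the function name `check_import_batch` and its input (a mapping from file name to page count that you have computed yourself) are assumptions made for illustration.

```python
# Limits taken from the text above; names are hypothetical.
MIN_DOCUMENTS = 2
MAX_DOCUMENTS = 20
MAX_PAGES_PER_DOCUMENT = 5


def check_import_batch(page_counts: dict[str, int]) -> list[str]:
    """Return a list of problems; an empty list means the batch fits the limits."""
    problems = []
    if len(page_counts) < MIN_DOCUMENTS:
        problems.append(
            f"need at least {MIN_DOCUMENTS} documents, got {len(page_counts)}"
        )
    if len(page_counts) > MAX_DOCUMENTS:
        problems.append(
            f"at most {MAX_DOCUMENTS} documents allowed, got {len(page_counts)}"
        )
    for name, pages in page_counts.items():
        if pages > MAX_PAGES_PER_DOCUMENT:
            problems.append(
                f"{name}: {pages} pages exceeds the limit of {MAX_PAGES_PER_DOCUMENT}"
            )
    return problems


if __name__ == "__main__":
    print(check_import_batch({"form_a.pdf": 2, "form_b.pdf": 6}))
```

An empty result means the batch stays within the stated Forms AI import limits.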

Importing documents is another way to convert a Forms AI document type to a Semi-Structured document type. If you try to upload more than 20 documents, or if any of the documents has more than 20 pages, a popup is displayed, asking you if you want to convert the Forms AI session into a Semi-Structured one.

Automatically extracted fields should also be checked for Content Type accuracy. For example, if a date field was automatically extracted, then the Content Type should be date. Any inaccuracies should be manually corrected.

Management bar

At the top of the page you can find the management bar. The management bar enables you to perform multiple operations: navigate between documents, delete/restore a document, search/filter documents, run AI model predictions, import, and export documents.

Here are the items available in the management bar:

Item

Icon

Description

Navigation

Navigate between documents that match the active filter.

In between the two arrows, a counter is displayed. It illustrates the number of the current document out of the total number of documents that match the active search/filter.

Search and Search in document

Search - initiate a search or filter the documents. Filter is also applied when exporting documents. You can filter by words from a document or by document names.

Search in document - initiate a text search inside the document by clicking the corresponding icon or by using the shortcut Ctrl + Shift + F.

Delete / Restore

Delete or restore a document. Deleted documents can be found under the deleted filter.

Import

Open Import data dialog box.

Export

Open Export files dialog box.

Document name and type

n/a

The name of the currently active document and its type.

Download

The option is available in the drop-down next to the document name.

Click the icon to download a Zip file containing the original document. Besides the original document, all pages converted internally by Document Manager to .jpeg images are downloaded as well.

Permanently delete

The option is available in the dropdown next to the document name.

Permanently deletes individual files. The .pdf and all its .jpeg images are deleted from the AI Center dataset and all the metadata is deleted from the database.

When clicking the button, a pop-up message appears asking you if you are sure you want to permanently delete the document. Click OK to continue or Cancel to revert to the previous screen.

Predict

Run AI model predictions and display the results.

After configuring Prelabelling, the button is enabled in the management bar. Click it to prelabel the current document.

At the moment, using the Predict option with Public Endpoints prelabels only the first 10 pages of a document. This is a known issue and a fix is in progress. Using the Predict option with ML Skills in AI Center does not impose this limitation.

Publish

Publishes the Forms AI extractor and creates the associated link, available in the project's list of extractors.

Settings

Configure OCR and Prelabelling settings or access the How to... panel.

The Settings button provides the following options:

  • Settings, where you can see the OCR configuration, which is automatically populated from the project's settings.
  • Accessibility mode, which makes the raw values visible.
  • How to..., where you can find all the available shortcuts and controls.

Session

n/a

The name of the current session, found at the top of the page, next to the UiPath® Document Understanding™ logo.

The Delete and Permanently Delete options differ as follows:

  • The Delete option deletes the files, without permanently removing them from your project. You can still find the deleted files under the deleted filter from the Search bar, and restore them by using the Restore option.
  • The Permanently Delete option deletes the selected files without any possibility of restoring them.
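The difference between the two options can be pictured as a soft delete versus a hard delete. The sketch below is only a conceptual model with hypothetical names (`DocumentStore` and its methods); it does not describe how Document Understanding stores documents internally.

```python
class DocumentStore:
    """Toy model: each document has a 'deleted' flag until it is purged."""

    def __init__(self) -> None:
        self._deleted_flags: dict[str, bool] = {}

    def add(self, name: str) -> None:
        self._deleted_flags[name] = False

    def delete(self, name: str) -> None:
        # Soft delete: the document stays in the project, marked as deleted.
        self._deleted_flags[name] = True

    def restore(self, name: str) -> None:
        # Only possible because a soft delete keeps the document around.
        self._deleted_flags[name] = False

    def permanently_delete(self, name: str) -> None:
        # Hard delete: nothing is left to restore.
        del self._deleted_flags[name]

    def list_documents(self, deleted: bool = False) -> list[str]:
        """List active documents, or those under the 'deleted' filter."""
        return sorted(n for n, d in self._deleted_flags.items() if d == deleted)


if __name__ == "__main__":
    store = DocumentStore()
    store.add("form_a.pdf")
    store.add("form_b.pdf")
    store.delete("form_a.pdf")
    print(store.list_documents(deleted=True))
    store.permanently_delete("form_a.pdf")
    print(store.list_documents(deleted=True))
```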

The Settings button has two available options:

  • Settings - where you can configure the OCR service
  • How to... - which has the purpose of a help menu

Column fields

Create a new column field

  1. Click the icon in the table section at the top of the page to add a new Column field. The Create Column Field window is displayed.
  2. Fill in a unique name for the field in the Enter unique field name field. The field does not accept uppercase letters. It can only contain lowercase letters, numbers, underscore _ and dash -.
  3. Click OK.
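If you generate field names programmatically (for example, from a list of form labels), you can check them against the naming rule from step 2 before typing them in. This is a minimal sketch under the assumption that the rule is exactly "lowercase letters, numbers, underscore, and dash, and unique"; the pattern and the function name `is_valid_field_name` are illustrative, and the product's own validation may apply additional checks.

```python
import re

# Assumed pattern derived from the rule stated above.
FIELD_NAME_PATTERN = re.compile(r"[a-z0-9_-]+")


def is_valid_field_name(name: str, existing_names: set[str]) -> bool:
    """Return True if the name uses only allowed characters and is not taken."""
    return bool(FIELD_NAME_PATTERN.fullmatch(name)) and name not in existing_names


if __name__ == "__main__":
    existing = {"total"}
    for candidate in ("line_amount", "LineAmount", "total", "po-number2"):
        print(candidate, is_valid_field_name(candidate, existing))
```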

Edit a column field

Click the Edit field button. The available options for column fields can be found in the table below.

Option

Description

Field name

The unique name for the field.

The field does not accept uppercase letters. It can only contain lowercase letters, numbers, underscore _ and dash -.

Content type

The content type of a field:

  • string: appropriate for company names or addresses, as well as payment terms, or for any other field where the RPA developer prefers to build the parsing or formatting logic manually, in the RPA workflow.
  • number: appropriate for amounts or quantities, with intelligent parsing of the decimal/thousands separators.
  • date: the model parses, formats and unifies the output in a yyyy-mm-dd format.
  • phone: appropriate for phone numbers. Formatting removes letters and parentheses, and replaces spaces with dashes.
  • id-no: appropriate for alphanumeric codes and ID numbers. It is similar to the string content type, but removes any characters that come before a colon (:). If the ID number you need to extract might contain colon characters, use string as the content type instead to avoid data loss.

Shortcut

The shortcut key for the field. One or two keys allowed.

Split items

Select this checkbox if you want this field to be used as a delimiter between line items or rows in a table. Any line on which this field appears is considered to be a new line item or row in the table. Most commonly, this is used on Line Amount fields on Invoice line items.

Click Save to save your settings.
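To make the content type descriptions more concrete, the sketch below imitates the documented formatting for the date, phone, and id-no content types. It is an approximation for illustration only: the function names, the short list of date formats, and the choice to drop everything up to the last colon are assumptions, and the actual model handles far more input variations.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed sample of input formats; the real parser is not limited to these.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%m/%d/%Y", "%d %B %Y")


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as yyyy-mm-dd, or None if no assumed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def normalize_phone(raw: str) -> str:
    """Remove letters and parentheses, then replace spaces with dashes."""
    cleaned = re.sub(r"[A-Za-z()]", "", raw).strip()
    return re.sub(r"\s+", "-", cleaned)


def normalize_id_no(raw: str) -> str:
    """Drop everything up to and including the last colon (assumed behavior)."""
    return raw.rpartition(":")[2].strip()


if __name__ == "__main__":
    print(normalize_date("31.12.2024"))
    print(normalize_phone("(555) 123 4567"))
    print(normalize_id_no("Invoice No: INV-001"))
    # An ID that itself contains a colon loses data, which is why the
    # string content type is recommended in that case.
    print(normalize_id_no("AB:123"))
```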

Grouping table rows works differently from the AI Center Document Manager. Here, the rows are automatically grouped based on the state of the Split items checkbox on each column field. This is only relevant for tables with rows that contain multiple lines of text. In this case, you must check the Split items checkbox on any of the fields that have only one line for each table row. For instance, on an invoice, the line item amount would be a typical field on which you might check the Split items option. In the context of Forms AI, you would do the same thing on forms.
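The grouping rule described above can be sketched as follows. This is a simplified illustration, not the product's implementation: it assumes each text line of the table is represented as a dictionary mapping column field names to the labelled text on that line, and the function name `group_rows` is hypothetical.

```python
def group_rows(
    lines: list[dict[str, str]], split_fields: set[str]
) -> list[dict[str, str]]:
    """Group text lines into table rows.

    A line containing any field marked as Split items starts a new row;
    other lines are appended to the row currently being built.
    """
    rows: list[dict[str, str]] = []
    for line in lines:
        starts_new_row = any(field in split_fields for field in line)
        if starts_new_row or not rows:
            rows.append({})
        current = rows[-1]
        for field, text in line.items():
            # Multi-line cell values are joined with a space.
            current[field] = f"{current[field]} {text}" if field in current else text
    return rows


if __name__ == "__main__":
    table_lines = [
        {"description": "Widget, blue", "line_amount": "10.00"},
        {"description": "with mounting kit"},
        {"description": "Gadget", "line_amount": "25.50"},
    ]
    for row in group_rows(table_lines, split_fields={"line_amount"}):
        print(row)
```

In this example, line_amount appears once per row, so marking it as the split field lets the two-line description stay attached to the first row.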

Delete a column field

To delete a column field, follow these steps:

  1. Click the Edit field button corresponding to the column field you want to delete.
  2. Click the Delete button.
  3. Click OK.
  4. The column field and its associated labelled data are deleted.

Fields

Create a new field

  1. Click the icon on the right pane in the Fields section. The Create a new regular field window is displayed.
  2. Fill in a unique name for the field in the Enter unique field name field. The field does not accept uppercase letters. It can only contain lowercase letters, numbers, underscore _ and dash -.
  3. Click OK.

Delete all fields

  1. Click the icon in the table section at the top of the page to delete all created fields. Use this function to delete all fields, including Regular and Column fields, and all the labels on the documents in the current Document Type collection. This action cannot be undone.
  2. Click the Delete button from the Delete all fields dialog box.

Edit a field

Click the Edit field button. The available options for regular fields can be found in the table below.

Option

Description

Field name

The unique name for the field.

The field does not accept uppercase letters. It can only contain lowercase letters, numbers, underscore _ and dash -.

Content type

The content type of a field:

  • string: appropriate for company names or addresses, as well as payment terms, or for any other field where the RPA developer prefers to build the parsing or formatting logic manually, in the RPA workflow.
  • number: appropriate for amounts or quantities, with intelligent parsing of the decimal/thousands separators.
  • date: the model parses, formats and unifies the output in a yyyy-mm-dd format.
  • phone: appropriate for phone numbers. Formatting removes letters and parentheses, and replaces spaces with dashes.
  • id-no: appropriate for alphanumeric codes and ID numbers. It is similar to the string content type, but removes any characters that come before a colon (:). If the ID number you need to extract might contain colon characters, use string as the content type instead to avoid data loss.

Shortcut

The shortcut key for the field. One or two keys allowed.

Multi line

General

Click Save to save your settings.

Delete a regular field

To delete a regular field, follow these steps:

  1. Click the Edit field button corresponding to the regular field you want to delete.
  2. Click the Delete button.
  3. Click OK.
  4. The field and its associated labelled data are deleted.

Document view and labelling

For multi-page documents, you can scroll naturally through the pages as in any PDF viewer. To zoom in or out, use Ctrl + mouse scroll.

You can label documents by selecting the word boxes and assigning them to a field by pressing a key. You can also right-click the word box and verify the extracted information.

For more details on how to label documents, visit this page.

Checkboxes

Checkboxes that are available in Forms AI should be manually labelled for each field. Checkboxes from tables can also be labelled by using the Column Fields option. When a checkbox is labelled in Forms AI, both checked and unchecked boxes should be considered.

Here you can find more detailed information about how to label checkboxes.

You can choose to integrate your Document Understanding project into an RPA workflow by following the steps presented here.
