Activities - Present Validation Station

Description

Opens the Validation Station, which enables users to review and correct document classification and automatic data extraction results.

Project compatibility

Windows-Legacy | Windows

Using the Create Document Validation Action

Configuration

Properties panel

Common

DisplayName - The display name of the activity.

Input

AutomaticExtractionResults - The automatically generated extraction results, stored in an ExtractionResult proprietary variable. If a variable is added to this field, the Validation Station displays the results of the automatic extraction, enabling you to review and modify them. If left empty, the Validation Station contains no automatically extracted data. This field supports only ExtractionResult variables.
DocumentObjectModel - The Document Object Model against which you want to validate the document. This model is stored in a Document variable and can be retrieved from the Digitize Document activity. Visit Digitize Document to learn how to achieve this. This field supports only Document variables.
DocumentPath - The path to the document you want to validate. This field supports only strings and String variables.
Note: The supported file types for this property field are: .png, .gif, .jpe, .jpg, .jpeg, .tiff, .tif, .bmp, and .pdf.
DocumentText - The text of the document itself, stored in a String variable. This value can be retrieved from the Digitize Document activity. Visit Digitize Document to learn how to achieve this. This field supports only strings and String variables.
Taxonomy - The Taxonomy against which the document is to be processed, stored in a DocumentTaxonomy variable. This field supports only DocumentTaxonomy variables.

Misc

FieldsValidationConfidence % - Set the upper limit confidence score to be used when rendering the Validation Station.
Private - If selected, the values of variables and arguments are no longer logged at Verbose level.
ShowOnlyRelevantPageRange - If selected, only the page range mentioned in the extraction results is shown and the pages that are outside the range are hidden.

Output

ValidatedExtractionResults - The extraction results of the human validation process, stored in an ExtractionResult variable.
Important: In case you use an Intel Xe GPU and Validation Station is not displayed properly, we recommend updating the graphics driver to the latest version. Visit Intel support for more information.

Using the Validation Station

The Validation Station enables you to review and correct automatically extracted data from files, or manually process files for data extraction. The Validation Station, once opened, presents all extracted information along with the file being processed.

Figure 1. Overview of the Validation Station

The fields that are visible in the Validation Station are the ones defined in the Taxonomy used in your workflow.

Document View

The right area of the Validation Station contains an interactive version of the original document, in which text or document sections can be selected, and words can be clicked based on the output of the digitization process. This area also contains options for zooming in and out, selecting and rotating pages, searching through the document, or switching to text view.

Figure 2. Overview of the right area of the Validation Station that is interactive

The following table shows the options in the right part of the Validation Station screen, and what actions you can perform by using them.

Table 1. Available options in the Validation Station and their descriptions
Option	Description
	Displays all the available keyboard shortcuts supported by the Validation Station. - Keyboard shortcuts - Hides the extracted tokens - Switches the panel side from left to right
	Toggles between the text view and image view of the document. - Image view - Text only view
	Sets the selection mode while in text view: - Text - Tokens Note: Active only when the Text only view option is active.
	Sets the selection mode while in image view: - Tokens - Custom area - Choose after selection
	Rotates the current page clockwise. Note: The Rotate option is available only in Image view.
	Initiates a search between results in the document used by the Validation Station.
	Resets the zoom level on the document. This option is enabled only if the document was previously zoomed in or out.
	Zooms in on the document.
	Zooms out on the document. Note: To zoom in or out, you can also use the CTRL + scroll mouse wheel combination: CTRL+scroll up to view a specific section of the document; CTRL+scroll down to view a larger section of the document.

Interacting with the document in the Validation Station

This section describes how to use the available options for interacting with documents in the Validation Station.

To select a part of the document using the custom area option within the image view:

Ensure that Image view is selected.
Select Tokens and then select Custom area.
Select the desired area in your document.
Go to the document's more options on the left side, and choose if you want to Change reference or Remove reference.
Figure 3. Animated image showing how to perform selection in image view

Similarly to how you select a part of the document using the custom area option within the image view, you do the same within the text view. The only difference is that you ensure that Text view is selected.

Figure 4. Animated image showing how to use the custom area selection in text view

Keyboard shortcuts

You can use keyboard shortcuts to optimize the interaction with the Validation Station. We encourage you to use them as much as possible. You can view them in the Keyboard Shortcuts pop-up.

To start using keyboard shortcuts, go to More options, select Keyboard shortcuts, and then select Toggle keyboard shortcuts.

The following table shows all the available keyboard shortcuts and their corresponding descriptions.

Table 2. Validation Station keyboard shortcuts and their descriptions
Shortcut	Description
n	Moves to the next field
p	Moves to the previous field
f v	Marks a value as validated
f c	Changes the extracted value
f z	Reverts to the previous value
f a	Adds an additional value
f s	Toggles between suggestions
ESC	Exits edit mode (for Fields and Tables) Collapses the derived parts (for Fields) Deselects a line (for Table Selection) Exit table selection mode (for Table Selection) Do not save unconfirmed fields
DEL	Removes the selected value (for Fields) Removes the selected line (for Table Selection)
CTRL SHIFT ENTER	Save unconfirmed fields
CTRL SHIFT S	Save data as draft
Alt p	Toggle PDF Viewer focus
d +	Zooms in
d -	Zooms out
d 0	Resets zoom
d r	Rotates the page clockwise
d t	Toggles the text mode
/	Initiates a search
d s	Changes selection mode
d a	Clears the drawn anchor selection
d h	Toggles the extracted tokens
s →	Move selected line right
s ←	Move selected line left
s ↑	Move selected line up
s ↓	Move selected line down
s d	Duplicate the selected line
s v	Vertical line
s f	Horizontal line
s a	Auto detect by mouse movement
s t	Hand tool - move and delete lines
?	This screen
!	Report document as exception
CTRL ENTER	Save data
CTRL DEL	Discard all current changes
Right arrow →	Moves to the right cell
Left arrow ←	Moves to the left cell
Upward arrow ↑	Moves to the top cell
Downward arrow ↓	Moves to the bottom cell
t v	Marks a cell as validated
t c	Changes the extracted cell
t z	Reverts to the previous cell value
t d	Discards changes in tables
t DEL	Removes the selected cell
t ESC	Close the table editor
1 2 3 4 5 6 7 8 9 q w e r y a g h j k l z x c v m @ # $ % ^ & * ( ) [ ] {	Use the key associated with each field to assign values to them (letters are case insensitive). Use the same key to focus on a field if no selections are made. The o key is reserved for the Document Type field assignment.

Figure 5. Animated image showing the navigation to the Keyboard shortcuts pop-up

Select More Options in the right area of the Validation Station, and then select Hide extracted tokens to have a clean view panel and hide the highlights of the extracted tokens.

Figure 6. Animated image showing the selection of the Hide extracted tokens option

Data Extraction Section

The left area displays the document type you have selected for the current validation and enables you to select the state of each element and link it to its corresponding word or area in the document.

The confidence level of the extracted information can be displayed by OCR or Extraction.

The OCR Confidence level is given by the OCR engine used for extraction in the workflow. If the OCR engine does not report confidence levels, N/A is displayed instead of percentages.

The Extraction Confidence level is given by the extractor used in the workflow.

The confidence score should be used only for guidance purposes. You can increase the confidence score by manually validating the data.

Another way of visualizing confidence levels is by filtering them depending on a threshold set by you. To do this, select Filter fields using the selected confidence level, and then adjust the confidence level based on which you want to filter.

Figure 7. Filtering fields based on confidence level
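The filtering described above can be illustrated with a minimal Python sketch. It is not part of the product: the Field class and the fields_needing_review function are hypothetical names, and the sketch assumes the filter keeps fields whose confidence is at or below the chosen threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Field:
    """Hypothetical stand-in for an extracted field shown in the left area."""
    name: str
    confidence: float  # percentage between 0 and 100


def fields_needing_review(fields: List[Field], threshold: float) -> List[Field]:
    """Keep the fields whose confidence is at or below the threshold.

    Assumption: the filter surfaces low-confidence fields; the source
    does not state which side of the threshold is displayed.
    """
    return [f for f in fields if f.confidence <= threshold]


fields = [
    Field("Invoice Number", 98.0),
    Field("Total", 62.5),
    Field("Due Date", 41.0),
]
low = fields_needing_review(fields, threshold=70)
print([f.name for f in low])
```

Raising the threshold in this sketch widens the set of fields flagged for review.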

The OCR confidence level changes individually, for each field, if you alter the reference of a certain field.

You can use the field shortcuts to assign values to a field or to toggle between fields. Once a value is assigned to a field, it is highlighted by the color of the selected field.

For the assigned value, there is a document crop displayed in the table field. This helps with better locating the area from which the value was extracted and it also serves as a means of double-checking the value by comparing it with the document crop.

Note:

The Document Type field is a special field that you can act upon in the following scenarios:

If the extraction results contain a document type, and that document type is correct, then no action is required.
If the extraction results contain a document type, and that document type is incorrect, then you have to select the correct one and provide evidence for it from within the document.
If no extraction result is provided and only one document type exists in the taxonomy, then that document type is pre-selected but needs evidencing.
If no extraction result is provided and there are multiple document types in the taxonomy, then you have to manually select the desired document type and provide evidence for it.
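The four Document Type scenarios above amount to a small decision rule. The following Python sketch is only an illustration under stated assumptions: document_type_action and its parameters are hypothetical names, and the returned strings paraphrase the scenarios rather than reproduce any product message.

```python
from typing import Optional, Sequence


def document_type_action(
    extracted_type: Optional[str],
    extracted_is_correct: bool,
    taxonomy_types: Sequence[str],
) -> str:
    """Return the action a validator takes for the Document Type field."""
    if extracted_type is not None:
        # The extraction results contain a document type.
        if extracted_is_correct:
            return "no action required"
        return "select the correct document type and provide evidence"
    # No extraction result was provided.
    if len(taxonomy_types) == 1:
        return f"'{taxonomy_types[0]}' is pre-selected; provide evidence"
    # Multiple document types in the taxonomy (an empty taxonomy is not
    # covered by the source and falls through to this branch here).
    return "select a document type manually and provide evidence"


print(document_type_action(None, False, ["Invoice"]))
```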

Automatically extracted fields have a confidence level percentage that is also color-coded, meant to help you detect fields that need assistance.

There are four levels of confidence:

Below 50%, color coded in red.
Between 50% and 85%, color coded in yellow.
Between 86% and 99%, color coded in light green.
100%, color coded in green.
To increase the confidence level, you can validate the information by manually selecting it. After you manually select a part of the document, select Options for an extracted field, and then select Change extracted value.

Figure 8. The action of manually changing the extracted field value
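The four confidence levels listed earlier can be expressed as a simple mapping. This Python sketch is illustrative only: confidence_color is a hypothetical name, and because the documented bands leave values between 85% and 86% unspecified, the sketch assumes anything below 86% (and at least 50%) is yellow.

```python
def confidence_color(confidence: float) -> str:
    """Map a confidence percentage (0-100) to its documented color band."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence < 50:
        return "red"
    if confidence < 86:  # assumption: 85.x% is treated as yellow
        return "yellow"
    if confidence < 100:
        return "light green"
    return "green"


for value in (42, 50, 85, 86, 99, 100):
    print(value, confidence_color(value))
```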

All fields that contain information have an Options drop-down menu. Selecting it displays a list of editing options.

The Options menu includes the following options:

Change extracted value - Changes the automatically extracted value with a manually selected one. This field is active only when one or multiple values are selected from the document and are different from the original value.
Revert to previous value - Resets the field's value to its last state. This option is active only when a value was previously altered or deleted.
Mark as missing - Marks a field as missing if the information is not available in the document.

Selection Modes

There are several ways of selecting text while using the Validation Station wizard. Using them allows you to quickly navigate through the entire document and easily select the desired words for validating a field.

Here is a list of all the available selection options:

Select one word - Select the desired word.
Select consecutive words - Select the first word, then SHIFT+select the last word in the range.
Select multiple disparate words - Select the first word, then CTRL+select the rest of the desired words.
Combine multiple selections - Select the first word, then SHIFT+select the last word from a range for the first selection, then hold CTRL+select and SHIFT+select to add another range, until you've completed your selections.
Area selection - Make a selection and choose the selection type:
- Tokens - Selects all words in the selected area.
- Custom area - Captures only the area and not the words in it.
- Choose after selection - Selects the entire area, with separate words, leaving you to decide the type of selection.

Other Options

Notes - This is only displayed if Validator notes for that certain field were enabled in Taxonomy Manager. Depending on how it was configured, it can be the following:
- A text field where you can add notes related to that field, such as why a certain value was chosen or if any extra checks should be performed.
- A text that cannot be edited.
- Several options in the form of radio buttons from which you can select one, depending on the situation.
Tip: Check the ExtractionResult Class page from the UiPath.DocumentProcessing.Contracts section for more information on the two methods related to validator notes, GetFieldValidatorNotes(<fieldId>) and SetFieldValidatorNotes(<fieldId>, <validatorNote>).

Note: To check which releases will include validator notes in Action Center, refer to the release notes for version 6.19.0.
Edit the field's value - Changes the content of a field by selecting that field, selecting the value, and adding the desired input.
The Undo option - Reverts the field to its prior state. Selecting this one time takes you one step back, meaning that if you had several changes on that field, multiple clicks might be required for returning to a certain value. This field is active only when a value was previously modified or deleted.
The Add option - Adds a value to the field by using the Custom area or Tokens selection. The option becomes available when a selection is made in the document and differs from the one in the field. The selection can be made for multi-value fields at all times, and for single-value fields only if no value is present for that field. First select the part of the document and then the Add option.
The Validation option - Confirms the information included into the field. Once confirmed, a Validated tag is added to the field.
Once a field is manually validated, you can still check the original value of that field by selecting the Extraction confidence level. This functionality is available only for Extraction confidence level.

Figure 9. Selecting the Extraction confidence level

The interface of the Validation Station is interactive, meaning that when a field is selected on the left side, the right side moves the focus on it by highlighting it.

The Add Extra option - Enables you to select and add additional values from the document to a specific field.
The Add option - Enables you to add a value to a field without requiring reference from the document.

Table Fields - Cell Level Processing

The extraction confidence level is available for each extracted cell, for both OCR and Extractor used in the workflow. Toggle between them from the upper left side of the Validation Station.

The following table shows the options available for a table field, and their descriptions.

Table 3. Options available for a table field and their descriptions
	Description
	Indicates the table's area in the document.
	Creates a new table and opens the table extraction tools.
	Opens a drop-down list with the following options: Revert to previous value - Returns to the previous value. It is active only if the table's data has been previously altered. Remove value - Marks the table field as missing.
	When selected, confirms the data accuracy.
	Enabled when the field's value is missing.
	Enabled when the initial extracted value had been altered or deleted. When selected, it restores the previous value.
	Marks a field as missing.

All table fields have a dropdown menu with the following options:

Change extracted value - Selects a new value from the document and replaces the initial one. This field becomes available only when the newly selected value is different from the original selection.
Revert to previous value - Replaces the actual value with the previous one. This field is active only when altering the initial value.
Mark as missing - Marks a field as missing if the info is not available in the document.

The following table shows the dropdown menu options available at the end of each row.

Table 4. Dropdown menu options and their descriptions
	Description
	Transforms the selected row into the table's header. The row can no longer be seen in the table's body, but the information can be viewed any time a header's field is selected.
	Highlights the selected row.
	Extracts a new row and places it below the selected one. It enables the custom selection tool and offers you the possibility to manually select the new row. Once the area of the new rows is selected, you must define each column. Do this by using the available options presented in the table below.
	Inserts a new empty row above the selected one. The row is automatically added to the table, except that all fields are marked as Not extracted. You have to manually select the value from the document and add it by using the Add extracted value option.
	Applies the same principle as for the Insert row above option, the only difference being that the row is inserted below and not above the actual selection.
	Deletes the row.

Once a field from the table is edited or reviewed, the confirmation box changes its appearance. To confirm the data you must check the box.

Note: The overall confidence of the table is the lowest confidence from the cells within.
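As a minimal illustration of the note above, the overall table confidence can be computed as the minimum over all cell confidences. The function name table_confidence and the nested-list representation of a table are assumptions made for this sketch.

```python
from typing import List


def table_confidence(cell_confidences: List[List[float]]) -> float:
    """Return the lowest confidence among all cells of a table.

    The table is represented as a list of rows, each a list of
    per-cell confidence percentages (a hypothetical representation).
    """
    flat = [value for row in cell_confidences for value in row]
    if not flat:
        raise ValueError("the table has no cells")
    return min(flat)


rows = [
    [99, 87],
    [42, 100],
]
print(table_confidence(rows))
```

A single low-confidence cell is therefore enough to lower the confidence reported for the whole table.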

Table Fields - Table Level Processing

A table can be manually selected and defined, straight from the Validation Station wizard. If no table is selected, or if you are not happy with the automatic selection, then you can use the options available on the dropdown menu found at the end of the first row.

Note:

If the value of a table cell is not extracted, you can manually add a value to that table cell by choosing the Custom area option and marking the table cell area (see Selecting the Extraction confidence level).
Both the Extract new table and Extract Rows from here options use the same functionality and enable you to define new values.

A few more options are available in the table's header. You can use them to extract a new table, highlight the existing table or a single row from it, or delete the entire table. The following table shows the available options and their descriptions:

Table 5. Table header options and their descriptions
	Description
	Replaces the existing table with the new selection. You need to mark all rows and columns. Keep in mind that the first row becomes the header of the table.
	Highlights the entire extracted table area.
	Highlights the selected row in the table.
	Deletes the existing table.

Define the table header while using the Extract new table option by enabling the Extract header function. You can also define the header by selecting the information from the document, or by transforming one of the existing rows into the table's header.

The following table shows the available functions of the Extract new table option and their descriptions.

Table 6. The Extract new table options and their descriptions
	Function
	Removes all lines visible in the selection.
	Removes only the selected line from the selection.
	Enables horizontal lines in the selection.
	Enables vertical lines in the selection.
	Enables you to adjust the line's direction using the mouse.
	Enables the selection, rearrangement, and removal of lines.

You can select Save new table to automatically confirm all fields or you can deny the operation by selecting Close, return to the table selection, and manually confirm each field.

Note: If you save a table that contains empty or non-extracted fields, those fields are automatically marked as missing.

Value Formatting and Language Setting

Number, date, address, and name fields allow you to review and correct formatted (parsed) parts of a specific value. The following table shows the editable parts for every field type.

Table 7. Editable parts for every field type
Field type	Editable Formatted Parts
Number	Value (up to eight decimals)
Date	Day Month Year
Address	Address Line 1 Address Line 2 Address Line 3 City State / County / Province Country Zip Postal Code
Name	First Name Middle Name Last Name

When you extract or correct a value for a field of any of these types, the Validation Station tries to automatically parse the value into its formatted components.

The language setting displays the detected prevalent language within the document, as identified during the digitization process. This enables the Validation Station to parse numbers and dates more accurately, according to the language of the document. You can change the language setting by using the drop-down menu.

By doing so, when you manually extract or change a date or number value, the Validation Station first tries to format the selected string according to the selected language, and falls back to English (US) if that parsing is not successful. The formatting function only applies to the editable formatted parts of a value, not to the original string value.

To ensure the best automatic formatted value detection, we recommend you check the detected language and correct it if necessary.
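The language-first parsing with an English (US) fallback described above can be sketched in Python as follows. This is not the Validation Station's implementation: the DATE_FORMATS table, the language codes, and parse_date_parts are hypothetical and cover only a few illustrative date formats.

```python
from datetime import datetime
from typing import Dict, List, Optional

# Hypothetical per-language date formats, for illustration only.
DATE_FORMATS: Dict[str, List[str]] = {
    "de-DE": ["%d.%m.%Y"],
    "fr-FR": ["%d/%m/%Y"],
    "en-US": ["%m/%d/%Y"],
}
FALLBACK_LANGUAGE = "en-US"


def parse_date_parts(text: str, language: str) -> Optional[Dict[str, int]]:
    """Parse a date string into Day, Month, and Year parts.

    The selected language is tried first; if none of its formats match,
    the English (US) formats are tried. The original string is left
    untouched; only the formatted parts are returned.
    """
    for lang in (language, FALLBACK_LANGUAGE):
        for fmt in DATE_FORMATS.get(lang, []):
            try:
                parsed = datetime.strptime(text.strip(), fmt)
            except ValueError:
                continue  # this format did not match; try the next one
            return {"day": parsed.day, "month": parsed.month, "year": parsed.year}
    return None  # parsing failed for both the selected and fallback language


print(parse_date_parts("31.12.2023", "de-DE"))
print(parse_date_parts("12/31/2023", "de-DE"))
```

The second call shows why the detected language matters: the German format does not match, so the value is only parsed through the English (US) fallback.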

Report Exceptions

You have the option to report a document as an exception. If this situation occurs, the Present Validation Station activity throws an exception that should be caught by the RPA workflow and treated separately. The exception message displays the Reason for Exception filled in by the user.

Select Report Exception, then fill in the Reason field, and lastly, select Confirm, to save the exception.
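The catch-and-handle pattern described in this section can be outlined in Python. Everything in this sketch is hypothetical: DocumentReportedAsException and present_validation_station are stand-ins invented for illustration and do not correspond to real UiPath types or calls. In an actual workflow, the equivalent would typically be a Try Catch around the activity.

```python
from typing import Optional


class DocumentReportedAsException(Exception):
    """Hypothetical exception carrying the user's Reason for Exception."""


def present_validation_station(document_path: str, report_reason: Optional[str] = None) -> dict:
    """Hypothetical stand-in for the activity.

    Raises if the user reported the document as an exception;
    otherwise returns placeholder validated results.
    """
    if report_reason is not None:
        raise DocumentReportedAsException(report_reason)
    return {"document": document_path, "validated": True}


def process_document(document_path: str, report_reason: Optional[str] = None) -> dict:
    """Run validation and route reported documents to separate handling."""
    try:
        results = present_validation_station(document_path, report_reason)
    except DocumentReportedAsException as exc:
        # The exception message holds the reason entered by the user.
        return {"status": "exception", "reason": str(exc)}
    return {"status": "validated", "results": results}


print(process_document("invoice.pdf", report_reason="Wrong document uploaded"))
```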

Data confirmation and validation

You can confirm all fields manually or automatically. For manual confirmation, select the check box of each field. Any fields that you do not confirm manually are confirmed automatically when you select the Save button and then confirm the action by selecting the Continue & save button.

The following table shows the options available on the bottom side of the Validation Station, for data confirmation and validation, and their functions.

Table 8. Data validation and confirmation options
	Function
	Saves the confirmed fields.
	Saves and closes the table selection area. The button is enabled only when the table field is active.
	Reports the document as being an exception.
	Enabled only when not all values are manually confirmed. By selecting it, all data is automatically confirmed and saved.
	Enabled when no change has been done to the table.
	Enabled after a change has been done to the table.
	Enabled after the user clicks Dismiss. Discards all changes done to the table.

Visit Validation station for more information about how to use and customize the Validation Station.

Document Understanding Integration

The Present Validation Station activity is part of the Document Understanding solutions. Visit the Document Understanding Guide for more information.
