Document Understanding Activities
Present Validation Station
UiPath.IntelligentOCR.Activities.ValidationStation.PresentValidationStation
Opens the Validation Station, which enables users to review and correct document classification and automatic data extraction results.
Properties panel
Common
- DisplayName - The display name of the activity.
Input
- AutomaticExtractionResults - The automatically generated extraction results, stored in an ExtractionResult proprietary variable. If a variable is added to this field, the Validation Station displays the results of the automatic extraction, enabling you to review and modify them. If left empty, the Validation Station contains no automatically extracted data. This field supports only ExtractionResult variables.
- DocumentObjectModel - The Document Object Model you want to use to validate the document against. This model is stored in a Document variable and can be retrieved from the Digitize Document activity. Visit Digitize Document to learn how to achieve this. This field supports only Document variables.
- DocumentPath - The path to the document you want to validate. This field supports only strings and String variables.
Note: The supported file types for this property field are: .png, .gif, .jpe, .jpg, .jpeg, .tiff, .tif, .bmp, and .pdf.
- DocumentText - The text of the document itself, stored in a String variable. This value can be retrieved from the Digitize Document activity. Visit Digitize Document to learn how to achieve this. This field supports only strings and String variables.
- Taxonomy - The Taxonomy against which the document is to be processed, stored in a DocumentTaxonomy variable. This field supports only DocumentTaxonomy variables.
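The supported file types noted for DocumentPath can be checked before a document is handed to the activity. The following is a minimal Python sketch of such a pre-check, not part of the activity itself. The function name `is_supported_document` is hypothetical, and treating extensions as case-insensitive is an assumption.

```python
from pathlib import Path

# Extensions listed in the note for the DocumentPath property.
SUPPORTED_EXTENSIONS = {
    ".png", ".gif", ".jpe", ".jpg", ".jpeg", ".tiff", ".tif", ".bmp", ".pdf",
}


def is_supported_document(path: str) -> bool:
    """Return True if the file extension is in the supported list.

    Assumption: the comparison ignores letter case.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_document("invoice.PDF")` is accepted under the case-insensitive assumption, while `is_supported_document("notes.docx")` is rejected.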
Misc
- FieldsValidationConfidence % - Set the upper limit confidence score to be used when rendering the Validation Station.
- Private - If selected, the values of variables and arguments are no longer logged at Verbose level.
- ShowOnlyRelevantPageRange - If selected, only the page range mentioned in the extraction results is shown and the pages that are outside the range are hidden.
Output
- ValidatedExtractionResults - The extraction results of the human validation process, stored in an ExtractionResult variable.
Important: If you use an Intel Xe GPU and the Validation Station is not displayed properly, we recommend updating the graphics driver to the latest version. Visit Intel support for more information.
The Validation Station enables you to review and correct automatically extracted data from files, or manually process files for data extraction. The Validation Station, once opened, presents all extracted information along with the file being processed.
The fields that are visible in the Validation Station are the ones defined in the Taxonomy used in your workflow.
The right area of the Validation Station contains an interactive version of the original document, in which text or document sections can be selected, and words can be clicked based on the output of the digitization process. This area also contains options for zooming in and out, selecting and rotating pages, searching through the document, or switching to text view.
The following table shows the options in the right part of the Validation Station screen, and what actions you can perform by using them.
Option | Description |
---|---|
 | Displays all the available keyboard shortcuts supported by the Validation Station. |
 | Toggles between the text view and image view of the document. |
Text. Note: Active only when the Text only view option is active. | Sets the selection mode while in text view: |
 | Sets the selection mode while in image view: |
 | Rotates the current page clockwise. Note: The Rotate option is available only in Image view. |
 | Initiates a search between results in the document used by the Validation Station. |
 | Resets the zoom level on the document. This option is enabled only if the document was previously zoomed in or out. |
 | Zooms in on the document. |
 | Zooms out on the document. Note: To zoom in or out, you can also use the CTRL + scroll mouse wheel combination: CTRL+scroll up to view a specific section of the document; CTRL+scroll down to view a larger section of the document. |
This section describes how to use the available options for interacting with documents in the Validation Station.
- Ensure that Image view is selected.
- Select Tokens and then select Custom area.
- Select the desired area in your document.
- Go to the document's more options on the left side, and choose whether you want to Change reference or Remove reference.
Figure 3. Animated image showing how to perform selection in image view
Selecting a part of the document in text view works the same way as using the Custom area option in image view. The only difference is that Text view must be selected.
You can use keyboard shortcuts to optimize the interaction with the Validation Station. We encourage you to use them as much as possible. You can view them in the Keyboard Shortcuts pop-up.
To start using keyboard shortcuts, go to More options, select Keyboard shortcuts, and then select Toggle keyboard shortcuts.
The following table shows all the available keyboard shortcuts and their corresponding descriptions.
Shortcut | Description |
---|---|
n | Moves to the next field |
p | Moves to the previous field |
f v | Marks a value as validated |
f c | Changes the extracted value |
f z | Reverts to the previous value |
f a | Adds an additional value |
f s | Toggles between suggestions |
ESC | Exits edit mode (for Fields and Tables) |
DEL | |
CTRL SHIFT ENTER | Save unconfirmed fields |
CTRL SHIFT S | Save data as draft |
Alt p | Toggle PDF Viewer focus |
d + | Zooms in |
d - | Zooms out |
d 0 | Resets zoom |
d r | Rotates the page clockwise |
d t | Toggles the text mode |
/ | Initiates a search |
d s | Changes selection mode |
d a | Clears the drawn anchor selection |
d h | Toggles the extracted tokens |
s → | Move selected line right |
s ← | Move selected line left |
s ↑ | Move selected line up |
s ↓ | Move selected line down |
s d | Duplicate the selected line |
s v | Vertical line |
s f | Horizontal line |
s a | Auto detect by mouse movement |
s t | Hand tool - move and delete lines |
? | This screen |
! | Report document as exception |
CTRL ENTER | Save data |
CTRL DEL | Discard all current changes |
Right arrow → | Moves to the right cell |
Left arrow ← | Moves to the left cell |
Upward arrow ↑ | Moves to the top cell |
Downward arrow ↓ | Moves to the bottom cell |
t v | Marks a cell as validated |
t c | Changes the extracted cell |
t z | Reverts to the previous cell value |
t d | Discards changes in tables |
t DEL | Removes the selected cell |
t ESC | Close the table editor |
1 2 3 4 5 6 7 8 9 q w e r y a g h j k l z x c v m @ # $ % ^ & * ( ) [ ] { | |
Select More Options in the right area of the Validation Station, and then select Hide extracted tokens to have a clean view panel and hide the highlights of the extracted tokens.
The left area displays the document type you have selected for the current validation and enables you to select the state of each element and link it to its corresponding word or area in the document.
The confidence level of the extracted information can be displayed by OCR or Extraction.
The OCR Confidence level is given by the OCR engine used for extraction in the workflow. If the used OCR doesn't report any confidence levels, then N/A is displayed instead of percentages.
The Extraction Confidence level is given by the extractor used in the workflow.
The confidence score should be used only for guidance purposes. You can increase the confidence score by manually validating the data.
Another way of visualizing confidence levels is by filtering them depending on a threshold set by you. To do this, select Filter fields using the selected confidence level, and then adjust the confidence level based on which you want to filter.
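The threshold filter can be pictured as a simple comparison per field. The Python sketch below is illustrative only: the `FieldResult` type and `fields_needing_review` function are hypothetical, and it assumes that fields below the threshold, or with no reported confidence (shown as N/A), are the ones you want to look at.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FieldResult:
    name: str
    # Confidence as a percentage (0-100), or None when the engine
    # reports no confidence (displayed as N/A).
    confidence: Optional[float]


def fields_needing_review(fields: List[FieldResult], threshold: float) -> List[str]:
    """Return names of fields whose confidence is missing or below the threshold."""
    return [
        f.name
        for f in fields
        if f.confidence is None or f.confidence < threshold
    ]
```

With a threshold of 80, a field at 95% would be left out of the result, while a field at 60% or with no reported confidence would be listed.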
The OCR confidence level changes individually, for each field, if you alter the reference of a certain field.
You can use the field shortcuts to assign values to a field or to toggle between fields. Once a value is assigned to a field, it is highlighted by the color of the selected field.
For the assigned value, there is a document crop displayed in the table field. This helps with better locating the area from which the value was extracted and it also serves as a means of double-checking the value by comparing it with the document crop.
The Document Type field is a special field that you can act upon in the following scenarios:
- If the extraction results contain a document type, and that document type is correct, then no action is required.
- If the extraction results contain a document type, and that document type is incorrect, then you have to select the correct one and provide evidence for it from within the document.
- If no extraction result is provided and only one document type exists in the taxonomy, then that document type is pre-selected but needs evidencing.
- If no extraction result is provided and there are multiple document types in the taxonomy, then you have to manually select the desired document type and provide evidence for it.
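The four scenarios above amount to a small decision table. The following Python sketch restates them under stated assumptions: the function name, parameters, and returned strings are hypothetical, and whether an extracted document type is correct is a judgment made by the reviewer, passed in here as a flag.

```python
from typing import List, Optional


def document_type_action(
    extracted_type: Optional[str],
    taxonomy_types: List[str],
    reviewer_accepts: bool = True,
) -> str:
    """Return the action expected from the reviewer for the Document Type field."""
    if extracted_type is not None:
        if reviewer_accepts:
            return "no action required"
        return "select the correct type and provide evidence"
    # No document type in the extraction results.
    if len(taxonomy_types) == 1:
        return "type is pre-selected; provide evidence"
    return "select the type manually and provide evidence"
```

For instance, `document_type_action(None, ["Invoice"])` corresponds to the third scenario, where the single taxonomy document type is pre-selected but still needs evidence.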
Automatically extracted fields have a confidence level percentage that is also color-coded, meant to help you detect fields that need assistance.
There are four levels of confidence:
- Below 50%, color coded in red.
- Between 50% and 85%, color coded in yellow.
- Between 86% and 99%, color coded in light green.
- 100%, color coded in green.
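The color bands can be summarized as a lookup on the displayed percentage. This Python sketch is only an illustration of the four levels listed above. The function name is hypothetical, and it assumes the confidence is a whole-number percentage, since the list does not say how values between 85% and 86% are treated.

```python
def confidence_color(percent: int) -> str:
    """Map a whole-number confidence percentage to its color code."""
    if not 0 <= percent <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if percent < 50:
        return "red"
    if percent <= 85:
        return "yellow"
    if percent <= 99:
        return "light green"
    return "green"
```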
To increase the confidence level, you can validate the information by manually selecting it. After you manually select a part of the document, select Options for an extracted field, and then select Change extracted value.
Figure 8. The action of manually changing the extracted field value
All fields that contain information have an Options dropdown menu that can be accessed by selecting it. A drop-down list becomes visible, displaying multiple editing options.
The Options menu includes the following options:
- Change extracted value - Changes the automatically extracted value with a manually selected one. This field is active only when one or multiple values are selected from the document and are different from the original value.
- Revert to previous value - Resets the field's value to its last state. This option is active only when a value was previously altered or deleted.
- Mark as missing - Marks a field as missing if the information is not available in the document.
Selection Modes
There are several ways of selecting text while using the Validation Station wizard. Using them allows you to quickly navigate through the entire document and easily select the desired words for validating a field.
Here is a list of all the available selection options:
- Select one word - Select the desired word.
- Select consecutive words - Select the first word, then SHIFT+select the last word from range.
- Select multiple disparate words - Select the first word, then CTRL+select the rest of the desired words.
- Combine multiple selections - Select the first word, then SHIFT+select the last word from a range for the first selection, then hold CTRL+select and SHIFT+select to add another range, until you've completed your selections.
- Area selection - Make a selection and choose the selection type:
- Tokens - Selects all words in the selected area.
- Custom area - Captures only the area and not the words in it.
- Choose after selection - Selects the entire area, with separate words, leaving you to decide the type of selection.
Other Options
- Notes - This is only displayed if Validator notes for that certain field were enabled in Taxonomy Manager. Depending on how it was configured, it can be the following:
- A text field where you can add notes related to that field, such as why a certain value was chosen or if any extra checks should be performed.
- A text that cannot be edited.
- Several options in the form of radio buttons from which you can select one, depending on the situation.
Tip: Check the ExtractionResult Class page from the UiPath.DocumentProcessing.Contracts section for more information on the two methods related to validator notes, GetFieldValidatorNotes(<fieldId>) and SetFieldValidatorNotes(<fieldId>, <validatorNote>).
Note: To check which releases will include validator notes in Action Center, refer to the release notes for version 6.19.0.
- Edit the field's value - Changes the content of a field by selecting that field, selecting the value, and adding the desired input.
- The Undo option - Reverts the field to its prior state. Selecting this one time takes you one step back, meaning that if you had several changes on that field, multiple clicks might be required for returning to a certain value. This field is active only when a value was previously modified or deleted.
- The Add option - Adds a value to the field by using the Custom area or Tokens selection. The option becomes available when a selection is made in the document and differs from the one in the field. The selection can be made for multi-value fields at all times, and for single-value fields only if no value is present for that field. First select the part of the document and then the Add option.
- The Validation option - Confirms the information included in the field. Once confirmed, a Validated tag is added to the field.
Once a field is manually validated, you can still check the original value of that field by selecting the Extraction confidence level. This functionality is available only for Extraction confidence level.
Figure 9. Selecting the Extraction confidence level
The interface of the Validation Station is interactive, meaning that when a field is selected on the left side, the right side moves the focus on it by highlighting it.
- The Add Extra option - Enables you to select and add additional values from the document to a specific field.
- The Add option - Enables you to add a value to a field without requiring reference from the document.
Table Fields - Cell Level Processing
The extraction confidence level is available for each extracted cell, for both OCR and Extractor used in the workflow. Toggle between them from the upper left side of the Validation Station.
The following table shows the options available for a table field, and their descriptions.
Option | Description |
---|---|
 | Indicates the table's area in the document. |
 | Creates a new table and opens the table extraction tools. |
 | Opens a drop-down list with the following options: |
 | When selected, confirms the data accuracy. |
 | Enabled when the field's value is missing. |
 | Enabled when the initial extracted value had been altered or deleted. When selected, it restores the previous value. |
 | Marks a field as missing. |
All table fields have a dropdown menu with the following options:
- Change extracted value - Selects a new value from the document and replaces the initial one. This field becomes available only when the newly selected value is different from the original selection.
- Revert to previous value - Replaces the actual value with the previous one. This field is active only when altering the initial value.
- Mark as missing - Marks a field as missing if the info is not available in the document.
The following table shows the dropdown menu options available at the end of each row.
Option | Description |
---|---|
 | Transforms the selected row into the table's header. The row can no longer be seen in the table's body, but the information can be visualized any time a header's field is selected. |
 | Highlights the selected row. |
 | Extracts a new row and places it below the selected one. It enables the custom selection tool and offers you the possibility to manually select the new row. Once the area of the new rows is selected, you must define each column. Do this by using the available options presented in the table below. |
 | Inserts a new empty row above the selected one. The row is automatically added to the table, except that all fields are marked as Not extracted. You have to manually select the value from the document and add it by using the Add extracted value option. |
 | Applies the same principle as for the Insert row above option, the only difference being that the row is inserted below and not above the actual selection. |
 | Deletes the row. |
Once a field from the table is edited or reviewed, the confirmation box changes its appearance. To confirm the data you must check the box.
Table Fields - Table Level Processing
A table can be manually selected and defined, straight from the Validation Station wizard. If no table is selected, or if you are not happy with the automatic selection, then you can use the options available on the dropdown menu found at the end of the first row.
- If the value of a table cell is not extracted, you can manually add a value to that table cell by choosing the Custom area option and marking the table cell area.
- The Extract new table and Extract Rows from here options use the same functionality and enable you to define new values.
A few more options are available in the table's header. You can use them to extract a new table, to highlight the existing table or just a row from it, or to delete the entire table. The following table shows the available options and their descriptions:
Option | Description |
---|---|
 | Replaces the existing table with the new selection. You need to mark all rows and columns. Keep in mind that the first row becomes the header of the table. |
 | Highlights the entire extracted table area. |
 | Highlights the selected row in the table. |
 | Deletes the existing table. |
Define the table header while using the Extract new table option by enabling the Extract header function. You can also define the header by selecting the information from the document, or by transforming one of the existing rows into the table's header.
The following table shows the available functions of the Extract new table option and their descriptions.
Option | Function |
---|---|
 | Removes all lines visible in the selection. |
 | Removes only the selected line from the selection. |
 | Enables horizontal lines in the selection. |
 | Enables vertical lines in the selection. |
 | Enables you to adjust the line's direction using the mouse. |
 | Enables the selection, rearrangement, and removal of lines. |
You can select Save new table to automatically confirm all fields, or you can cancel the operation by selecting Close, return to the table selection, and manually confirm each field.
Value Formatting and Language Setting
Number, date, and address fields allow you to review and correct formatted (parsed) parts of a specific value. The following table shows the editable parts for every field type.
Field Type | Editable Formatted Parts |
---|---|
Number | Value (up to eight decimals) |
Date | |
Address | |
Name | |
When you extract or correct a value for a field of any of these types, the Validation Station tries to automatically parse the value into its formatted components.
The language setting displays the detected prevalent language within the document, as identified during the digitization process. This enables the Validation Station to parse numbers and dates more accurately, according to the language of the document. You can change the language setting by using the drop-down menu.
As a result, when you manually extract or change a date or number value, the Validation Station first tries to format the selected string according to the selected language, and falls back to English (US) if that parsing is not successful. The formatting function applies only to the editable formatted parts of a value, not to the original string value.
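The try-selected-language-then-English-US order can be illustrated with a toy number parser. This Python sketch is not the Validation Station's parser: the `SEPARATORS` table, the two language codes, and both function names are hypothetical, and only thousands and decimal separators are modeled.

```python
import re
from typing import Optional

# Hypothetical, simplified conventions: (thousands separator, decimal separator).
SEPARATORS = {
    "en-US": (",", "."),
    "de-DE": (".", ","),
}


def _parse_with(text: str, thousands: str, decimal: str) -> Optional[float]:
    """Parse text strictly under one separator convention, or return None."""
    t, d = re.escape(thousands), re.escape(decimal)
    # Either grouped digits (1.234.567) or plain digits, then optional decimals.
    pattern = rf"[+-]?(?:\d{{1,3}}(?:{t}\d{{3}})+|\d+)(?:{d}\d+)?"
    if re.fullmatch(pattern, text) is None:
        return None
    return float(text.replace(thousands, "").replace(decimal, "."))


def parse_number(text: str, language: str) -> Optional[float]:
    """Try the selected language first, then fall back to en-US."""
    for lang in (language, "en-US"):
        separators = SEPARATORS.get(lang)
        if separators is None:
            continue
        value = _parse_with(text.strip(), *separators)
        if value is not None:
            return value
    return None
```

In this sketch, the string "1.234" is read as one thousand two hundred thirty-four under the German convention but as a decimal under the English (US) one, which is why checking the detected language matters for ambiguous values.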
To ensure the best automatic formatted value detection, we recommend you check the detected language and correct it if necessary.
Report Exceptions
You have the option to report a document as an exception. When you do so, the Present Validation Station activity throws an exception that should be caught by the RPA workflow and treated separately. The exception message displays the Reason for Exception filled in by the user.
Select Report Exception, then fill in the Reason field, and lastly, select Confirm, to save the exception.
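The catch-and-treat-separately pattern can be sketched as follows. In a Studio workflow this would typically be a Try Catch around the activity. The Python below only illustrates the control flow, and every name in it (`DocumentReportedAsException`, `present_validation_station`, `process`) is a hypothetical stand-in rather than an actual UiPath API.

```python
from typing import Optional


class DocumentReportedAsException(Exception):
    """Hypothetical stand-in for the exception thrown when a user reports a document."""


def present_validation_station(document_path: str, report_reason: Optional[str] = None) -> dict:
    """Hypothetical stand-in for the activity.

    Raises when the simulated user reported the document as an exception;
    the message carries the reason the user entered.
    """
    if report_reason is not None:
        raise DocumentReportedAsException(report_reason)
    return {"document": document_path, "validated": True}


def process(document_path: str, report_reason: Optional[str] = None) -> dict:
    """Validate a document, routing reported exceptions to a separate path."""
    try:
        return present_validation_station(document_path, report_reason)
    except DocumentReportedAsException as exc:
        # Treat the document separately, keeping the user's reason.
        return {"document": document_path, "validated": False, "reason": str(exc)}
```

The design point is that a reported document does not stop the whole run: the reason entered by the user travels with the exception message and can be logged or used to route the document for manual handling.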
Data confirmation and validation
You can confirm all fields manually or automatically. For manual confirmation, select the check box of each field. Any field whose check box you did not select is confirmed automatically when you select the Save button and then confirm the action by selecting the Continue & save button.
The following table shows the options available on the bottom side of the Validation Station, for data confirmation and validation, and their functions.
Option | Function |
---|---|
 | Saves the confirmed fields. |
 | Saves and closes the table selection area. The button is enabled only when the table field is active. |
 | Reports the document as being an exception. |
 | Enabled only when not all values are manually confirmed. By selecting it, all data is automatically confirmed and saved. |
 | Enabled when no change has been done to the table. |
 | Enabled after a change has been done to the table. |
 | Enabled after the user clicks Dismiss. Discards all changes done to the table. |
Visit Validation station for more information about how to use and customize the Validation Station.
The Present Validation Station activity is part of the Document Understanding solutions. Visit the Document Understanding Guide for more information.