
Document Understanding User Guide

Last updated Dec 12, 2024

Search documents

Search options

Three search capabilities are available: two in the management bar at the top of the page, and one accessed through the icon at the bottom-left side of the page.

The management bar search functionality consists of:

  1. Search using the built-in filters: filters the documents based on the batch/category options available from the dropdown menu.
    Attention:

    Selecting more options makes the search more restrictive. For example, selecting Batch import1 and Deleted returns only the deleted documents that were imported in Batch import1.

    Some combinations always return an empty list: selecting Batch import1 and Batch import2 never returns a document, because the selection is restrictive and no document can be in two batches at the same time.

  2. Search in all documents in the dataset using keywords: this search input filters the documents based on text input. Enter the keyword(s) as free text in the search field. The search looks for the keyword(s) in a document's content or in the document name. A multiple-word search returns results only when the words are adjacent, ignoring any punctuation between them.
  3. Search inside the currently displayed document: allows you to search for instances of text solely in your current document. The search icon can be found at the bottom-left side of the screen.
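
The restrictive behavior of combined built-in filters described in item 1 can be illustrated with a minimal Python sketch. This is not how Document Manager is implemented: the `Doc` class, the `apply_filters` function, and the sample file names are hypothetical and exist only to show that every selected option must match (a logical AND), which is why selecting two batches always yields an empty list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a document in the dataset."""
    name: str
    batch: str              # a document belongs to exactly one import batch
    deleted: bool = False
    labelled: bool = False


def apply_filters(docs, batches=(), deleted=None, labelled=None):
    """Return the documents that satisfy every selected filter (logical AND)."""
    result = []
    for doc in docs:
        # Each selected batch must match, so two different batches never both match.
        if any(doc.batch != batch for batch in batches):
            continue
        if deleted is not None and doc.deleted != deleted:
            continue
        if labelled is not None and doc.labelled != labelled:
            continue
        result.append(doc)
    return result


docs = [
    Doc("a.pdf", "import1", deleted=True),
    Doc("b.pdf", "import1"),
    Doc("c.pdf", "import2", deleted=True),
]

# Batch import1 + Deleted: only deleted documents from that batch remain.
deleted_in_import1 = apply_filters(docs, batches=["import1"], deleted=True)
# Batch import1 + Batch import2: no document can be in both batches.
in_both_batches = apply_filters(docs, batches=["import1", "import2"])

print([d.name for d in deleted_in_import1])
print(in_both_batches)
```

Each additional option can only shrink the result, never widen it.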

Search using the built-in filters

Search the documents using the built-in filters that are available in the category/batch dropdown. You can choose any of the following filters: Training and validation set, Training set, Evaluation set, Validation set, Deleted, Labelled, Unlabelled.

Each filter shows, in brackets, the number of documents that meet its criteria.

There are seven predefined keywords, namely:

  • Training and validation set
  • Training set
  • Evaluation set
  • Validation set
  • Deleted
  • Labelled
  • Unlabelled
Note:

For Forms AI, only the following built-in filters are available: Deleted, Labelled, Unlabelled.

Besides these predefined keywords, you can also filter by named batch. One batch filter is available for each batch you imported into Document Manager:

  • Batch <batch_name_1>
  • Batch <batch_name_2>
  • Batch <batch_name_3>
  • etc.

Search using keywords

Enter the keyword(s) as free text in the Search field. The search looks for the keyword(s) in a document's content or in the document name.

You can search using more than one word: only the documents containing those specific words, one directly after another, are displayed.

Note:

The search is case-insensitive.
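
The keyword matching rules above (case-insensitive, adjacent words, punctuation between words ignored, name or content) can be approximated with a short Python sketch. The `tokenize` and `matches` functions are hypothetical, and whole-word matching is an assumption; the actual search may handle partial words or special characters differently.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def matches(query, name, content):
    """Return True if the query words appear adjacently in the name or content."""
    wanted = tokenize(query)
    if not wanted:
        return False
    for source in (name, content):
        words = tokenize(source)
        # Slide a window over the words and compare it with the query words.
        for start in range(len(words) - len(wanted) + 1):
            if words[start:start + len(wanted)] == wanted:
                return True
    return False


# Punctuation between adjacent words does not prevent a match.
print(matches("invoice total", "scan.pdf", "Invoice, Total: 42"))
# The words are present but not one after another in this order.
print(matches("total invoice", "scan.pdf", "Invoice, Total: 42"))
# The document name is searched as well, regardless of case.
print(matches("INVOICE", "Invoice 01.pdf", ""))
```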

You can filter using a keyword: for instance, if you select Labelled, only the labelled documents are displayed.

You can filter using more than one keyword: for instance, if you select Labelled and Training set, only the labelled documents that belong to the training set are displayed. The order in which the keywords appear does not matter.

Search inside the document

Initiate a search within the current document by clicking the icon at the bottom-left side of the screen, typing the text you want to search for, and pressing Enter.

All instances of text matching the search are highlighted in yellow, and the document viewer automatically scrolls to the first match. To navigate from one instance to another, press Enter, or use Page Down and Page Up.
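
The highlight-and-navigate behavior can be modeled roughly as follows. This is only a sketch: the `InDocumentSearch` class is hypothetical, and both the case-insensitive matching and the wrap-around from the last match to the first are assumptions that the viewer may not share.

```python
class InDocumentSearch:
    """Collects every match position and tracks which match is in view."""

    def __init__(self, text, query):
        needle = query.lower()          # assumption: matching ignores case
        haystack = text.lower()
        self.positions = []
        start = haystack.find(needle) if needle else -1
        while start != -1:
            self.positions.append(start)
            start = haystack.find(needle, start + len(needle))
        # The viewer starts at the first match, if there is one.
        self.index = 0 if self.positions else None

    def current(self):
        return None if self.index is None else self.positions[self.index]

    def next_match(self):
        """Move to the following match (assumed to wrap around at the end)."""
        if self.index is None:
            return None
        self.index = (self.index + 1) % len(self.positions)
        return self.positions[self.index]

    def previous_match(self):
        """Move to the preceding match (assumed to wrap around at the start)."""
        if self.index is None:
            return None
        self.index = (self.index - 1) % len(self.positions)
        return self.positions[self.index]


text = "Total due: 10. Total tax: 2. total paid: 12."
search = InDocumentSearch(text, "total")
print(len(search.positions))    # number of highlighted instances
print(search.current())         # character offset of the first instance
```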

Initiate a search

The Search option has a dropdown menu that, when opened, displays the following filters:

  • Training set - Indicates the number of documents to be used for training the model. Automated action.
  • Validation set - Indicates the number of documents to be used to validate the model after its training is complete. The targeted split between the training and validation sets is 80%-20%. Automated action.
  • Training and validation set - Indicates the total number of documents found in the Training set and Validation set filters combined. Automated action.
  • Evaluation set - Indicates the number of documents that had the evaluation set checkbox checked during import and are intended to be used to evaluate the model in the stage of the training pipeline. More information can be found here. Manual action.
  • Deleted - Specifies the number of deleted documents. More information can be found here.
  • Labelled - Specifies the number of documents that have labels. A document counts as labelled when it has at least one tagged or manually edited field.
  • Unlabelled - Specifies the number of documents that don't have labels.
  • Batch name - Specifies the documents that were included in the same import action.

The allocation of a document to either the training set or the validation set is done by the application at import time.

Imported documents end up in the evaluation set if the evaluation set checkbox is marked during import.
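
The allocation rules can be pictured with a small Python sketch. How the application actually decides between the training and validation sets is not described here, so the deterministic counting rule below is purely an assumption chosen to approach the 80%-20% target; `allocate_on_import` and its parameters are hypothetical names.

```python
def allocate_on_import(documents, validation_percent=20):
    """Assign each imported document to a set.

    documents: iterable of (name, evaluation_checkbox) pairs.
    """
    allocation = {}
    training = validation = 0
    for name, evaluation_checkbox in documents:
        if evaluation_checkbox:
            # Manual action: the checkbox sends the document to the evaluation set.
            allocation[name] = "evaluation"
            continue
        assigned = training + validation
        # Assumed rule: add to validation only while that keeps its share
        # at or below the target percentage of the documents assigned so far.
        if (validation + 1) * 100 <= validation_percent * (assigned + 1):
            allocation[name] = "validation"
            validation += 1
        else:
            allocation[name] = "training"
            training += 1
    return allocation


batch = [(f"doc{i}.pdf", False) for i in range(1, 11)] + [("eval.pdf", True)]
result = allocate_on_import(batch)
sets = list(result.values())
print(sets.count("training"), sets.count("validation"), sets.count("evaluation"))
```

Documents sent to the evaluation set are left out of the training and validation counts in this sketch, so the 80%-20% target applies only to the remaining documents.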
