UiPath Document Understanding

Search documents

Search options

The Search bar is both a text input field and a drop-down.
You can enter search options either by typing in the Search bar or by selecting a filter from the drop-down. In the current implementation, a multi-word search returns results only when the words are adjacent, ignoring any punctuation between them.
There are three main ways to start a search:

  1. Using the built-in filters that are available in the Search bar's drop-down. You can choose any of the following filters: train-set, validate-set, train-validate-set, evaluation-set, deleted, labelled, unlabelled.

📘

Note:

For Forms AI, only the following built-in filters are available: deleted, labelled, unlabelled.

  2. Using the import batch names. These are also available in the Search bar's drop-down. If you type a batch filter manually, the format is batch:name, where name is the name you gave the batch at import time, e.g. batch:invoices1
  3. Using keywords. Enter the keyword(s) as free text in the Search bar. The search looks for the keyword(s) in a document's content or in the document name.

You can use one or more search options. Each additional option narrows the results. The following examples start with a broad search and progress to a more refined one:

  • A labelled search returns all the labelled documents in the dataset.
  • A batch:invoices1 search returns all the documents that are part of the invoices1 batch.
  • A labelled batch:invoices1 search returns all the labelled documents that are part of the invoices1 batch.
  • A labelled batch:invoices1 vermont search returns all the labelled documents from the invoices1 batch that contain the entered keyword, in this case vermont, in either the document name or the document content.
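
The way a combined query breaks down into its three kinds of search options can be illustrated with a short Python sketch. This is not the Document Manager implementation: the names `ParsedQuery` and `parse_query` are hypothetical, and the sketch assumes that tokens are separated by whitespace, that batch names contain no spaces, and that filter names are matched exactly as written.

```python
from dataclasses import dataclass, field

# The seven built-in filters listed on this page.
BUILT_IN_FILTERS = {
    "train-set", "validate-set", "train-validate-set",
    "evaluation-set", "deleted", "labelled", "unlabelled",
}

BATCH_PREFIX = "batch:"


@dataclass
class ParsedQuery:
    """Hypothetical container for the three kinds of search options."""
    filters: set = field(default_factory=set)
    batches: list = field(default_factory=list)
    keywords: list = field(default_factory=list)


def parse_query(query: str) -> ParsedQuery:
    """Split a search string into built-in filters, batch names, and free text."""
    parsed = ParsedQuery()
    for token in query.split():
        if token in BUILT_IN_FILTERS:
            parsed.filters.add(token)
        elif token.startswith(BATCH_PREFIX) and len(token) > len(BATCH_PREFIX):
            # Keep only the batch name, e.g. "invoices1" from "batch:invoices1".
            parsed.batches.append(token[len(BATCH_PREFIX):])
        else:
            # Anything else is treated as a free-text keyword (an assumption).
            parsed.keywords.append(token)
    return parsed


query = parse_query("labelled batch:invoices1 vermont")
print(query.filters, query.batches, query.keywords)
```

In this sketch, each recognized option ends up in its own bucket, which mirrors how every additional option narrows the search.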

The Search bar has a drop-down menu that, when opened, displays the following filters:

  • train-set - Indicates the number of documents to be used for training the model. Automated action.
  • validate-set - Indicates the number of documents to be used to validate the model after its training is complete. The split between the train and validate set is targeted to be 80%-20%. Automated action.
  • train-validate-set - Indicates the number of documents found in both the train-set and validate-set filters. Automated action.
  • evaluation-set - Indicates the number of documents that had the evaluation set checkbox checked during import and are intended to be used to evaluate the model as part of the training pipeline. More information can be found here. Manual action.
  • deleted - Specifies the number of deleted documents. More information can be found here.
  • labelled - Specifies the number of documents that have labels. A document counts as labelled if it has at least one tagged or manually edited field.
  • unlabelled - Specifies the number of documents that don't have labels.
  • batch:name - Specifies the documents that were imported in the same import action.

The allocation of a document to either the train or validate sets is done by the application at import time.
An imported document ends up in the evaluation set if the evaluation set checkbox is checked during import.
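
As a rough illustration of the allocation rules above, the following Python sketch assigns a document to a set at import time. The actual mechanism used by the application is not described on this page; the random assignment, the function name `assign_set`, and the `train_share` parameter are all assumptions made only to show how a targeted 80%-20% split could behave.

```python
import random


def assign_set(evaluation_checked: bool, rng: random.Random,
               train_share: float = 0.8) -> str:
    """Return the set a newly imported document would land in (hypothetical)."""
    if evaluation_checked:
        # Manual action: the evaluation set checkbox was checked during import.
        return "evaluation-set"
    # Automated action: assumed random split that targets 80% train, 20% validate.
    return "train-set" if rng.random() < train_share else "validate-set"


rng = random.Random(0)  # fixed seed so the example is repeatable
sets = [assign_set(False, rng) for _ in range(10_000)]
print(sets.count("train-set") / len(sets))  # close to 0.8, not exactly 0.8
```

Because the split is only a target, the observed proportions in a small dataset can differ noticeably from 80%-20%.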

Predefined keywords

There are seven predefined keywords, namely:

  • train-validate-set
  • train-set
  • evaluation-set
  • validate-set
  • deleted
  • labelled
  • unlabelled

Besides these predefined keywords, you can also filter by batch name. One batch filter is available for each batch you imported into Document Manager:

  • batch:<batch_name_1>
  • batch:<batch_name_2>
  • batch:<batch_name_3>
  • etc.

Search/filter scenarios

  • You can search using one word of text: only the documents containing that specific word are displayed.
  • You can search using more than one word of text: only the documents containing those specific words, one after another, are displayed.

📘

Note:

The search is case-insensitive.

  • You can filter using a keyword: for instance, if you select labelled, only the labelled documents are displayed.
  • You can filter using more than one keyword: for instance, if you select labelled and train-set, only the labelled documents in the train set are displayed. The order in which the keywords appear does not matter.
  • You can also combine text with keywords: for instance, if you type payment and select labelled, only the labelled documents containing the word payment are displayed.

🚧

Warning!

You cannot use the predefined keywords as free-text search terms.
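
The text-matching behavior described in this section (case-insensitive, multi-word searches matching only adjacent words, punctuation ignored) can be sketched in Python as follows. This is an illustrative approximation, not the product's search engine: the helper names `normalize`, `contains_phrase`, and `document_matches` are hypothetical, and treating every run of letters or digits as a word is an assumption.

```python
import re


def normalize(text: str) -> list:
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def contains_phrase(text: str, phrase: str) -> bool:
    """True if the words of `phrase` appear one after another in `text`."""
    words = normalize(text)
    target = normalize(phrase)
    if not target:
        return True  # an empty search matches everything (assumption)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def document_matches(name: str, content: str, phrase: str) -> bool:
    """The search looks in the document name or in the document content."""
    return contains_phrase(name, phrase) or contains_phrase(content, phrase)


# Punctuation between adjacent words is ignored, and case does not matter.
print(contains_phrase("Total due: payment, received in Vermont.", "Payment received"))
# A word between the search terms breaks adjacency.
print(contains_phrase("payment was received", "payment received"))
```

The first call prints True and the second prints False, which matches the adjacency rule for multi-word searches.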

Inside document search

Inside document search allows you to search for instances of text solely in your current document.

The search bar is located at the bottom left-hand side of the screen.

Type the text you want to search for and press Enter. All instances of text matching the search are highlighted in yellow, and the document viewer automatically scrolls to the first match.

To navigate from one instance of text to another, press Enter or Page Down to move to the next instance, or Page Up to move to the previous one.
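
The navigation between matches can be pictured with the following minimal Python sketch. It is only a model of the behavior described above: the class name `InDocumentSearch` is hypothetical, and case-insensitive matching, wrap-around at the last and first match, and the mapping of next and previous to specific keys are assumptions that this page does not confirm.

```python
class InDocumentSearch:
    """Hypothetical model of searching within a single document."""

    def __init__(self, text: str, term: str):
        lowered, needle = text.lower(), term.lower()
        self.positions = []  # offsets of every match in the lowercased text
        start = lowered.find(needle) if needle else -1
        while start != -1:
            self.positions.append(start)
            start = lowered.find(needle, start + len(needle))
        # The viewer starts at the first match, if there is one.
        self.current = 0 if self.positions else None

    def next(self):
        """Move to the next match (assumed to wrap around at the end)."""
        if self.current is None:
            return None
        self.current = (self.current + 1) % len(self.positions)
        return self.positions[self.current]

    def previous(self):
        """Move to the previous match (assumed to wrap around at the start)."""
        if self.current is None:
            return None
        self.current = (self.current - 1) % len(self.positions)
        return self.positions[self.current]


search = InDocumentSearch("Vermont invoice. VERMONT tax. vermont", "vermont")
print(search.positions)  # offsets of all highlighted matches
print(search.next())     # offset of the second match
```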

Updated 5 days ago

