v20.1 Data Extraction Scope - Preview

Release Date: 20th Jan 2020, Version: v20.1


Provides a scope for extractor activities, enabling you to configure them according to the document types defined in your taxonomy. The output of the activity is stored in an ExtractionResult variable, which contains all automatically extracted data and can be used as input for the Export Extraction Results activity. This activity also features a Configure Extractors wizard, which lets you specify exactly which fields from the document types defined in the taxonomy you want to extract.



  • DisplayName - The display name of the activity.


  • ClassificationResults - The results of running a classifier activity on the specified document, stored in a ClassificationResult object. This field is optional if you specify a DocumentTypeId instead. This field supports only ClassificationResult variables.
  • DocumentObjectModel - The Document Object Model against which you want to validate the document. This model is stored in a Document variable and can be retrieved from the Digitize Document activity. Please see the documentation of that activity for more information on how to do this. This field supports only Document variables.
  • DocumentPath - The path to the document you want to validate. This field supports only strings and String variables.


The supported file types for this property field are .png, .gif, .jpe, .jpg, .jpeg, .tiff, .tif, .bmp, and .pdf.

  • DocumentText - The text of the document itself, stored in a String variable. This value can be retrieved from the Digitize Document activity. Please see the documentation of the activity for more information on how to do this. This field supports only strings and String variables.
  • DocumentTypeId - The Document Type ID, as found in the Taxonomy Manager. This field is optional if you specify a variable in the ClassificationResults field. This field supports only strings and String variables.

  • FormatValuesIfPossible - If selected, a value that an extractor reports together with its derived parts is not overridden by the Data Extraction Scope, while for a value reported without derived parts, the Data Extraction Scope tries to compute them. If the option is set to False, the values are not formatted.
  • Taxonomy - The Taxonomy against which the document is to be processed, stored in a DocumentTaxonomy variable. This object can be obtained by using a Load Taxonomy activity. This field supports only DocumentTaxonomy variables.


  • Private - If selected, the values of variables and arguments are no longer logged at Verbose level.


  • ExtractionResults - The extraction results of the data extraction process, stored in an ExtractionResult variable.
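The file types listed for the DocumentPath property can be checked before a document reaches the activity. The following Python sketch only illustrates such a pre-check and is not part of UiPath: the function name `is_supported_document` is hypothetical, and the comparison is assumed to be case-insensitive, which the list above does not state.

```python
from pathlib import Path

# Extensions listed as supported for the DocumentPath property.
SUPPORTED_EXTENSIONS = {
    ".png", ".gif", ".jpe", ".jpg", ".jpeg",
    ".tiff", ".tif", ".bmp", ".pdf",
}


def is_supported_document(path: str) -> bool:
    """Return True if the file extension is in the supported list."""
    # Assumption: extensions are compared case-insensitively.
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_document("invoice.PDF"))
print(is_supported_document("notes.docx"))
```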

The Configure Extractors Wizard

The Configure Extractors wizard can be opened from the body of the activity, by clicking on the Configure Extractors button. The wizard button becomes available after dragging at least one extractor activity into the body of the Data Extraction Scope activity. This wizard displays all the document types defined in the taxonomy and their respective fields and enables you to choose which extractor you want to use for each.

Each document type can be expanded and its fields can be viewed in the wizard and selected for extraction.

The Minimum Confidence field can be configured with a value between 0 and 100 and represents the confidence threshold above which extracted data is taken into account. If a result for a selected field has a confidence level below the confidence threshold, it is not reported in the final result.
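The thresholding rule described above can be sketched in Python. This is only an illustration of the rule, not UiPath's implementation: `FieldResult` and `apply_minimum_confidence` are hypothetical names, confidences are assumed to use the same 0 to 100 scale as the Minimum Confidence field, and a result exactly at the threshold is assumed to be kept, which the text does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FieldResult:
    """Hypothetical stand-in for one extracted field value."""
    field_id: str
    value: str
    confidence: float  # assumed scale: 0 to 100


def apply_minimum_confidence(results: List[FieldResult],
                             minimum_confidence: float) -> List[FieldResult]:
    """Keep only results whose confidence reaches the threshold."""
    if not 0 <= minimum_confidence <= 100:
        raise ValueError("minimum_confidence must be between 0 and 100")
    # Assumption: a result exactly at the threshold is kept.
    return [r for r in results if r.confidence >= minimum_confidence]


results = [
    FieldResult("invoice-number", "INV-001", 92.0),
    FieldResult("total", "118.00", 41.5),
]
kept = apply_minimum_confidence(results, minimum_confidence=60)
print([r.field_id for r in kept])
```

With a threshold of 60, only the first result is kept, because the second one has a confidence of 41.5.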

Each field has a check box in every extractor column. If the check box is selected, that extractor is asked for a value for the field. If it is cleared, the extractor ignores the field when extracting data.

The text fields or drop-down lists next to each document field enable you to map the fields defined in your taxonomy to the fields defined in the extractor's internal taxonomy, if any.

The number of columns in the wizard matches the number of extractors present in the scope activity. Each column is named after the display name of the corresponding extractor activity.

If multiple extractors are used in the activity, the order of the extractors in the scope defines their priority. For example, with three extractors in the scope, if Extractor 1 returns an acceptable value (one that is above the Minimum Confidence level), the results of Extractor 2 and Extractor 3 are ignored. If Extractor 1 and Extractor 2 return values below the Minimum Confidence level, or return nothing at all, the results from Extractor 3 are taken into account.
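The priority rule can be sketched as a fallback chain in Python. This is a minimal illustration under stated assumptions, not the activity's actual logic: `resolve_field` and the dictionary layout are hypothetical, each extractor is assumed to have its own Minimum Confidence value on a 0 to 100 scale, and a confidence equal to the threshold is assumed to count as acceptable.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical extractor output: field name -> (value, confidence 0 to 100).
ExtractorOutput = Dict[str, Tuple[str, float]]


def resolve_field(field: str,
                  outputs_in_priority_order: List[ExtractorOutput],
                  minimum_confidences: List[float]) -> Optional[str]:
    """Return the value from the first extractor that meets its threshold."""
    for output, threshold in zip(outputs_in_priority_order, minimum_confidences):
        candidate = output.get(field)
        if candidate is None:
            continue  # this extractor returned nothing for the field
        value, confidence = candidate
        if confidence >= threshold:  # assumption: ties count as acceptable
            return value
    return None  # no extractor produced an acceptable value


extractor_1 = {"total": ("118.00", 35.0)}   # below the threshold
extractor_2: ExtractorOutput = {}           # returned nothing
extractor_3 = {"total": ("118,00", 80.0)}   # acceptable

print(resolve_field("total",
                    [extractor_1, extractor_2, extractor_3],
                    [60, 60, 60]))
```

Here the value comes from the third extractor, because the first is below its threshold and the second returns nothing for the field.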

Using the Data Extraction Scope with an Extractor declaring an Internal Taxonomy

If an extractor used within the Data Extraction Scope has an internal taxonomy and is capable of declaring it, the Data Extraction Scope queries the extractor for its capabilities when the extractor is added to the workflow.

In such cases, the Configure Extractors wizard displays, for each field for that extractor, drop-down lists containing all the document types and fields the extractor returns as capabilities. The configuration requires the selection of a certain field declared by the extractor from the available options.
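The mapping between taxonomy fields and the fields an extractor declares can be pictured as a lookup that is validated against the declared capabilities. The Python sketch below is purely illustrative: the `capabilities` structure, the field names, and `validate_mapping` are all hypothetical and do not reflect how UiPath stores or exchanges this information.

```python
from typing import Dict, Set

# Hypothetical capabilities declared by an extractor:
# internal document type -> internal field names.
capabilities: Dict[str, Set[str]] = {
    "invoices": {"invoice-no", "total", "date"},
}


def validate_mapping(mapping: Dict[str, str],
                     internal_doc_type: str,
                     declared: Dict[str, Set[str]]) -> Dict[str, str]:
    """Check that each taxonomy field maps to a field the extractor declares."""
    available = declared.get(internal_doc_type)
    if available is None:
        raise KeyError(f"extractor does not declare {internal_doc_type!r}")
    unknown = {src: dst for src, dst in mapping.items() if dst not in available}
    if unknown:
        raise ValueError(f"fields not declared by the extractor: {unknown}")
    return mapping


# Taxonomy field -> extractor field, as chosen from the drop-down lists.
mapping = {"Invoice Number": "invoice-no", "Total Amount": "total"}
checked = validate_mapping(mapping, "invoices", capabilities)
print(checked)
```

Restricting the choice to declared fields mirrors the drop-down lists in the wizard, which only offer options the extractor reports as capabilities.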

If the internal capabilities of an extractor change, you can update the list of capabilities by opening the Configure Extractors wizard and clicking the configuration icon that appears under that extractor. This action makes the Data Extraction Scope read the extractor's capabilities again.

One extractor that declares its internal taxonomy is the Machine Learning Extractor. It retrieves all available capabilities and lets you customize and configure all fields targeted for data extraction.
To learn how to use it, see the documentation of the Machine Learning Extractor activity.

