Document Understanding User Guide

Last updated Dec 4, 2025

RegEx Based Extractor

What is RegEx Based Extractor

The RegEx Based Extractor is well suited to simple use cases in which, for certain fields, data is always found in a strict, predictable format and context. In other words, if you can define a regular expression for a field that matches consistently and correctly, the RegEx Based Extractor is a good choice.
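As an illustration of a "strict, predictable format and context", the sketch below uses Python's standard `re` module, not the activity itself. The sample text, the `Invoice No:` and `Date:` labels, the number formats, and the `extract_field` helper are all assumptions made up for this example; the regular expression flavor used by the activity may differ in details.

```python
import re

# Hypothetical document text; the labels and value formats are assumptions.
SAMPLE_TEXT = """ACME Corp
Invoice No: INV-004217
Date: 2023-04-18
Total: 1,250.00 USD
"""

# The label provides the context, the capture group pins down the strict format.
INVOICE_NUMBER = re.compile(r"Invoice No:\s*(INV-\d{6})")
INVOICE_DATE = re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})")


def extract_field(pattern, text):
    """Return the first captured value, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None


invoice_number = extract_field(INVOICE_NUMBER, SAMPLE_TEXT)
invoice_date = extract_field(INVOICE_DATE, SAMPLE_TEXT)
```

If the label or the number format changes from one document to the next, a pattern like this stops matching, which is the situation where a different extractor is the safer choice.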

The activity comes with a configuration wizard that assists you in defining the regular expressions for the fields you want to target for data extraction in this way.

The activity supports both simple field extraction and table field extraction.
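One way to picture table extraction with regular expressions is a pattern with one group per column, applied to every row. The sketch below shows that idea in plain Python; it is not the activity's configuration, and the row layout, the column names, and the sample rows are assumptions for illustration only.

```python
import re

# Hypothetical line items: description, quantity, unit price, amount.
ROWS_TEXT = """Widget A 2 10.00 20.00
Widget B 1 5.50 5.50
"""

# One named group per column; MULTILINE makes ^ and $ apply to each row.
ROW = re.compile(
    r"^(?P<description>.+?)\s+(?P<quantity>\d+)\s+"
    r"(?P<unit_price>\d+\.\d{2})\s+(?P<amount>\d+\.\d{2})$",
    re.MULTILINE,
)

# Each match becomes one table row, keyed by column name.
rows = [match.groupdict() for match in ROW.finditer(ROWS_TEXT)]
```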

If the context and format of the expected values vary widely, consider other extraction methods. In such cases, a Form Extractor or a Machine Learning Extractor may be better suited.

This extractor does not have learning (training) capabilities and requires up-front configuration.

Special requirements

There are no special requirements for using the RegEx Based Extractor.

How to configure

Activity configuration

The RegEx Based Extractor has two main configuration points:

  • The Configure Regular Expressions wizard, which lets you define regular expressions for specific fields. This wizard also gives access to the Regex Editor wizard, which assists you in building your regular expressions.
  • The UseVisualAlignment setting, which controls whether the configured regular expressions are applied to the text output of the digitization component or to a text version in which text lines are organized visually and words are rearranged on lines based on their visual alignment.
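To see why the choice of input text matters, the sketch below compares the same regular expression on two text versions of one small two-column layout. It is a simplified illustration in plain Python, not the algorithm the activity uses: the `Word` type, the coordinates, the `y_tolerance` value, and both helper functions are assumptions.

```python
import re
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    x: float  # horizontal position of the word (assumed units)
    y: float  # vertical position of the word (assumed units)


# Hypothetical two-column layout: the labels were read as one block,
# then the values as a second block.
WORDS = [
    Word("Subtotal:", 10, 100),
    Word("Total:", 10, 120),
    Word("90.00", 200, 101),
    Word("99.00", 200, 121),
]


def reading_order_text(words):
    """Join the words in the order they were produced."""
    return " ".join(w.text for w in words)


def visually_aligned_text(words, y_tolerance=5.0):
    """Group words with similar vertical positions into lines, left to right."""
    lines = []
    for word in sorted(words, key=lambda w: w.y):
        if lines and abs(lines[-1][0].y - word.y) <= y_tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    return "\n".join(
        " ".join(w.text for w in sorted(line, key=lambda w: w.x)) for line in lines
    )


TOTAL = re.compile(r"\bTotal:\s*(\d+\.\d{2})")

from_reading_order = TOTAL.search(reading_order_text(WORDS)).group(1)
from_visual_lines = TOTAL.search(visually_aligned_text(WORDS)).group(1)
```

In this made-up layout the reading-order text places the subtotal value right after the `Total:` label, so the expression captures the wrong number, while the visually aligned text keeps each label next to the value on its own line.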

Learn more

To learn more, see the Configure Regular Expressions Wizard documentation.
