Document Understanding - RegEx Based Extractor

2022.4

Document Understanding User Guide

RegEx Based Extractor

What Is RegEx Based Extractor

The RegEx Based Extractor is well suited to simple use cases in which, for certain fields, the data always appears in a strict, predictable format and context. In other words, if you can define a regular expression that reliably matches the correct value for a field, the RegEx Based Extractor is a good choice for that field.

The activity includes a configuration wizard that helps you define the regular expressions for the fields you want to extract this way.

The activity supports extraction of both simple fields and table fields.

If the context and format of the expected values vary widely, consider other extraction methods. In such cases, a Form Extractor or a Machine Learning Extractor may be better suited.
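As a rough illustration of the distinction above, the following sketch uses Python's standard `re` module rather than the extractor itself. The pattern, the sample texts, and the `extract_invoice_number` helper are all hypothetical and do not come from the product. Regular expression syntax in the wizard may also differ from Python's. The sketch only shows why a fixed label and format suit a regular expression, while a loosely formatted value does not.

```python
import re
from typing import Optional

# Hypothetical pattern: a fixed label followed by an ID in a fixed layout.
INVOICE_NO = re.compile(r"Invoice\s+No\.?\s*:\s*(INV-\d{4}-\d{5})")


def extract_invoice_number(text: str) -> Optional[str]:
    """Return the captured invoice number, or None if the pattern does not match."""
    match = INVOICE_NO.search(text)
    return match.group(1) if match else None


# Strict, predictable format and context: the expression matches reliably.
predictable = "Invoice No: INV-2022-00417\nDate: 2022-04-01"

# Variable context and format: the same expression finds nothing.
variable = "Our ref. 2022/417 (invoice)"

print(extract_invoice_number(predictable))  # INV-2022-00417
print(extract_invoice_number(variable))     # None
```

When most documents resemble the second sample, a single expression is unlikely to cover every variant, which is where the other extractors mentioned above become more attractive.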

This extractor does not have learning (training) capabilities and requires up-front configuration.

How to Configure

Learn More

For more information, see the Configure Regular Expression Wizard page.
