- Overview
- Document Processing Contracts
- About the Document Processing Contracts
- Box Class
- IPersistedActivity Interface
- PrettyBoxConverter Class
- IClassifierActivity Interface
- IClassifierCapabilitiesProvider Interface
- ClassifierDocumentType Class
- ClassifierResult Class
- ClassifierCodeActivity Class
- ClassifierNativeActivity Class
- ClassifierAsyncCodeActivity Class
- ClassifierDocumentTypeCapability Class
- ExtractorAsyncCodeActivity Class
- ExtractorCodeActivity Class
- ExtractorDocumentType Class
- ExtractorDocumentTypeCapabilities Class
- ExtractorFieldCapability Class
- ExtractorNativeActivity Class
- ExtractorResult Class
- ICapabilitiesProvider Interface
- IExtractorActivity Interface
- ExtractorPayload Class
- DocumentActionPriority Enum
- DocumentActionData Class
- DocumentActionStatus Enum
- DocumentActionType Enum
- DocumentClassificationActionData Class
- DocumentValidationActionData Class
- UserData Class
- Document Class
- DocumentSplittingResult Class
- DomExtensions Class
- Page Class
- PageSection Class
- Polygon Class
- PolygonConverter Class
- Metadata Class
- WordGroup Class
- Word Class
- ProcessingSource Enum
- ResultsTableCell Class
- ResultsTableValue Class
- ResultsTableColumnInfo Class
- ResultsTable Class
- Rotation Enum
- SectionType Enum
- WordGroupType Enum
- IDocumentTextProjection Interface
- ClassificationResult Class
- ExtractionResult Class
- ResultsDocument Class
- ResultsDocumentBounds Class
- ResultsDataPoint Class
- ResultsValue Class
- ResultsContentReference Class
- ResultsValueTokens Class
- ResultsDerivedField Class
- ResultsDataSource Enum
- ResultConstants Class
- SimpleFieldValue Class
- TableFieldValue Class
- DocumentGroup Class
- DocumentTaxonomy Class
- DocumentType Class
- Field Class
- FieldType Enum
- LanguageInfo Class
- MetadataEntry Class
- TextType Enum
- TypeField Class
- ITrackingActivity Interface
- ITrainableActivity Interface
- ITrainableClassifierActivity Interface
- ITrainableExtractorActivity Interface
- TrainableClassifierAsyncCodeActivity Class
- TrainableClassifierCodeActivity Class
- TrainableClassifierNativeActivity Class
- TrainableExtractorAsyncCodeActivity Class
- TrainableExtractorCodeActivity Class
- TrainableExtractorNativeActivity Class
- Document Understanding Digitizer
- Document Understanding ML
- Release Notes
- About the Document Understanding ML Activities Package
- Project Compatibility
- Preview Releases
- Generative Extractor - good practices
- Generative Classifier - Good Practices
- Document Understanding OCR Local Server
- Document Understanding Process - Studio Template
- Document Understanding Activities
- About the Document Understanding Package
- Project Compatibility
- Set PDF Password
- Merge PDFs
- Get PDF Page Count
- Extract PDF Text
- Extract PDF Images
- Extract PDF Page Range
- Extract Document Data
- Create Validation Task and Wait
- Wait for Validation Task and Resume
- Create Validation Task
- Classify Document
- Create Classification Validation Task
- Create Classification Validation Task and Wait
- Wait for Classification Validation Task and Resume
- Intelligent OCR
- About the IntelligentOCR Activities Package
- Project Compatibility
- Load Taxonomy
- Digitize Document
- Classify Document Scope
- Keyword Based Classifier
- Intelligent Keyword Classifier
- Present Classification Station
- Create Document Classification Action
- Wait for Document Classification Action and Resume
- Train Classifiers Scope
- Keyword Based Classifier Trainer
- Intelligent Keyword Classifier Trainer
- Data Extraction Scope
- RegEx Based Extractor
- Form Extractor
- Intelligent Form Extractor
- Present Validation Station
- Create Document Validation Action
- Wait for Document Validation Action and Resume
- Train Extractors Scope
- Export Extraction Results
- ML Services
- OCR
- OCR Contracts
- Release Notes
- About the OCR Contracts
- Project Compatibility
- IOCRActivity Interface
- OCRAsyncCodeActivity Class
- OCRCodeActivity Class
- OCRNativeActivity Class
- Character Class
- OCRResult Class
- Word Class
- FontStyles Enum
- OCRRotation Enum
- OCRCapabilities Class
- OCRScrapeBase Class
- OCRScrapeFactory Class
- ScrapeControlBase Class
- ScrapeEngineUsages Enum
- ScrapeEngineBase
- ScrapeEngineFactory Class
- ScrapeEngineProvider Class
- OmniPage
- PDF
- [Unlisted] Abbyy
- [Unlisted] Abbyy Embedded
Generative Extractor - good practices
Imagine you are asking four or five different persons the question you would like to ask the generative prompt. If you can imagine these people giving slightly different answers, then your language is too ambiguous and you need to rephrase to make it more precise.
To make your question more specific, ask the extractor to return the answer in a standardized format. This reduces ambiguity, increases response accuracy, and simplifies downstream processing.
return date in yyyy-mm-dd
format
. If you just need the year, specify: return the year, as a four
digit number
.
return numbers which appear in parentheses as negative
or
return number in ##,###.## format
to standardize the decimal separator and
thousands separator for easier downstream processing.
A special case of formatting is when the answer is one of a known set of possible answers.
What
is the applicant’s marital status? Possible answers: Married, Unmarried, Separated,
Divorced, Widowed, Other.
This not only simplifies downstream processing but also increases response accuracy.
What is the termination date of this contract?
, you should ask
First find termination section of contract, then determine termination date, then
return date in yyyy-mm-dd format.
Execute the following program:
1: Find termination section or clause
2: Find termination date
3: Return termination date in yyyy-mm-dd format
4: Stop
Execute the following program:
1: Find termination section or clause
2: Find termination date
3: Return termination date in yyyy-mm-dd format
4: Stop
Defining what you want in a programming style, potentially even using JSON or XML syntax, forces the Generative model to use its programming skills, which increases accuracy when following instructions.
Do not ask the extractor to perform sums, multiplication, subtraction, comparisons, or any other arithmetic operation, because it makes basic mistakes, besides being very slow and expensive compared to a simple robot workflow, which will never make a mistake, and is much faster and cheaper.
Do not ask it to perform complex if-then-else type logic, for the same reason as above. The robot workflow is much more accurate and efficient with this kind of operations.
Extracting data from tables is a challenge for the Generative Extractor. The Generative AI technology operates on linear strings of text and does not understand visual two-dimensional information in images. It cannot extract table fields as defined in the Taxonomy Manager, but it can extract text and tables from documents.
- Ask the Generative Extractor to return
columns separately, and then assemble the rows yourself in a workflow. You might ask:
Please return the Unit Prices on this invoice, as a list from top to bottom, as a list in the format [<UnitPrice1>, <UnitPrice2>,…]
- Ask it to return each row separately,
as a JSON object. You might ask:
Please return the line items of this invoice as an JSON array of JSON objects, each object in format: {"description”: <description>, “quantity”:<quantity>, “unit_price”:<unit price>, “amount”:<amount>}
.
Generative AI models do not provide confidence levels for the predictions. However the goal is to detect errors, and confidence levels is just one possible way to achieve that goal – and not the best one. A much better and more reliable way to detect errors is to ask the same question in multiple different ways. The more different the question statement, the better. If all answers converge towards a common result, then the likelihood of an error is very low. If the answers disagree, then likelihood of error is high.
For the best results, we recommend asking the same question 5 times, combining the above recommendations in different ways. If all 5 responses are identical, then human review may not be needed. If one answer is different, then there may still be a high probability that the other 4 answers are correct. However, if 2 or more answers are different, then human manual review in Action Center is required.