Document Understanding Activities

Last updated Dec 5, 2024

Generative extractor - Good practices

Note:
  • For improved stability, the number of prompts is limited to a maximum of 50.
  • The response (the extraction result, also called the Completion) is limited to 700 words, so you cannot extract more than 700 words from a single prompt. If your extraction requirements exceed this limit, you can divide the document into multiple pages, process them individually, and then merge the results afterwards.
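The split-and-merge approach from the note can be sketched in Python as follows. This is only an illustration: `extract` is a hypothetical stand-in for whatever call runs the generative prompt on one page, and the truncation check is an assumed heuristic, not documented product behavior.

```python
from typing import Callable, Dict, List

WORD_LIMIT = 700  # completion word limit stated in the note above


def extract_per_page(pages: List[str], extract: Callable[[str], str]) -> Dict[int, str]:
    """Run the extractor on each page separately; keys are 1-based page numbers."""
    return {number: extract(text) for number, text in enumerate(pages, start=1)}


def merge_results(per_page: Dict[int, str]) -> str:
    """Join the non-empty page results in page order."""
    parts = [per_page[n].strip() for n in sorted(per_page) if per_page[n].strip()]
    return "\n".join(parts)


def may_be_truncated(completion: str) -> bool:
    """Assumed heuristic: a completion at or above the limit may have been cut off."""
    return len(completion.split()) >= WORD_LIMIT


# Hypothetical extractor used only for this demonstration: upper-cases the page text.
def fake_extract(page_text: str) -> str:
    return page_text.upper()


pages = ["clause one", "", "clause two"]
merged = merge_results(extract_per_page(pages, fake_extract))
```

Flagging completions that reach the word limit gives the workflow a simple signal that a page may need to be split further.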

Use precise language

Imagine you are asking four or five different people the question you would like to ask the generative prompt. If you can imagine these people giving slightly different answers, then your language is too ambiguous and you need to rephrase it to make it more precise.

Specify an output format

To make your question more specific, ask the extractor to return the answer in a standardized format. This reduces ambiguity, increases response accuracy, and simplifies downstream processing.

For example, if you are asking the generative prompt to get a date, specify how you want the date returned: "return date in yyyy-mm-dd format". If you just need the year, specify: "return the year, as a four digit number".
You can also use this approach for numbers. For example, you may specify "return numbers which appear in parentheses as negative" or "return number in ##,###.## format" to standardize the decimal separator and thousands separator for easier downstream processing.
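To illustrate why a fixed output format simplifies downstream processing, here is a minimal Python sketch that parses answers returned in the formats requested above. The function names are hypothetical, and the parenthesis handling is included only as a fallback for the case where the extractor returns the source notation unchanged.

```python
from datetime import date, datetime
from decimal import Decimal


def parse_iso_date(answer: str) -> date:
    """Parse an answer requested in yyyy-mm-dd format; raises ValueError otherwise."""
    return datetime.strptime(answer.strip(), "%Y-%m-%d").date()


def parse_amount(answer: str) -> Decimal:
    """Parse a number in ##,###.## format; parentheses are treated as negative."""
    text = answer.strip()
    negative = text.startswith("(") and text.endswith(")")
    if negative:
        text = text[1:-1]
    value = Decimal(text.replace(",", ""))  # drop thousands separators
    return -value if negative else value
```

An answer that does not match the requested format raises an exception, which the workflow can use to route the document to manual review.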

Provide expected options

A special case of formatting is when the answer is one of a known set of possible answers.

For example, on an application form you may ask: What is the applicant’s marital status? Possible answers: Married, Unmarried, Separated, Divorced, Widowed, Other.

This not only simplifies downstream processing but also increases response accuracy.
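As a sketch of the downstream side, the following Python snippet checks an answer against the known set of options from the marital status example. The helper name and the normalization rules (trimming whitespace and a trailing period, ignoring case) are assumptions made for illustration.

```python
from typing import Optional, Sequence

MARITAL_STATUS_OPTIONS = (
    "Married", "Unmarried", "Separated", "Divorced", "Widowed", "Other",
)


def match_option(answer: str, options: Sequence[str]) -> Optional[str]:
    """Return the canonical option matching the answer, or None if there is no match."""
    cleaned = answer.strip().rstrip(".").casefold()
    for option in options:
        if option.casefold() == cleaned:
            return option
    return None
```

A `None` result indicates that the extractor answered outside the expected set, which is a useful trigger for human review.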

Step by step

To maximize accuracy, break down complex questions into simple steps. Instead of asking "What is the termination date of this contract?", ask "First find termination section of contract, then determine termination date, then return date in yyyy-mm-dd format."
There are many ways to break this down. You may even write your request as a small computer program, such as the following:
Execute the following program:
1: Find termination section or clause
2: Find termination date
3: Return termination date in yyyy-mm-dd format
4: Stop

Defining what you want in a programming style, potentially even using JSON or XML syntax, forces the Generative model to use its programming skills, which increases accuracy when following instructions.
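If you maintain many step-by-step prompts, you could generate them from a list of steps. The Python helper below is a hypothetical sketch that reproduces the numbered program layout used in the termination date example above; it is not part of the product.

```python
from typing import Sequence


def build_program_prompt(steps: Sequence[str]) -> str:
    """Render steps as a numbered program that ends with an explicit Stop step."""
    lines = ["Execute the following program:"]
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}: {step}")
    lines.append(f"{len(steps) + 1}: Stop")
    return "\n".join(lines)


prompt = build_program_prompt([
    "Find termination section or clause",
    "Find termination date",
    "Return termination date in yyyy-mm-dd format",
])
```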

Avoid arithmetic or logic problems

Do not ask the extractor to perform sums, multiplications, subtractions, comparisons, or any other arithmetic operations. It makes basic mistakes on these tasks, and it is slow and expensive compared to a simple robot workflow, which computes them deterministically and is much faster and cheaper.

Do not ask it to perform complex if-then-else logic either, for the same reasons. A robot workflow is much more accurate and efficient at this kind of operation.
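A minimal sketch of this division of labor, written in Python for illustration: the extractor only returns the raw values as text, and the arithmetic and comparison happen in ordinary code. The function names and the tolerance value are assumptions.

```python
from decimal import Decimal
from typing import Iterable


def line_total(quantity: str, unit_price: str) -> Decimal:
    """Multiply extracted values in code instead of asking the extractor to do it."""
    return Decimal(quantity) * Decimal(unit_price)


def totals_match(line_amounts: Iterable[str], stated_total: str,
                 tolerance: Decimal = Decimal("0.01")) -> bool:
    """Compare the sum of extracted line amounts with the extracted invoice total."""
    computed = sum((Decimal(amount) for amount in line_amounts), Decimal("0"))
    return abs(computed - Decimal(stated_total)) <= tolerance
```

Using `Decimal` rather than floating point avoids rounding surprises when working with currency amounts.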

Tables

The Generative Extractor currently does not support column fields. Although you may be able to extract smaller tables through regular questions and parse their output, please note that this is only a workaround and comes with restrictions. It is neither designed nor recommended for extracting generic, arbitrarily large tables.

Extracting data from tables is a challenge for the Generative extractor, because the Generative AI technology operates on linear strings of text and does not understand visual two-dimensional information in images. However, you can still extract data from tables by choosing from at least two different approaches, outlined in the following examples:
  • One approach is to ask the Generative extractor to return columns separately, and then assemble the rows yourself in a workflow. In this case, you might ask: Please return the Unit Prices on this invoice, from top to bottom, as a list in the format [<UnitPrice1>, <UnitPrice2>,…]
  • Another approach is to ask it to return each row separately, as a JSON object. In this case, you might ask: Please return the line items of this invoice as a JSON array of JSON objects, each object in format: {"description": <description>, "quantity": <quantity>, "unit_price": <unit price>, "amount": <amount>}.
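Both approaches need a small amount of post-processing. The Python sketch below shows one possible way to assemble rows from separately extracted columns and to validate a JSON array of line items. The function names are hypothetical, and the sketch assumes the extractor's answers have already been converted to Python lists or a JSON string.

```python
import json
from typing import Dict, List, Sequence


def rows_from_columns(columns: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Zip separately extracted columns into rows; all columns must be equally long."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("columns have different lengths; rows cannot be aligned")
    names = list(columns)
    return [dict(zip(names, row)) for row in zip(*columns.values())]


def rows_from_json(answer: str, required_keys: Sequence[str]) -> List[dict]:
    """Parse a JSON array of row objects and check that each row has the expected keys."""
    rows = json.loads(answer)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array")
    for row in rows:
        if not isinstance(row, dict) or set(required_keys) - set(row):
            raise ValueError(f"malformed row: {row!r}")
    return rows
```

A length mismatch between columns is a strong hint that the extractor skipped or merged a row, so raising an error there is safer than silently misaligning values.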

Confidence level

Generative AI models do not provide confidence levels for their predictions. However, the goal is to detect errors, and confidence levels are just one possible way to achieve that goal, and not the best one. A much better and more reliable way to detect errors is to ask the same question in multiple different ways. The more the question statements differ, the better. If all answers converge towards a common result, then the likelihood of an error is very low. If the answers disagree, then the likelihood of an error is high.

For instance, you may repeat the same question two, three, or even five times (depending on how crucial it is to avoid uncaught errors in your procedure), combining the aforementioned suggestions in varied combinations. If all the responses are consistent, human review may not be necessary. However, if any of the replies differ, manual review by a person in Action Center may be required.
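The agreement check described above can be sketched in a few lines of Python. The function name and the normalization (trimming whitespace and ignoring case) are assumptions; the rule itself follows the text: review is skipped only when every answer agrees.

```python
from typing import List, Optional, Tuple


def consensus(answers: List[str]) -> Tuple[Optional[str], bool]:
    """Return (agreed_value, needs_review) for answers to rephrased versions of a question."""
    normalized = [answer.strip().casefold() for answer in answers]
    if normalized and len(set(normalized)) == 1:
        return normalized[0], False  # all answers agree
    return None, True  # disagreement or no answers: send to manual review
```

Asking for a standardized output format in every variant of the question makes this comparison far more reliable, because equal values then produce identical strings.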

© 2005-2024 UiPath. All rights reserved.