Document Understanding User Guide
Create and configure fields
Fields can be renamed: click the Edit field button and edit the name of the field at the top of the window.
If you later decide not to use certain fields for training an ML model, you can either delete them or hide them using the Hidden checkbox in the Edit field window.
A line item Description or Unit Price on an invoice document would be examples of Column fields.
- Click in the table section at the top of the page to add a new Column field. The Create Column Field window is displayed.
- Fill in a unique name for the field in the Enter unique field name field. The field does not accept uppercase letters. It can only contain lowercase letters, numbers, underscore _ and dash -.
- Click OK. The Edit Field window is displayed with the General tab open.
- From the Content Type drop-down, select the content type.
- Click the Hotkey field and press a key on your keyboard to automatically populate it.
- Select the Split items checkbox if you want this field to be used as a delimiter between line items or rows in a table. Any line on which this field appears is considered to be a new line item or row in the table. Most commonly, this is used on Line Amount fields on Invoice line items.
- Select the Hidden checkbox if you do not want this field to be part of exported datasets.
- Click on the Advanced tab.
- From the Scoring drop-down, select the measure used to determine accuracy when running evaluations of model predictions.
- In the Color field, fill in the hex code of the desired field color.
- Click Save to save your settings.
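The naming rule in the steps above (lowercase letters, numbers, underscore and dash only, and unique) can be checked before you type a name into the window. The following is a minimal sketch, not part of the product: the function name `is_valid_field_name` and the idea of passing in the set of existing names are assumptions made for illustration.

```python
import re

# Assumed pattern derived from the rule above: one or more lowercase
# letters, digits, underscores, or dashes. Nothing else is allowed.
FIELD_NAME_PATTERN = re.compile(r"[a-z0-9_-]+")


def is_valid_field_name(name: str, existing_names: set) -> bool:
    """Return True if the name uses only allowed characters and is not taken."""
    return bool(FIELD_NAME_PATTERN.fullmatch(name)) and name not in existing_names


existing = {"invoice-no", "total"}
print(is_valid_field_name("unit_price", existing))  # True
print(is_valid_field_name("Unit_Price", existing))  # False (uppercase letters)
print(is_valid_field_name("total", existing))       # False (name already used)
```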
Click the Edit field button. The available options for column fields can be found in the table below.
Option | Tab | Description |
---|---|---|
Field name | n/a | The unique name for the field. The field does not accept uppercase letters. It can only contain lowercase letters, numbers, underscore _ and dash -. |
Content type | General | The content type of a field: |
Shortcut | General | The shortcut key for the field. One or two keys allowed. |
Split items | General | Select this checkbox if you want this field to be used as a delimiter between line items or rows in a table. Any line on which this field appears is considered to be a new line item or row in the table. Most commonly, this is used on Line Amount fields on Invoice line items. |
Hidden | General | Select this checkbox if you do not want this field to be part of exported datasets. |
Color | Advanced | The color for the field in hex format. If the value is not valid, a new one is generated. |
Scoring | Advanced | The measure used to determine accuracy when running evaluations of model predictions. It can only be configured for string content type. All other content types use an Exact Match scoring strategy. Options: |
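To illustrate how the Split items option delimits rows, here is a rough sketch of the grouping rule described above: every text line that contains the split field starts a new line item. The input shape (one dictionary of field values per text line), the function name `group_line_items`, and the handling of lines that precede the first split field are all assumptions for illustration, not the product's actual implementation.

```python
def group_line_items(lines, split_field):
    """Group per-line field values into line items.

    lines: list of dicts, one per text line, mapping field name -> value.
    A line containing split_field starts a new item (assumed rule).
    """
    items = []
    for line in lines:
        # Start a new item on every split-field line. Also start one for
        # leading lines that appear before any split field (assumption).
        if split_field in line or not items:
            items.append({})
        current = items[-1]
        for name, value in line.items():
            # Values of the same field on later lines are appended.
            current[name] = f"{current[name]} {value}" if name in current else value
    return items


lines = [
    {"description": "Widget", "line_amount": "10.00"},
    {"description": "(blue)"},
    {"description": "Gadget", "line_amount": "25.00"},
]
print(group_line_items(lines, "line_amount"))
# [{'description': 'Widget (blue)', 'line_amount': '10.00'},
#  {'description': 'Gadget', 'line_amount': '25.00'}]
```

This also shows why Line Amount is a common choice: it typically appears exactly once per row, so it marks row boundaries reliably.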
These are fields which appear only once on a given document. An Invoice Number or Total Amount on an invoice document would be examples of Regular fields.
- Click on the right pane in the Regular Fields section. The Create Regular Field window is displayed.
- Fill in a unique name for the field in the Enter unique field name field. The field does not accept uppercase letters. It can only contain lowercase letters, numbers, underscore _ and dash -.
- Click OK. The Edit Field window is displayed with the General tab open.
- Select the content type from the Content Type drop-down.
- Click the Shortcut field and press a key on your keyboard to automatically populate it.
- Select the Multi line checkbox if the field might span multiple text lines, such as addresses or descriptions. If this option is not selected, only the first line is returned.
- Select the Multi-value checkbox for all the values detected in the document to be displayed as a list. You can select either the Multi line or the Multi-value checkbox.
- Select the Hidden checkbox if you do not want this field to be part of exported datasets.
- Click on the Advanced tab.
- From the Post processing drop-down, select the mechanism to apply when the model predicts more than one instance of a field on a given page.
- From the Scoring drop-down, select the measure used to determine accuracy when running evaluations of model predictions.
- In the Color field, fill in the hex code of the desired field color.
- Click Save to save your settings.
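The effect of the Multi line checkbox described in the steps above can be pictured with a small sketch. The function name `field_value` and the choice to join lines with a single space are assumptions for illustration; the source only states that, without the option, the first line alone is returned.

```python
def field_value(text_lines, multi_line):
    """Return the value of a field that was detected across text_lines."""
    if not text_lines:
        return ""
    if multi_line:
        # Assumed separator: the lines are joined with a single space.
        return " ".join(text_lines)
    # Without Multi line, only the first line is returned.
    return text_lines[0]


address = ["12 Main St", "Springfield"]
print(field_value(address, multi_line=False))  # 12 Main St
print(field_value(address, multi_line=True))   # 12 Main St Springfield
```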
Click the Edit field button. The available options for regular fields can be found in the table below.
Option | Tab | Description |
---|---|---|
Field name | n/a | The unique name for the field. The field does not accept uppercase letters. It can only contain lowercase letters, numbers, underscore _ and dash -. |
Content type | General | The content type of a field: |
Post processing | Advanced | The post-processing mechanism. If the model predicts more than one instance of a field on a given page, the model returns: |
Shortcut | General | The shortcut key for the field. One or two keys allowed. |
Multi line | General | Select this checkbox for fields which may span multiple text lines (addresses or descriptions); otherwise, only the first line is returned. |
Multi value | General | Select this checkbox for all the values detected in the document to be displayed as a list. You can select either the Multi line or the Multi-value checkbox. |
Hidden | General | Select this checkbox if you do not want this field to be part of exported datasets. |
Scoring | Advanced | The measure used to determine accuracy when running evaluations of model predictions. It can only be configured for string content type. All other content types use an Exact Match scoring strategy. Options: |
Color | Advanced | The color for the field in hex format. If the value is not valid, a new one is generated. |
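The Color behavior in the table (an invalid value is replaced by a newly generated one) can be sketched as follows. This is an illustration only: the accepted format `#RRGGBB`, the function name `resolve_color`, and the use of a random color as the replacement are assumptions, since the source does not say how the new value is produced.

```python
import random
import re

# Assumed format: a leading '#' followed by exactly six hex digits.
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def resolve_color(value, rng=random):
    """Return the value if it is a valid hex color, else a generated one."""
    candidate = value.strip()
    if HEX_COLOR.fullmatch(candidate):
        return candidate.lower()
    # Invalid input: generate a replacement (assumed to be random).
    return "#{:06x}".format(rng.randrange(0x1000000))


print(resolve_color("#FF8800"))      # #ff8800
print(resolve_color("not-a-color"))  # a generated color; differs between runs
```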
These are data points which refer to a document as a whole. For instance, the Expense Type of a receipt (Food, Hotel, Airline, Transportation) or the Currency of an invoice (USD, EUR, JPY) would be examples of Classification fields.
- Click on the right pane in the Classification Fields section. The Create a new classification field window is displayed.
- Fill in a unique name for the field in the Enter unique field name field. The field does not accept uppercase letters. It can only contain lowercase letters, numbers, underscore _ and dash -.
- Click OK. The Edit Field window is displayed.
- In the text area, enter the class names as a comma-separated list.
- Click Save to save your settings.
:
(option 1 : description 1).
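If you keep class names in a script or spreadsheet before pasting them into the text area, a comma-separated list like the one described above can be split into clean names along these lines. The function name `parse_classes`, the trimming of whitespace, and the removal of empty and duplicate entries are assumptions for illustration; the source does not describe how the product itself treats such input.

```python
def parse_classes(text):
    """Split a comma-separated list of class names into a clean list."""
    classes = []
    for raw in text.split(","):
        name = raw.strip()
        # Skip empty entries and repeated names (assumed clean-up rules).
        if name and name not in classes:
            classes.append(name)
    return classes


print(parse_classes("Food, Hotel, Airline, Transportation"))
# ['Food', 'Hotel', 'Airline', 'Transportation']
print(",".join(parse_classes("USD, EUR, , JPY, USD")))
# USD,EUR,JPY
```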