Document Understanding - About Pipelines

2021.10

Document Understanding User Guide

About Pipelines

Document Understanding ML Packages can run all three types of pipelines:

Training Pipelines
Evaluation Pipelines
Full Pipelines

Once completed, a pipeline run has associated outputs and logs. To see this information, open the Pipelines tab in the left sidebar and click a pipeline. This opens the Pipeline view, which consists of:

the Pipeline details such as type, ML Package name and version, dataset, GPU usage, parameters, and execution time
the Outputs pane; this always includes a _results.json file containing a summary of the Pipeline details
the Logs page; the logs can also be obtained in the ML Logs tab from the left sidebar
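
Because the Outputs pane always includes a _results.json file, the summary can be inspected with a short script once the file has been downloaded. The following Python is a minimal sketch, not an official tool. The schema of the file is not described on this page, so the code assumes only that the file holds a JSON object and makes no assumption about key names. The function names and the local path are hypothetical.

```python
import json
from pathlib import Path


def load_pipeline_summary(path):
    """Load a downloaded _results.json file and return its contents as a dict."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        # Assumption: the summary is a JSON object at the top level.
        raise ValueError("expected a JSON object at the top level")
    return data


def format_summary(summary):
    """Return one 'key: value' line per top-level entry, sorted by key."""
    return [f"{key}: {summary[key]}" for key in sorted(summary)]


if __name__ == "__main__":
    # Hypothetical local path to a file downloaded from the Outputs pane.
    results_path = Path("_results.json")
    if results_path.exists():
        for line in format_summary(load_pipeline_summary(results_path)):
            print(line)
```

Printing every top-level key, rather than picking specific fields, keeps the sketch valid whatever fields a given pipeline run actually writes.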

Training pipelines or Full pipelines can also be used to:

Fine-tune ML models with data from Validation Station
Auto-Fine-tune an ML model

Terms and Definitions

Training: Training a model from scratch, i.e. using the DocumentUnderstanding ML Package in AI Center.

Retraining: Training using a pre-trained base model, i.e. using one of the other document extraction ML packages in AI Center such as Invoices, Receipts, Purchase Orders, etc.

Auto-retraining: The name of an environment variable that can be set when creating a Pipeline in AI Center. When set, the pipeline automatically uses the most recent exported dataset for training. This variable applies whether or not that dataset includes data from Validation Station.

Fine-tuning: Training or retraining a model using a dataset which includes data coming from Validation Station.

Auto-Fine-tuning: Using the Auto-retraining environment variable to automatically train a model on data fed in from Validation Station through the Scheduled Export feature of Data Manager.
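
The definitions above overlap, so a single run can match several terms at once. The following Python is a minimal sketch of how they combine. It is purely illustrative and not part of AI Center: the function name and its three boolean parameters are hypothetical, and the Scheduled Export step is simplified into the "includes Validation Station data" flag.

```python
def describe_run(uses_pretrained_base, includes_validation_data, auto_retraining):
    """Return the terms from this page that apply to a pipeline run.

    All three parameters are hypothetical flags used only for illustration.
    """
    # Training starts from scratch; Retraining starts from a pre-trained base model.
    terms = ["Retraining" if uses_pretrained_base else "Training"]

    # Fine-tuning: the dataset includes data coming from Validation Station.
    if includes_validation_data:
        terms.append("Fine-tuning")

    # Auto-retraining: the pipeline picks up the most recent exported dataset,
    # whether or not it contains Validation Station data.
    if auto_retraining:
        terms.append("Auto-retraining")

    # Auto-Fine-tuning: Auto-retraining combined with Validation Station data.
    if auto_retraining and includes_validation_data:
        terms.append("Auto-Fine-tuning")

    return terms


if __name__ == "__main__":
    print(describe_run(uses_pretrained_base=True,
                       includes_validation_data=True,
                       auto_retraining=True))
```

For example, a run on the Invoices package with Auto-retraining set and a dataset that includes Validation Station data counts as Retraining, Fine-tuning, Auto-retraining, and Auto-Fine-tuning at the same time.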
