# Full pipelines

> A full pipeline runs a training pipeline and an evaluation pipeline together.

:::important
**Minimal dataset size**

To run a training pipeline successfully, we strongly recommend at least 25 documents and at least 10 samples from each labeled field in your dataset. Otherwise, the pipeline throws the following error: `Dataset Creation Failed`.

**Training on GPU vs CPU**

* For larger datasets, you need to train using GPU. Moreover, using a GPU (AI Robot Pro) for training is at least 10 times faster than using a CPU (AI Robot).
* Training on CPU is only supported for datasets up to 5000 pages in size for ML Packages v21.10.x and up to 1000 pages for other versions of ML Packages.
* CPU training was limited to 500 pages before 2021.10. The limit went up to 5000 pages for 2021.10, and with 2022.4 it came back down to 1000 pages.
:::
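
The recommendations above can be turned into a quick pre-flight check before you create a pipeline run. The following is a minimal sketch, not part of the product: the function names, the version string format (for example `21.10.3`), and the shape of the inputs are all assumptions made for illustration. Only the thresholds (25 documents, 10 samples per field, 5000 or 1000 pages on CPU) come from the note above.

```python
from typing import Dict, List, Tuple

# Thresholds taken from the note above.
MIN_DOCUMENTS = 25
MIN_SAMPLES_PER_FIELD = 10


def cpu_page_limit(ml_package_version: str) -> int:
    """Return the CPU training page limit for an ML Package version.

    Assumption: versions are written like "21.10.3" or "v21.10.3".
    v21.10.x allows up to 5000 pages; other versions are treated as 1000.
    """
    parts = ml_package_version.lstrip("v").split(".")
    return 5000 if parts[:2] == ["21", "10"] else 1000


def check_dataset(
    num_documents: int,
    samples_per_field: Dict[str, int],
    num_pages: int,
    ml_package_version: str,
) -> Tuple[List[str], bool]:
    """Return (warnings, needs_gpu) for a hypothetical dataset summary."""
    warnings = []
    if num_documents < MIN_DOCUMENTS:
        warnings.append(
            f"only {num_documents} documents, at least {MIN_DOCUMENTS} recommended"
        )
    for field, count in sorted(samples_per_field.items()):
        if count < MIN_SAMPLES_PER_FIELD:
            warnings.append(
                f"field '{field}' has {count} samples, "
                f"at least {MIN_SAMPLES_PER_FIELD} recommended"
            )
    # Datasets above the CPU page limit need to be trained on GPU.
    needs_gpu = num_pages > cpu_page_limit(ml_package_version)
    return warnings, needs_gpu


# Example: 30 documents, one under-sampled field, 1500 pages.
warnings, needs_gpu = check_dataset(30, {"total": 12, "date": 4}, 1500, "22.4.0")
print(warnings)
print(needs_gpu)
```

A dataset that produces warnings here is at risk of failing with `Dataset Creation Failed`, and one that reports `needs_gpu` should be run with **Enable GPU** turned on.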

## Train and evaluate a model at the same time

Configure the full pipeline as follows:

* In the **Pipeline type** field, select **Full Pipeline run**.
* In the **Choose package** field, select the package you want to train and evaluate.
* In the **Choose package major version** field, select a major version for your package.
* In the **Choose package minor version** field, select a minor version for your package. It is strongly recommended to always use minor version 0 (zero).
* In the **Choose input dataset** field, select a representative training dataset.
* In the **Choose evaluation dataset** field, select a representative evaluation dataset.
* In the **Enter parameters** section, enter any environment variables defined and used by your pipeline. For most use cases, no parameter needs to be specified; the model uses advanced techniques to find a performant configuration. However, you can use environment variables such as:
  * `model.epochs`, which customizes the number of epochs for the training pipeline (the default value is 100).
* Select whether to train the pipeline on GPU or on CPU. The **Enable GPU** slider is disabled by default, in which case the pipeline is trained on CPU. Using a GPU (AI Robot Pro) for training is at least 10 times faster than using a CPU (AI Robot). Moreover, training on CPU is supported for datasets up to 1000 images in size only. For larger datasets, you need to train using GPU.
* Select when the pipeline should run: **Run now**, **Time based**, or **Recurring**. If you are using the `auto_retraining` variable, select **Recurring**.

  ![Screenshot of the Create new pipeline run interface.](https://dev-assets.cms.uipath.com/assets/images/document-understanding/document-understanding-screenshot-of-the-create-new-pipeline-run-interface-116434-2b72d75d-e7eb9e07.webp)

After you configure all the fields, select **Create**. The pipeline is created.
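
If you keep pipeline settings under version control, it can help to build the **Enter parameters** values in one place before typing them into the form. The sketch below is illustrative only: `build_parameters` is a hypothetical helper, and representing the parameters as a dictionary of string values is an assumption. The only parameter name and default taken from this page are `model.epochs` and 100.

```python
def build_parameters(epochs: int = 100) -> dict:
    """Build the environment variables for a pipeline run.

    Hypothetical helper: only `model.epochs` (default 100) is documented here.
    """
    if epochs < 1:
        raise ValueError("model.epochs must be a positive integer")
    # Environment variable values are passed as strings.
    return {"model.epochs": str(epochs)}


# Example: override the default number of epochs.
params = build_parameters(epochs=150)
for name, value in params.items():
    print(f"{name} = {value}")
```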

## Artifacts

For an Evaluation Pipeline, the **Outputs** pane also includes an **artifacts** / **eval_metrics** folder which contains two files:

![Screenshot of the Outputs artifacts interface.](https://dev-assets.cms.uipath.com/assets/images/document-understanding/document-understanding-screenshot-of-the-output-artifacts-interface-119385-3c9ec68c-7dba13f6.webp)

* `evaluation_default.xlsx` is an Excel spreadsheet with a side-by-side comparison of ground truth versus predicted value for each field predicted by the model, as well as a per-document accuracy metric, in order of increasing accuracy. Hence, the most inaccurate documents are presented at the top to facilitate diagnosis and troubleshooting.
* `evaluation_metrics_default.txt` contains the F1 scores of the fields which were predicted.

  For line items, a global score is obtained for all columns taken together.
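
To help interpret the values in `evaluation_metrics_default.txt`, the sketch below computes an F1 score using its standard definition, the harmonic mean of precision and recall. This page does not state how the pipeline counts correct and incorrect predictions per field, so the counts passed in here are assumptions for illustration, and the function is not the pipeline's own implementation.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Standard F1 score: harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        # No predictions or no ground-truth values: treat the score as 0.
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Example: a field predicted correctly 8 times, with 2 wrong and 2 missed values.
print(f1_score(8, 2, 2))
```

A score close to 1 means the field is both rarely missed and rarely predicted incorrectly; a low score points to fields worth reviewing in `evaluation_default.xlsx`.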
