# Evaluation pipelines

An Evaluation Pipeline is used to evaluate a trained machine learning model. To use this pipeline, the package must contain code to evaluate a model: the `evaluate()` function in the **train.py** file. This code, together with a dataset or a sub-folder within a dataset, produces a score (the return value of the `evaluate()` function) and any additional outputs you want to persist alongside the score.

## Creating evaluation pipelines

Create a new evaluation pipeline as described in [Managing pipelines](https://docs.uipath.com/ai-center/automation-suite/2.2510/user-guide/managing-pipelines#managing-pipelines). Make sure to provide the following information, which is specific to evaluation pipelines:

* In the **Pipeline type** field, select **Evaluation run**.
* In the **Choose evaluation dataset** field, select the dataset or folder from which you want to import data for evaluation. All files in this dataset or folder are available locally while the pipeline runs, and their location is passed as the argument to your `evaluate()` function.
* In the **Enter parameters** section, enter the environment variables defined and used by your pipeline, if any. The environment variables are:
  + `artifacts_directory`, with default value **artifacts**: This defines the path to a directory that is persisted as ancillary data related to this pipeline. Most users never need to override this value through the UI. Anything can be saved during pipeline execution, including images, PDFs, and subfolders. Concretely, any data your code writes to the directory specified by the path `os.environ['artifacts_directory']` is uploaded at the end of the pipeline run and is viewable from the **Pipeline details** page.
  + `save_test_data`, with default value **false**: If set to **true**, the `data_directory` folder is uploaded at the end of the pipeline run as an output of the pipeline, under the directory `data_directory`.
    :::note
    The pipeline execution might take some time. Check back to it after a while to see its status.
    :::

After the pipeline has run, its status on the **Pipelines** page changes to **Successful**. The **Pipeline Details** page displays the files and folders produced by the pipeline run. In our example, the run created a file called `my-evaluate-artifact.txt`.
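As an illustration, an `evaluate()` function that produces an artifact such as `my-evaluate-artifact.txt` might look roughly like the sketch below. It is not the code of any actual package: the metric (the share of non-empty files) is a placeholder assumption, and only the `Main` class name, the `evaluate()` method, the `artifacts_directory` variable, and its default value come from this page.

```python
import os


class Main:
    """Hypothetical minimal model class, as it might appear in train.py."""

    def evaluate(self, evaluation_directory):
        # Resolve the artifacts path; "artifacts" is the documented default.
        artifacts_dir = os.environ.get("artifacts_directory", "artifacts")
        os.makedirs(artifacts_dir, exist_ok=True)

        # Placeholder metric (an assumption): the share of non-empty files.
        # A real package would score its model's predictions instead.
        paths = [
            os.path.join(evaluation_directory, name)
            for name in sorted(os.listdir(evaluation_directory))
        ]
        files = [p for p in paths if os.path.isfile(p)]
        non_empty = [p for p in files if os.path.getsize(p) > 0]
        score = len(non_empty) / len(files) if files else 0.0

        # Anything written to this directory is uploaded when the run ends.
        artifact_path = os.path.join(artifacts_dir, "my-evaluate-artifact.txt")
        with open(artifact_path, "w") as handle:
            handle.write(f"files evaluated: {len(files)}\nscore: {score}\n")

        # The return value is the score reported for the pipeline run.
        return score
```

Reading the path with `os.environ.get` and a fallback keeps the same code usable outside the pipeline, for example when testing the package locally.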

## Conceptual analogy for building your own evaluation pipeline

The following example is conceptually analogous to running an evaluation pipeline on a package, for example version 1.1, which is the output of a training pipeline run on version 1.0.

:::note
This is a simplified example. Its purpose is to illustrate how datasets and packages interact in an evaluation pipeline. The steps are merely conceptual and do not represent how the platform works.
:::

1. Copy package version 1.1 into `~/mlpackage`.
2. Copy the **evaluation dataset** or the **dataset subfolder** selected from the UI to `~/mlpackage/evaluation_data`.
3. Execute the following Python code:
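   ```
   # Run from ~/mlpackage so that train.py and ./evaluation_data resolve.
   from train import Main

   m = Main()  # instantiate the package's model class
   score = m.evaluate('./evaluation_data')  # path to the copied evaluation data
   ```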

   The returned score is shown in the pipelines grid and written to the `_results.json` file.

4. Persist the artifacts, if any were written, and snapshot the data if `save_test_data` is set to **true**.
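Steps 3 and 4 above can be sketched in Python as follows. Like the steps themselves, this is only a conceptual illustration and not how the platform is implemented. `StubModel`, `run_evaluation`, the `notes.txt` file, and the `artifacts` and `data` output folder names are hypothetical. Only the `artifacts_directory` and `save_test_data` variables come from this page.

```python
import os
import shutil


class StubModel:
    """Hypothetical stand-in for the Main class from train.py."""

    def evaluate(self, evaluation_directory):
        artifacts_dir = os.environ.get("artifacts_directory", "artifacts")
        os.makedirs(artifacts_dir, exist_ok=True)
        with open(os.path.join(artifacts_dir, "notes.txt"), "w") as handle:
            handle.write("evaluated " + evaluation_directory + "\n")
        return 0.5  # fixed placeholder score


def run_evaluation(model, evaluation_directory, output_directory):
    """Steps 3 and 4: score the model, then collect the outputs."""
    outputs = {"score": model.evaluate(evaluation_directory)}

    # Persist artifacts only if the evaluation code wrote any.
    artifacts_dir = os.environ.get("artifacts_directory", "artifacts")
    if os.path.isdir(artifacts_dir) and os.listdir(artifacts_dir):
        target = os.path.join(output_directory, "artifacts")
        shutil.copytree(artifacts_dir, target)
        outputs["artifacts_data"] = target

    # Snapshot the evaluation data only when save_test_data is "true".
    if os.environ.get("save_test_data", "false").lower() == "true":
        target = os.path.join(output_directory, "data")
        shutil.copytree(evaluation_directory, target)
        outputs["evaluation_data"] = target

    return outputs
```

Calling `run_evaluation(StubModel(), "./evaluation_data", "./outputs")` returns a dictionary that always holds the score, and holds the artifacts and data snapshot locations only when those outputs were produced. The `./outputs` folder must not already contain `artifacts` or `data` subfolders, because `shutil.copytree` expects a new destination.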

## Pipeline outputs

The `_results.json` file contains a summary of the pipeline run execution, exposing all inputs/outputs and execution times for an evaluation pipeline.

```
{
    "parameters": {
        "pipeline": "<Pipeline_name>",
        "inputs": {
            "package": "<Package_name>",
            "version": "<version_number>",
            "evaluation_data": "<storage_directory>",
            "gpu": "True/False"
        },
        "env": {
            "key": "value",
            ...
        }
    },
    "run_summary": {
        "execution_time": <time>, #in seconds
        "start_at": <timestamp>, #in seconds
        "end_at": <timestamp>, #in seconds
        "outputs": {
            "score": <score>, #float
            "train_data": "<test_storage_directory>",
            "evaluation_data": "<test_storage_directory>/None",
            "artifacts_data": "<artifacts_storage_directory>"
        }
    }
}
```
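If you download this file, a short script can pull out the headline numbers. The sketch below assumes that the downloaded file is valid JSON with the key layout shown in the template above. The `summarize_results` helper and the name of its `execution_time_seconds` output key are hypothetical.

```python
import json


def summarize_results(path):
    """Return the headline fields of a results file as a flat dict."""
    with open(path, encoding="utf-8") as handle:
        results = json.load(handle)

    inputs = results["parameters"]["inputs"]
    summary = results["run_summary"]
    return {
        "package": inputs["package"],
        "version": inputs["version"],
        "score": summary["outputs"]["score"],
        "execution_time_seconds": summary["execution_time"],
    }
```

For example, `summarize_results("_results.json")` returns a dictionary that is convenient for comparing scores across several pipeline runs.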

The **Artifacts** folder, visible only if it is not empty, groups all the artifacts generated by the pipeline and saved under the `artifacts_directory` folder.

The **Dataset** folder, present only if `save_test_data` was set to **true** (the default is **false**), is a copy of the evaluation dataset folder.

## Model governance

As in [training pipelines](https://docs.uipath.com/ai-center/automation-suite/2.2510/user-guide/training-pipelines#training-pipelines), you can set the parameter `save_test_data` to `true` to snapshot the data passed in for evaluation.