About pipelines
A Pipeline is a description of a machine learning workflow, including all of the functions in the workflow and the order in which they are executed. It also defines the inputs required to run the pipeline and the outputs it produces.
A Pipeline Run is an execution of a pipeline based on code provided by the user. Once completed, a pipeline run has associated outputs and logs.
There are three types of pipelines:
- Training pipelines - take a package and a dataset as input and produce a new package version.
- Evaluation pipelines - take a package version and a dataset as input and produce a set of metrics and logs.
- Full pipelines - run a processing function, then a training pipeline, and immediately afterwards an evaluation pipeline.
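The three pipeline types can be pictured as ordered lists of steps. The sketch below is only an illustration of that idea, not AI Center's implementation: the step names (`process_data`, `train`, `evaluate`) and the `steps_for` helper are hypothetical.

```python
from enum import Enum


class PipelineType(Enum):
    TRAINING = "training"
    EVALUATION = "evaluation"
    FULL = "full"


# Hypothetical step names. The only ordering taken from the text is that a
# full pipeline runs a processing function, then training, then evaluation.
STEPS = {
    PipelineType.TRAINING: ["train"],
    PipelineType.EVALUATION: ["evaluate"],
    PipelineType.FULL: ["process_data", "train", "evaluate"],
}


def steps_for(pipeline_type):
    """Return a copy of the ordered steps for the given pipeline type."""
    return list(STEPS[pipeline_type])
```

For example, `steps_for(PipelineType.FULL)` returns the processing step first, followed by training and evaluation.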
Tip:
The examples used to explain these concepts are based on a sample package, tutorialpackage.zip, which you can download from this page. If this is your first time learning about pipelines, we recommend that you upload this sample package. Make sure you enable it for training.
The Pipelines page, accessible from the Pipelines menu after selecting a project, enables you to view all the pipelines within that project, along with information about their type, associated package and package version, status, creation time, duration, and score. Here you can create new pipelines, access existing pipelines' details, or remove pipelines.
A pipeline run can be in one of the following statuses:
- Scheduled – A pipeline that has been scheduled to start in the future (for example, at 1 AM every Monday). When the date and time set for the pipeline to start is reached, the pipeline is queued to run.
- Packaging – A pipeline that has started building the Docker image on which the job itself will be executed. If this is the first time you are training this specific version of the ML Package, this can take up to 20 minutes.
- Waiting for resources – A pipeline that is waiting for an available license in order to execute. The pipeline checks every 5 minutes whether a license has become available (for example, because you removed a running ML Skill or another pipeline completed) and starts as soon as one is.
- Running – A pipeline that has started and is executing.
- Failed – A pipeline that failed during execution.
Note: Pipelines can fail if the dataset size exceeds the 50 GB limit.
- Killed – A pipeline that was executing until the user explicitly called for its termination.
- Successful – A pipeline that completed execution.
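If you track pipeline runs in your own tooling, the statuses above map naturally onto an enumeration. This is a minimal sketch, not part of any AI Center SDK: the `RunStatus` and `is_finished` names are hypothetical, and treating Failed, Killed, and Successful as final statuses is an assumption based on their descriptions.

```python
from enum import Enum


class RunStatus(Enum):
    # Values mirror the status labels listed above.
    SCHEDULED = "Scheduled"
    PACKAGING = "Packaging"
    WAITING_FOR_RESOURCES = "Waiting for resources"
    RUNNING = "Running"
    FAILED = "Failed"
    KILLED = "Killed"
    SUCCESSFUL = "Successful"


# Assumption: a run in one of these statuses does not change status again.
FINAL_STATUSES = frozenset(
    {RunStatus.FAILED, RunStatus.KILLED, RunStatus.SUCCESSFUL}
)


def is_finished(status):
    """Return True if the run is no longer scheduled, queued, or executing."""
    return status in FINAL_STATUSES
```

Looking up a status by its label, for example `RunStatus("Waiting for resources")`, returns the matching member, which is convenient when parsing status text copied from the Pipelines page.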
Note: Pipelines are automatically killed after seven days so that they do not remain stuck for long periods while consuming licenses.
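The two time-based rules on this page (the 5-minute license check and the seven-day automatic kill) can be expressed as simple date arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the seven days are counted from the moment the run started and that the limit is inclusive, neither of which is stated here.

```python
from datetime import datetime, timedelta

# Values taken from the text above.
LICENSE_CHECK_INTERVAL = timedelta(minutes=5)
MAX_RUN_AGE = timedelta(days=7)


def next_license_check(last_check):
    """Return when a waiting pipeline would next look for a free license."""
    return last_check + LICENSE_CHECK_INTERVAL


def should_auto_kill(started_at, now):
    """Return True once a run has reached the seven-day limit.

    Assumption: the clock starts at `started_at`; the real service may
    measure from a different point, such as creation time.
    """
    return now - started_at >= MAX_RUN_AGE
```

For instance, a run started on 1 January at midnight would be flagged by `should_auto_kill` from 8 January at midnight onwards under these assumptions.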