
Loading data using CData Sync

📘

Note:

The information on this page is based on CData Sync 2022 - 22.0.8342.0. If you use another version of CData Sync, field names or functions may differ.

📘

Note:

With CData Sync, the data is loaded into a blob store. Therefore, when loading data into Process Mining Cloud, UiPath Automation Cloud cannot check the IP address. This means that if IP Restriction is set up for your tenant, it is not enforced when data is uploaded to Process Mining Cloud using CData Sync from a machine that is not in a trusted IP range.

Introduction

CData Sync is a tool used to extract data from source systems into Process Mining. The supported source systems are listed on the Sources page of the CData Sync website. Refer to the official CData Sync documentation for more information on CData Sync.

This page describes how to use CData Sync to load data from your source system into a process app in Process Mining (Cloud). This requires two jobs in CData Sync, as described below.

  • Extraction job: used to extract the data from a source system into the Azure Blob store of a process app.
  • Marker file job: used to let UiPath Process Mining know that the extraction is finished.

To create these jobs, you need three connections in CData Sync, as described below.

  • Source connection: connection to the source system to load data from.
  • Destination connection: connection to the Azure Blob store that belongs to the process app to load data into.
  • Marker file connection: connection to load the marker file. Note that the marker file connection can be reused for other process apps as well.

See the illustration below for an overview.


Prerequisites

It is assumed that you have:

CData AzureBlobDestination Provider Version

🚧

Note:

The CData AzureBlobDestination Provider version must be 22.0.8348.0 or newer. The version number is displayed when you select the AzureBlob destination in CData Sync; see 3: Creating an AzureBlob destination connection.

If you have an older version of the CData AzureBlobDestination Provider, you can update to the required version. Follow these steps.

  1. Download the .zip file containing the AzureBlobDestination update via this link: https://cdatabuilds.s3.amazonaws.com/support/SAPHA_8348_free.zip
  2. Go to the official CData Sync documentation and follow the instructions as described in the section Install a Connector using the local file system.

Loading data using CData Sync

Setting up data load using CData Sync requires several steps to be performed.

  1. Set up the marker file source connection
  2. Set up a source connection
  3. Create an AzureBlob destination connection
  4. Create the extraction job
  5. Create the marker file destination job
  6. Edit the post-job event to link the jobs
  7. Run the CData Sync extraction job

The steps are described in detail below.

1: Setting up the marker file source connection

📘

Note:

If you have already set up a marker file source connection for loading data for another process app, you can reuse that connection and continue with 2: Setting up the source connection.

Follow these steps to set up the marker file source connection.

  1. Download the static marker file marker.csv.
  2. Create a new folder on a file location accessible to the CData Sync server (for example, C:\marker) and save the file in this folder. Make sure that the marker file is the only file in the folder (see the check sketched after this list).
  3. Click on Connections in the menu bar of the CData Sync Admin console and go to the Sources tab of the Add Connection panel.
  4. Select CSV as the source for the connection.
  5. Enter a descriptive name for the source connection in the Name field. For example, Marker_source.
  6. In the URI field, enter the absolute path to the directory where the marker.csv file is saved, for example, C:/marker.
  7. Go to the Advanced tab.
     • Locate the Other section and make sure Insert Mode is set to SingleFile.
  8. Click on Create & Test.
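As a quick sanity check before creating the connection, the snippet below verifies that the marker folder contains only marker.csv. This is a minimal illustrative sketch and not part of CData Sync; the folder path C:\marker is the example location used above.

```python
from pathlib import Path

# Example folder from step 2; adjust to your own location.
marker_dir = Path(r"C:\marker")

# Collect the names of all regular files in the folder.
files = sorted(p.name for p in marker_dir.iterdir() if p.is_file())

if files == ["marker.csv"]:
    print("OK: marker.csv is the only file in", marker_dir)
else:
    print("Unexpected folder contents:", files)
```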

2: Setting up the source connection

📘

Note:

Refer to the Configuring CData Sync page of your app template for the specific settings needed to set up the source connection.

Follow these steps to set up the source connection.

  1. Click on Connections in the menu bar of the CData Sync Admin console and go to the Sources tab of the Add Connection panel.
  2. Select the source system to which you want to create a connection from the list.
     Note: If your source system is not in the list, you can click on + Add More to display a list of all available CData Sync source connectors. Select the connector for your source system and click on Download & Install.
  3. Enter a descriptive name for the source connection in the Name field.
  4. Enter the required properties to set up a connection with your source system.
  5. Test the connection and save the connection.

Setting up a source connection for .csv or .tsv files

If you want to set up a source connection to load data from .csv or .tsv files, make sure to:

  • Select CSV as the source system to which you want to create a connection from the list.
  • Set the URI to the path where the .csv or .tsv files are stored.
    The CSV source connection can use either a local file path or an online document storage location, provided the correct credentials are supplied.
  • Set FMT to TabDelimited.

Define the following settings in the Advanced tab in the Connection Settings panel.

  • In the Other section, set Exclude File Extensions to True.
  • In the Other section, add ,TSV to the Include Files setting if you want to upload .tsv files.
  • In the Schema section, set Type Detection Scheme to None.
  • In the Data Formatting section, set Push Empty Values As Null to True.
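Before configuring the connection, it can help to confirm that your input files are in fact tab delimited and start with a header row. The snippet below is a small illustrative check using only Python's standard library; the file name input_data.tsv is a placeholder for one of your own files.

```python
import csv
from pathlib import Path

# Placeholder path; point this at one of your .csv or .tsv files.
sample_file = Path("input_data.tsv")

with sample_file.open(newline="", encoding="utf-8") as f:
    # Let the csv module guess the delimiter from a sample of the file.
    dialect = csv.Sniffer().sniff(f.read(4096), delimiters=",\t;")
    f.seek(0)
    header = next(csv.reader(f, dialect))

print("Detected delimiter:", repr(dialect.delimiter))
print("Header columns:", header)
```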

3: Creating an AzureBlob destination connection

Setup credentials for AzureBlob

To set up an AzureBlob destination connection, you need the setup credentials for the Azure Blob store of your process app. Determine the setup parameters from the upload URL as described below.

Example

The parameters below are retrieved from this example upload URL:
https://pmdoctestcd.blob.core.windows.net/c7045b18-be6f-4534-2fba-0b1f42c1e6d3?sv=2020-06-10&si=sap-c7044b18-be8f-4534-8fba-0b2f42c1e7c7&sr=c&sig=cjknPMhIeHvaKtpMgNFmUVp7KHQirhR1m1WCMkCiZUg%3D

  • azure access signature: everything after the question mark. Example: sv=2020-06-10&si=sap-c7044b18-be8f-4534-8fba-0b2f42c1e7c7&sr=c&sig=cjknPMhIeHvaKtpMgNFmUVp7KHQirhR1m1WCMkCiZUg%3D
  • account: the first part of the URL. Example: pmdoctestcd
  • container: the app ID, or the first GUID in the URL. Example: c7045b18-be6f-4534-2fba-0b1f42c1e6d3
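If you prefer not to split the URL by hand, the snippet below shows one way to derive the three values from it. This is a minimal sketch using only Python's standard library, applied to the example URL above; replace the URL with the upload URL of your own process app.

```python
from urllib.parse import urlsplit

# Example upload URL from above; replace with the one for your process app.
upload_url = (
    "https://pmdoctestcd.blob.core.windows.net/"
    "c7045b18-be6f-4534-2fba-0b1f42c1e6d3"
    "?sv=2020-06-10&si=sap-c7044b18-be8f-4534-8fba-0b2f42c1e7c7&sr=c"
    "&sig=cjknPMhIeHvaKtpMgNFmUVp7KHQirhR1m1WCMkCiZUg%3D"
)

parts = urlsplit(upload_url)
azure_access_signature = parts.query                # everything after the question mark
account = parts.netloc.split(".")[0]                # first part of the host name
container = parts.path.strip("/").split("/")[0]     # the app ID (first GUID in the path)

print("azure access signature:", azure_access_signature)
print("account:", account)
print("container:", container)
```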

Set up the AzureBlob destination connection

Follow these steps to create the AzureBlob destination connection.

  1. Go to the Destinations tab in the Add Connection dialog and define a new connection of type AzureBlob.
  2. Check that the CData AzureBlobDestination Provider version is 22.0.8348.0 or higher. See also CData AzureBlobDestination Provider Version.
  3. Enter a descriptive name for the destination connection. For example, AzureBlob_IM.
  4. Enter the AzureBlob credential parameters retrieved from the upload URL.
  5. Go to the Advanced tab.
     • Locate the Miscellaneous section and set Insert Mode to SingleFile.
     • Locate the Other section and set Include Column Headers to True.
  6. Test the connection and click on Create & Test to set up the connection.

See the illustration below for an example.


4: Creating the extraction job

Follow these steps to create the extraction job.

📘

Note:

The input data must meet the format required by the app template you are using to create your process app. See App templates.

Make sure to add the suffix _raw to the table names.

  1. Click on JOBS in the menu bar and go to the Sources tab of the Add Connection panel.
  2. Click on +Create Job to add a new job.
  3. Enter a descriptive name for the job in the Job Name field. For example, ServiceNow_to_AzureBlob.
  4. Select the source connection created in 2: Setting up the source connection from the Source list.
  5. Select the AzureBlob connection created in 3: Creating an AzureBlob destination connection from the Destination list.
  6. Make sure the option Standard is selected as the Replication Type and click on Create.
  7. Click on +Add Tasks.
     • Select all the source tables in the list.
     • Click on Add.
  8. Go to the Advanced tab in the Job Settings panel.
     • Select the Drop Table option to prevent the data from being appended to the table. See Incremental ingestion.
     • Enable the checkbox Enable Parallel Processing and enter 8 in the Worker Pool field to improve loading speed.
  9. Click on Save Changes.

See the illustration below for an example.


5: Creating a marker file destination job

Follow these steps to create the marker file destination job.

  1. Click on +Create Job... to add a new job.
  2. Enter a descriptive name for the job in the Name field. For example, ServiceNow_AzureBlob_marker_sync.
  3. Select the marker file source connection created in 1: Setting up the marker file source connection from the Source list.
  4. Select the AzureBlob connection created in 3: Creating an AzureBlob destination connection from the Destination list.
  5. In the job you just created, add the marker table to be synced:
     • Click on Add custom query and enter the following query:
       REPLICATE [marker/marker] SELECT * FROM [marker]
     • Click on Save.
  6. Click on Run to check if the job runs correctly.

See the illustration below.


6: Edit the post-job event to link the jobs

Follow these steps to edit the post-job event to link the extraction job and the marker file destination job. This way, the marker file job will be executed once the extraction job is finished.

  1. Go to the JOBS tab and open the extraction job created in 4: Creating the extraction job.
  2. Go to the Events tab in the Job Settings panel.
  3. Edit the Post-Job Event section to add the code displayed below.
     Note: set the JobName value to the actual name of the marker file job.
  4. Click on Save Changes.
<!-- Start Executing different Job -->
<api:set attr="job.JobName"        value="ServiceNow_ AzureBlob_marker_sync"/> 
<api:set attr="job.ExecutionType"  value="Run"/> 
<api:set attr="job.WaitForResults" value="true"/> 
<api:call op="syncExecuteJob" in="job"/>

See the illustration below.


7: Running the CData Sync extraction job

Follow these steps to run the extraction job.

  1. Click on JOBS in the menu bar and locate the extraction job created in 4: Creating the extraction job.
  2. Click on the Run all queries icon. See the illustration below.
  3. Wait until the job has finished. Depending on the amount of data, this can take several minutes.
  4. Go to the Process Mining portal and check the Last ingestion date for the process app to see if the data load has completed successfully.
     Note: the date is only updated after all data has been processed. Depending on the amount of data, this might take several minutes up to an hour.
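If you want to verify outside of the Process Mining portal that the extracted tables and the marker file actually arrived in the blob store, you can list the contents of the container using the upload URL from 3: Creating an AzureBlob destination connection. The sketch below is optional and assumes the third-party azure-storage-blob Python package; it is not part of CData Sync.

```python
# Requires: pip install azure-storage-blob
from azure.storage.blob import ContainerClient

# Full upload URL of the process app (container URL including the SAS token).
upload_url = "https://<account>.blob.core.windows.net/<container>?<azure access signature>"

container = ContainerClient.from_container_url(upload_url)

# Print every blob in the container with its last-modified timestamp.
for blob in container.list_blobs():
    print(f"{blob.last_modified}  {blob.name}")
```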

Scheduling jobs

If you want to run the extraction job at regular intervals, you can use the CData Sync Scheduler to define a schedule.
Follow these steps to schedule an extraction job.

  1. Open the CData Sync extraction job created in 4: Creating the extraction job.
  2. Go to the Schedule tab in the Job Settings panel.

Refer to the official CData Sync documentation for more information on how to schedule jobs.

Loading data from multiple source systems

Using CData Sync, it is possible to load data from multiple source systems into a single process app. To do this, create multiple extraction jobs, each with a corresponding source connection. Each extraction job has to call the next one in its post-job event, so that all jobs are executed one by one. The final extraction job has to call the marker file job to signal that the extraction has been completed. See the illustration below for an overview.

