process-mining
2022.10
false
Process Mining
Automation CloudAutomation Cloud Public SectorAutomation SuiteStandalone
Last updated Aug 14, 2024

Loading data using CData Sync

Note: The information in this page is based on CData Sync 2022 - 22.0.8342.0. If you use an other version of CData Sync, field names or functions may be different.

Introduction

CData Sync is a tool that is used to extract data from source systems into Process Mining. The supported source systems can be found on the Sources page on the CData Sync website. Refer to the official CData Sync website for more information on CData Sync.

This page describes how to use CData Sync to load data from your source system into a process app in Process Mining on-premises (Automation Suite).



Prerequisites

It is assumed that you have:

Pseudonymizing data

Using CData Sync, data fields can be pseudonymized during extraction. This way, sensitive data fields can be hashed at the source before it reaches the Process Mining product. Follow these steps to pseudonymize data.

  1. Go to the desired task of a job in CData.
  2. Edit the query.
  3. Add the hashbytes() function to each field that needs to be pseudonymized. For example

    REPLICATE [SOURCE_TABLE] SELECT hashbytes('SHA2_256',[SOURCE_FIELD]) AS [DESTINATION_FIELD] FROM [SOURCE_TABLE];

This way, all values will be hashed with a SHA2 256 hash. For more information, see SHA-2. If desired, the hashed values can be transformed into values like "Person 1", "Person 2", "Person 3" by writing SQL queries for this in the data transformations.

Loading data using CData Sync

Setting up data load using CData Sync requires several steps to be performed.

  1. Setting up the source connection;
  2. Creating the SQL Server destination connection;
  3. Creating a job;
  4. Calling the data ingestion API;
  5. Running the job.

The steps are described in detail below.

Create a source connection

Note: Refer to the Configuring CData Sync page for your app template for specific settings for setting up the source connection.
Note: Check the CData Sync documentation for any specific settings or requirements for your source system.

Follow these steps to create the source connection.

  1. In the CData Sync Admin console, create a new Source connection for your source system.

    Note:

    If your source system is not in the list you can click on + Add More to display a list of all available source CData Sync Connectors. Select the connector for your source system and click on Download & Install.

  2. Enter a descriptive name for the source connection in the Name field.
  3. Enter the required properties to set up a connection with your source system.
  4. Create and test the connection.

Setting up a source connection for .csv or .tsv files

If you want to set up a source connection for .csv or .tsv files make sure to:
  • Select CSV as the source system to which you want to create a connection from the list.
  • Set the URI to the path where the .csv or .tsv files are stored.
  • Set the correct ConnectionType. Use Local when the files are stored on the same server as CData Sync.
  • Set FMT to the appropriate value used in the data files.

Define the following settings in the Advanced tab in the Connection Settings panel to make sure the data is properly loaded into Process Mining.

SectionParameterValue
OtherExclude File ExtensionsTrue
OtherInclude FilesAdd ,TSV to the setting if you want to upload .tsv files
SchemaType Detection SchemeNone
Data FormattingPush Empty Values As NullTrue

Retrieving the SQL Server database parameters

To set up a SQL Server destination connection you need the following setup parameters for the SQL Server database.

  • Server
  • Database
  • Schema
  • Role

Create a destination connection

To set up a SQL Server destination connection you need the following setup parameters for the SQL Server database.

  • Server
  • Database
  • Schema
  • Role

Follow these steps to create the SQL Server destination connection.

  1. Define a new connection of type SQL Server.
  2. Enter a descriptive name for the destination connection. For example SQLServer_IM.
  3. Configure the Settings to connect to your SQL Server database using the SQL Server database setup credentials retrieved
    Note: Server must be specified as <Server>,<Port>.
    Note: User / Password must be the SQL credentials of the user that has permissions to write into the database. See also Setting up a SQL Server account for data upload using an extractor. Password may not contain a semicolon ;.
  4. Create and test the connection.

See the illustration below.



Creating the extraction job

Important:

The input data must meet the format as required for the app template you are using to create your process app. See App Templates.

Make sure to add the suffix _raw to the table names.

Follow these steps to create the extraction job.

  1. Click on JOBS in the menu bar and go to the Sources tab of the Add Connection panel.
  2. Click on + Create Job... to add a new job.
  3. Enter a descriptive name for the job in the Job Name field. For example, ServiceNow_to_SQLServer.
  4. Select the source connection created in Step 1: Setting up the source connection the source connection from the Source list.
  5. Select the SQL Server connection created in Step 3: Create destination connection from the Destination list.
  6. Make sure the option Standard is selected as the Replication Type and click on +Create.
  7. Click on +Add Tasks.

    • Select all the source tables in the list.
    • Click on Add.
  8. Go to the Advanced tab in the Job Settings panel.

    • Locate the Destination schema entry and copy the Schema retrieved in Step 2: Retrieving the SQL Server database parameters.
    • Select the Drop Table option to prevent the data to be appended to the table.
    • Enable the checkbox Enable Parallel Processing and enter 8 in the Worker Pool field to improve loading speed.
    • Make sure the Replication Interval and Replication Interval Unit are set so the resulting period is equal or larger than the extraction period.
  9. Click on Save Changes.

Calling the data ingestion API

Follow these steps to edit the post-job event to call the data ingestion API.

  1. Go to the Events tab in the Job Settings panel.
  2. Edit the Post-Job Event section and add the ingestion API call ensuring to replace the server, organization, tenant, and Process Mining app ID as described below.
    Note: You can copy the app ID from the All process apps list in the Process Mining portal. Make sure to select ID in the Columns list.

    Replace

    With your

    my-uipath-server.com

    server

    default

    organization

    defaulttenant

    tenant

    98dfd1d5-9e42-4f0e-9a0a-8945629f01b3

    app ID

    <api:set attr="http.url" value="https://my-uipath-server.com/default/defaulttenant/processMining_/api/v2/apps/98dfd1d5-9e42-4f0e-9a0a-8945629f01b3/transform/unauthenticated"/> 
    <api:call op="httpPost" in="http"/><api:set attr="http.url" value="https://my-uipath-server.com/default/defaulttenant/processMining_/api/v2/apps/98dfd1d5-9e42-4f0e-9a0a-8945629f01b3/transform/unauthenticated"/> 
    <api:call op="httpPost" in="http"/>


  3. Save your changes.
  4. Click on JOBS in the menu bar and locate the destination job.
  5. Click on the Run All Queries icon to check if the job runs correctly.

Logging information

You can add additional statements to obtain logging information on the API call. Follow these steps.

  1. Add the following statements between the lines previously added in the Post-Job event.

    <api:set attr="http.verbosity" value="5"/>

    <api:set attr="http.logfile" value="D:/mydir/cdata_log.txt"/>

    See the illustration below.



  2. Go to the Logging & History tab in the Job Settings panel.
  3. Select Verbose from the Logfile Verbosity list box.
  4. Run the job and check the log file that is created.

Running the CData Sync extraction job

Follow these steps to run the extraction job.

  1. Click on JOBS in the menu bar and locate the extraction job created in Step 4 Creating the extraction job.
  2. Click on the Run all queries icon. See the illustration below.



  3. Wait until the job has finished. Depending on the amount of data, this can take several minutes.
  4. Go to the Process Mining portal and check the Last ingestion data for the process app to see if the data load has completed successfully. Note: the date is only updated after all data has been processed. Depending on the amount of data, this might take several minutes up to an hour.

Scheduling jobs

If you want to run the extraction job on a regular interval, you can use the CData Sync Scheduler to define a schedule.

Follow these steps to schedule an extraction job.

  1. Open the CData Sync extraction job created in Step 4: Creating the extraction job.
  2. Go to the Schedule tab in the Job Settings panel.



Refer to the official CData Sync documentation for more information on how to schedule jobs.

Was this page helpful?

Get The Help You Need
Learning RPA - Automation Courses
UiPath Community Forum
Uipath Logo White
Trust and Security
© 2005-2024 UiPath. All rights reserved.