
UiPath Process Mining

The UiPath Process Mining Guide

The first table group of a connector is usually the group of input tables. In this group, the raw data from the source system is defined. This is the only group that depends on how the data is stored in the source system. Later transformations are built on top of these input tables. The input tables are therefore the best place to filter for the required data and to cast attributes to the correct data types.

It is advised to always apply the filtering that is required for correctly transforming the data, even when filtering is already applied during data extraction. In this way, the connector can more easily be reused when other extraction methods are used.
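As a minimal sketch of this advice, the Python snippet below re-applies a filter and casts attributes while loading an input table. The column names (`invoice_id`, `company_code`, `amount`, `created_on`), the filter value, and the helper `load_invoices` are hypothetical and not part of any UiPath API; a real connector would express the same idea in its own transformation language.

```python
from datetime import date
from decimal import Decimal

# Hypothetical raw rows as they might arrive from an extraction: every value is text.
RAW_INVOICES = [
    {"invoice_id": "00348", "company_code": "1000", "amount": "120.50", "created_on": "2021-03-01"},
    {"invoice_id": "00349", "company_code": "2000", "amount": "80.00", "created_on": "2021-03-02"},
]


def load_invoices(rows, company_code):
    """Filter for the required data and cast attributes to the correct types."""
    result = []
    for row in rows:
        # Re-apply the filter here, even if the extraction already applied it,
        # so the input table stays correct under a different extraction method.
        if row["company_code"] != company_code:
            continue
        result.append({
            "invoice_id": row["invoice_id"],      # identifier: keep as text
            "company_code": row["company_code"],  # identifier: keep as text
            "amount": Decimal(row["amount"]),     # numeric value
            "created_on": date.fromisoformat(row["created_on"]),  # date
        })
    return result


invoices = load_invoices(RAW_INVOICES, company_code="1000")
```

Keeping the filter inside the loading step means the later transformations do not need to know how the data was extracted.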

Transform input data to the correct format

The first step when building a connector is to consider the input data and the data model we want to transform the data to. In general, the following types of input data are important for Process Mining.

| Type of data | Description |
| --- | --- |
| Transactional data | The mandatory attributes for Process Mining, used for generating the Process Graph and most other views. |
| Master data | Additional information and context about the cases and events that are represented in the transactional data. |

Master data is usually straightforward, while transactional data can have many different formats. Almost all of these formats of transactional data can be reduced to one of the following two options:

  • One or more tables with multiple timestamp columns.
  • One or more tables with a single timestamp column (transaction log).

Examples

Compare the two formats below that show an example of invoice data.

Example 1

The illustration below displays a table with multiple timestamp columns.

Each record in this table represents one invoice, and the columns contain information about when a certain event took place.

This way of storing data has its limitations, since a column only allows for one value and that value is typically overwritten when the field is updated. A lot of information that is useful for Process Mining might be lost. For example, the price on an invoice can be changed multiple times, but only the last price change is available in the data. Information about previous price changes is lost.

Example 2

The illustration below displays a transaction log table.

Each record in this table represents one event for a certain invoice. One column indicates when an event took place and another column indicates which event took place.
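To make the relation between the two formats concrete, the sketch below turns a table with multiple timestamp columns into a transaction log. All data, column names, activity names, and the helper `to_transaction_log` are hypothetical illustrations and are not taken from the guide or from a UiPath API.

```python
from datetime import datetime

# Hypothetical invoice table with one timestamp column per event (first format).
INVOICES = [
    {"invoice_id": "INV-1", "created": "2021-03-01 09:00",
     "approved": "2021-03-02 14:30", "paid": "2021-03-10 11:15"},
    {"invoice_id": "INV-2", "created": "2021-03-03 10:00",
     "approved": None, "paid": None},
]

# Hypothetical mapping from timestamp column to the activity it represents.
EVENT_COLUMNS = {
    "created": "Create invoice",
    "approved": "Approve invoice",
    "paid": "Pay invoice",
}


def to_transaction_log(rows):
    """Produce one record per event: which invoice, which event, and when."""
    events = []
    for row in rows:
        for column, activity in EVENT_COLUMNS.items():
            value = row[column]
            if value is None:
                continue  # this event has not taken place for this invoice
            events.append({
                "invoice_id": row["invoice_id"],
                "activity": activity,
                "timestamp": datetime.strptime(value, "%Y-%m-%d %H:%M"),
            })
    # Order events per invoice by the time they took place.
    events.sort(key=lambda event: (event["invoice_id"], event["timestamp"]))
    return events


log = to_transaction_log(INVOICES)
```

In this sketch each timestamp column yields at most one event per invoice, which reflects the limitation of the first format: a value that was overwritten in the source table cannot be recovered by the conversion.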

Define input tables

It is good practice to consider the input format of the tables in the source system before you continue with the next transformation steps. This will help you in understanding which input tables are required and which transformations are needed to transform the input to the data model.

Define the input tables that are needed and group these tables together in a logical way. This improves the readability of the connector and makes it easier to maintain. Think about which attributes should be available in the input tables. Often your data will contain attributes that are not needed, which you can exclude from the connector.

Rename the attributes as available in the source system to human-readable names.
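A minimal sketch of selecting and renaming attributes is shown below. The source attribute names (`INV_NO`, `CRT_DT`, `AMT`, `SYS_FLAG`), the readable names, and the helper `select_and_rename` are hypothetical; they only illustrate the idea of keeping the needed attributes under human-readable names.

```python
# Hypothetical mapping from source-system attribute names to readable names.
# Attributes that are not listed here are excluded from the input table.
RENAMES = {
    "INV_NO": "Invoice ID",
    "CRT_DT": "Creation date",
    "AMT": "Amount",
}


def select_and_rename(row):
    """Keep only the mapped attributes and give them readable names."""
    return {new_name: row[old_name] for old_name, new_name in RENAMES.items()}


# Hypothetical source record; SYS_FLAG is an attribute that is not needed.
source_row = {"INV_NO": "00348", "CRT_DT": "2021-03-01", "AMT": "120.50", "SYS_FLAG": "X"}
input_row = select_and_rename(source_row)
```

Keeping the mapping in one place makes it easy to see which source attributes an input table depends on.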

It is important to check the data types for the input attributes. For most attributes this will be text, but they can also be numeric values or dates. Common issues with the data types are:

  • Text fields consisting of numbers with leading zeros (00348) are recognized as doubles or integers instead of text. These leading zeros are removed when these attributes are cast to a numeric data type, which can lead to issues when joining tables.
  • Dates and datetimes are recognized as integers or text. This can lead to problems when trying to set up logic that assumes values are dates. For example, if you subtract one date from another, the result may be different than if you subtract the integer representations of both from each other.
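Both issues can be reproduced in a few lines of Python. The values below are hypothetical, and the YYYYMMDD integer encoding of dates is an assumption made for this illustration; source systems may encode dates differently.

```python
from datetime import date

# Issue 1: an identifier with leading zeros loses them when cast to a number.
invoice_id = "00348"
as_number = int(invoice_id)
round_tripped = str(as_number)  # no longer equal to the original text key
keys_still_match = round_tripped == invoice_id

# Issue 2: dates stored as integers (assumed YYYYMMDD encoding).
start_int, end_int = 20210131, 20210201
int_difference = end_int - start_int  # arithmetic on the integer representation

# The same two dates as real date values.
start, end = date(2021, 1, 31), date(2021, 2, 1)
day_difference = (end - start).days  # actual number of days between the dates
```

Here the integer subtraction gives 70 while the two dates are one day apart, and the round-tripped identifier `348` would no longer join with `00348` in another table.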

Updated 3 months ago

1. Input

