

Structure of transformations

Overview

Below is an overview of the transformation steps of the UiPath Process Mining app templates.


The models\ folder is organized according to the structure of the transformation steps.


1. Input

The input step is used to load the raw data. The following operations are typically done to prepare the data for the next transformation steps:

  • Type cast fields to the appropriate data types.
  • Filter tables to reduce data size early in the transformations.

📘

Note:

It is recommended to reduce the data size during extraction where possible.

For performance reasons, it is advised to use the pm_utils.create_index(input_table) method as a pre_hook when defining the input models. For more information, see the official dbt documentation on pre-hook & post-hook.
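As an illustration, below is a minimal sketch of an input model that combines the pre-hook with type casting and early filtering. The source reference, field names, and the exact argument passed to pm_utils.create_index are assumptions for this example; check the pm_utils package for the actual signature.

-- Hypothetical input model Invoices_input.sql
{{ config(pre_hook="{{ pm_utils.create_index(this) }}") }}  -- assumption: the model relation is passed to create_index

select
    cast("Invoice_ID" as varchar) as "Invoice_ID",             -- type cast fields to the appropriate data types
    cast("Invoice_created" as timestamp) as "Invoice_created",
    cast("Invoice_amount" as double) as "Invoice_amount"
from {{ source('raw_data', 'Invoices') }}                      -- hypothetical source definition
where "Invoice_created" >= to_date('2020-01-01')               -- filter early to reduce data size (date syntax may vary by SQL dialect)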

Naming convention

If you expect name clashes with table names in subsequent transformation steps, it is best practice to add the suffix _input to the input tables.

2. Entities

In the entities step, input tables are transformed to entity tables. Each entity required for the expected events should get its own table. See Designing an event log. Additionally, supporting entities can also be defined here.

In the example below, the three input tables Invoices_input, Invoice_types_input, and Customers_input are joined to create the entity table Invoices.

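A minimal sketch of what such an entities model could look like; the join keys and attribute fields are assumptions for this example.

Invoices as (
    select
        Invoices_input."Invoice_ID",
        Invoices_input."Invoice_amount",
        Invoice_types_input."Invoice_type",        -- attribute from the invoice types input
        Customers_input."Customer_name"            -- attribute from the customers input
    from Invoices_input
    left join Invoice_types_input
        on Invoices_input."Invoice_type_ID" = Invoice_types_input."Invoice_type_ID"
    left join Customers_input
        on Invoices_input."Customer_ID" = Customers_input."Customer_ID"
)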

Guidelines

Follow these guidelines when creating an entity table:

  • There is one entity ID field, which is unique for each data record.
  • All entity fields that are needed for data analysis are present.
  • All entity fields have names that are easy to understand.

When applicable, the entity table relates to another entity via an ID field. See the example below, where the invoice lines are related to the invoice entity via the Invoice_ID field.

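For instance, an invoice lines entity table could look like the sketch below, with Invoice_line_ID as the unique entity ID and Invoice_ID as the reference to the invoice entity (field names are illustrative).

Invoice_lines as (
    select
        Invoice_lines_input."Invoice_line_ID",     -- unique ID of the invoice line entity
        Invoice_lines_input."Invoice_ID",          -- relation to the invoice entity
        Invoice_lines_input."Line_amount"
    from Invoice_lines_input
)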

Additional transformations

Not all input tables are transformed into entity tables. However, other input tables may also contain relevant information, such as the Customers table in the example. It may be convenient to define them in the entities step as separate tables so that they can be reused in the data transformations.

Naming convention

If the entity table names would lead to name clashes later on, add the suffix _base to the tables.

3. Events

📘

Note:

The input for the TemplateOne-SingleFile and TemplateOne-MultiFiles app templates is already a well-defined event log for Process Mining. There is no need to transform the data from the source system into events for Process Mining here. This means that the 3. events step is not present in the transformations for the TemplateOne-SingleFile and TemplateOne-MultiFiles process apps.

In this transformation step, event tables are created for each entity. See Designing an event log. Each record in an event table represents one event that took place. There are two scenarios on how the data is structured:

  • Timestamp fields: Fields on an entity table with a timestamp for an event. For example, the Invoice_created field in an Invoices table.
  • Transaction log: A list of events.

Based on how the data is structured, the transformations to create the event tables are different.

Timestamp fields

In this scenario, the values of a timestamp field must be transformed into separate records in an event table. The example below shows an invoices table that contains three timestamp fields.


Each timestamp field is used to create a separate event table. For every record in which the timestamp field contains a value, a row is created with the Invoice ID, the name of the event (Activity), and the timestamp at which the event took place (Event end).


The Invoices_input table is split into Invoice_events_Create_invoice, Invoice_events_Delete_invoice, and Invoices_events_Change_invoice_price.

The separate event tables can then be merged into a single event table per entity, for example Invoices_events.
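As a sketch, each timestamp field yields an event table with a fixed activity name, after which the tables are unioned into one events table for the entity. The timestamp field names other than Invoice_created are assumptions for this example.

Invoice_events_Create_invoice as (
    select
        "Invoice_ID",
        'Create invoice' as "Activity",
        "Invoice_created" as "Event_end"
    from Invoices_input
    where "Invoice_created" is not null
),

Invoice_events_Delete_invoice as (
    select
        "Invoice_ID",
        'Delete invoice' as "Activity",
        "Invoice_deleted" as "Event_end"           -- assumed timestamp field for the delete event
    from Invoices_input
    where "Invoice_deleted" is not null
),

-- Similar tables are created for the remaining timestamp fields.
Invoices_events as (
    select * from Invoice_events_Create_invoice
    union all
    select * from Invoice_events_Delete_invoice
)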

Transaction log

If events are stored in a transaction log, the relevant events per entity should be identified. Create a table per entity and store the corresponding entity ID, the name of the event (Activity), and the timestamp at which the event took place (Event end).

In the below example, the transaction log contains events for the Purchase Order and Invoice entities.

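Building on this example, the sketch below shows how the events for the Invoice entity could be selected from the transaction log; the transaction log field names (Entity_ID, Entity_type, Timestamp) are assumptions.

Invoice_events as (
    select
        Transaction_log."Entity_ID" as "Invoice_ID",
        Transaction_log."Activity",
        Transaction_log."Timestamp" as "Event_end"
    from Transaction_log
    where Transaction_log."Entity_type" = 'Invoice'    -- keep only the events that belong to the invoice entity
)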

The following fields are mandatory in an event table. All records in the event tables should contain a value for these fields.

  • Entity ID: The ID of the entity for which the event happens. For example, the Invoice ID.
  • Activity: The activity describes which action took place on the entity.
  • Event end: The event end field indicates when the specific event was finished. Ideally, this should be a datetime field, rather than a date.

Naming convention

Name the tables according to the structure [Entity] + _events. For example, Purchase_order_events and Invoice_events.

4. Event logs

Single entity process

When the process contains one entity, no additional transformations are needed in this step. The single entity table and events tables are already in the correct format.

Multiple entity process

When multiple entities are involved in a process, the events of all entities need to be linked to the main entity that is considered the "Case" in the process. See Designing an event log. The steps below describe how to relate all events to the main entity, and how to combine them into a single event log.

Entity relations

Create an "entity-relations" table to centralize the relationships between all entities. This entity-relations table will contain the ID fields of the related entities.

To create the entity-relations table, join all entity tables based on their ID fields:

  • Start with the main entity.
  • Join related entities to the main entity with a left join.
  • If entities do not relate directly to the main entity, left join them to the related entities that are already joined to the main entity.

In the below example, there are three entities: Purchase order, Invoice line, and Invoice. The Purchase order is considered the main entity in the process. The Invoice line is directly linked to the Purchase order and the Invoice is linked indirectly via the Invoice line.

Entity_relations as (
    select
        Purchase_orders."Purchase_order_ID",
        Invoice_lines."Invoice_line_ID",
        Invoices."Invoice_ID"
    from Purchase_orders
    left join Invoice_lines
        on Purchase_orders."Purchase_order_ID" = Invoice_lines."Purchase_order_ID"
    left join Invoices
        on Invoice_lines."Invoice_ID" = Invoices."Invoice_ID"
)

The resulting entity-relations table contains the Purchase_order_ID, Invoice_line_ID, and Invoice_ID for each combination of related entities.


Relation tables

The individual relations between the main entity and each other entity are stored in separate tables, using the combined information from the entity relations table.

Relation_invoice_lines as (
    select
        Entity_relations."Purchase_order_ID",
        Entity_relations."Invoice_line_ID"
    from Entity_relations
    group by "Purchase_order_ID", "Invoice_line_ID"
)
Relation_invoices as (
    select
        Entity_relations."Purchase_order_ID",
        Entity_relations."Invoice_ID"
    from Entity_relations
    group by "Purchase_order_ID", "Invoice_ID"
)

Event log

The next step is to use these relations to add the corresponding "Case ID" to each event table. The "Case ID" is obtained via the relation table, while the event information is obtained from the event table. To create the full event log, the event tables for each entity are unioned.

Purchase_order_event_log as (
    select
        Purchase_order_events."Purchase_order_ID",
        Purchase_order_events."Activity",
        Purchase_order_events."Event_end"
    from Purchase_order_events
    union all
    select
        Relation_invoice_lines."Purchase_order_ID",
        Invoice_line_events."Activity",
        Invoice_line_events."Event_end"
    from Invoice_line_events
    inner join Relation_invoice_lines
        on Invoice_line_events."Invoice_line_ID" = Relation_invoice_lines."Invoice_line_ID"
    union all
    select
        Relation_invoices."Purchase_order_ID",
        Invoice_events."Activity",
        Invoice_events."Event_end"
    from Invoice_events
    inner join Relation_invoices
        on Invoice_events."Invoice_ID" = Relation_invoices."Invoice_ID"
)

Naming convention

If the event log table name can lead to name clashes at a later stage, add the suffix _base to the name of the event log tables.

5. Business logic

In the last transformation step, business logic is added as needed for data analysis. Additional derived fields can be added to existing tables here. For example, specific throughput times or Boolean fields that are used in KPIs in dashboards.
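For instance, a throughput time and a Boolean KPI field could be derived as in the sketch below; the event timestamp fields and the 10-day threshold are illustrative, and the date functions assume a Snowflake-like SQL dialect.

Invoices_with_business_logic as (
    select
        Invoices."Invoice_ID",
        datediff(day, Invoices."Invoice_created", Invoices."Invoice_approved") as "Days_to_approval",   -- throughput time
        case
            when datediff(day, Invoices."Invoice_created", Invoices."Invoice_approved") > 10 then true
            else false
        end as "Approval_took_more_than_10_days"                                                        -- Boolean field for a KPI
    from Invoices
)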

In Process Mining, there are two additional standard tables defined in this transformation step: Tags and Due dates.

Tags

Tags are properties of cases, which signify certain business rules. Tags are typically added to make it easy to analyze these business rules. For example:

  • Invoice paid and approved by the same person.
  • Invoice approval took more than 10 days.
  • Check invoice activity skipped.

Each record in the tags table represents one tag that occurred in the data for a specific case. The mandatory fields for this table are the "Case ID" and the "Tag". Not all cases will have a tag and some cases may have multiple tags.

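A sketch of how such a Tags table could be built, with one select per business rule unioned together; the case ID field and the rule logic are assumptions for this example.

Tags as (
    select
        "Invoice_ID" as "Case_ID",
        'Invoice approval took more than 10 days' as "Tag"
    from Invoices_with_business_logic
    where "Days_to_approval" > 10
    union all
    select
        "Invoice_ID" as "Case_ID",
        'Invoice paid and approved by the same person' as "Tag"
    from Invoices
    where "Paid_by" = "Approved_by"                    -- assumed fields holding the responsible users
)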

Due dates

Due dates represent deadlines in the process. These are added to the data to analyze whether activities are performed on time with respect to these due dates.

Each record in the due dates table represents one due date for a certain event. Example due dates are:

  • a payment deadline for a payment event.
  • an approval deadline for an approval event.

The mandatory fields for this table are the Event ID, Due date, Actual date, and Expected date.


Not all events will have a due date and some events may have multiple due dates.
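A sketch of a due dates table for a payment deadline, creating one record per payment event. It assumes the event tables carry an Event_ID, that the Due date field holds the name of the deadline, and a 30-day payment term; all of these are illustrative, and the date functions assume a Snowflake-like SQL dialect.

Due_dates as (
    select
        Invoice_events."Event_ID",
        'Payment deadline' as "Due_date",
        Invoice_events."Event_end" as "Actual_date",                      -- when the payment actually happened
        dateadd(day, 30, Invoices."Invoice_created") as "Expected_date"   -- assumed 30-day payment term
    from Invoice_events
    inner join Invoices
        on Invoice_events."Invoice_ID" = Invoices."Invoice_ID"
    where Invoice_events."Activity" = 'Pay invoice'
)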
