Process Mining


RELEASE: 2021.10

Last updated Dec 20, 2024

Create an anonymized dataset

Introduction

In UiPath Process Mining, you can anonymize datasets for use in development, testing, or demos.

You can create a production-like dataset that is still representative and useful, based on your input dataset. The data is anonymized to protect the privacy of individuals represented by the data.

In AppOne, anonymization options are set by default.

Important: It is strongly recommended that you check the anonymization options before exporting the dataset if you have strict rules for anonymization.

Creating an anonymized dataset

Before you create an anonymized dataset in UiPath Process Mining, determine which attributes of your input dataset need to be anonymized, and define how the values of these attributes must be displayed in the anonymized dataset.

Creating an anonymized dataset in UiPath Process Mining consists of two steps.

Set the appropriate anonymization options for all datasource attributes of the input tables that need anonymization.
Export the dataset to your computer and distribute it.

Anonymization Options

For each datasource attribute of your input dataset you can define how the values must be visible in the resulting dataset.

Note: If an input table contains at least one datasource attribute that needs anonymization, you must select an anonymization option for every datasource attribute of that table. If you do not want a specific attribute to be anonymized, select the Original values option.

In the Edit Datasource Attribute dialog you can select the applicable type of anonymization for the datasource attribute. See illustration below.

The following table describes the available options for anonymization.

Option	Description
Not set	The anonymization option is not set for this datasource attribute.
Original values	The original values of the datasource attribute will be displayed in the result dataset. You can use this option for attributes that do not need to be anonymized.
NULL	The values of the datasource attribute will be cleared in the result dataset, i.e. will be set to NULL.
Shuffle	The unique values of the datasource attribute will be randomly shuffled among the records in the result dataset.
String plus ID (over complete application)	The unique values of the datasource attribute will be replaced with the string entered in the Prefix field followed by a number. This option applies to all the tables in the dataset that have the same value. In the result dataset the corresponding values will have the same prefix in all the tables.
Hash values (over complete application)	The unique values in the datasource attribute will be replaced by a generated hash code. For example, a User ID can be replaced with a random hash code. This option applies to all the tables in the dataset that have the same value. In the result dataset the corresponding values will have the same hash values in all the tables, which enables you to compare the tables.
Use expression per value	The values of the result dataset attribute are set using an aggregate expression.
Use expression per record	The values of the result dataset attribute are set using an expression per record.
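
As a conceptual illustration only, the Shuffle and Hash values options can be sketched in Python. This is not UiPath's implementation: the function names, the SHA-256 algorithm, the Base64 encoding, and the sample user IDs are all assumptions chosen for the example. Shuffle is modeled here as a random permutation of the unique values, which is one reading of the description above.

```python
import base64
import hashlib
import random


def shuffle_unique(values, seed=None):
    """Replace each unique value with another unique value via a random permutation."""
    unique = sorted(set(values))
    permuted = unique[:]
    random.Random(seed).shuffle(permuted)
    mapping = dict(zip(unique, permuted))
    return [mapping[v] for v in values]


def hash_value(value):
    """Replace a value with a Base64-encoded SHA-256 digest (assumed algorithm)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# Two hypothetical input tables that share user IDs.
cases_users = ["u1", "u2", "u1"]
events_users = ["u2", "u1"]

# The same function is applied to every table, so equal inputs
# produce equal hash codes across the whole dataset.
hashed_cases = [hash_value(u) for u in cases_users]
hashed_events = [hash_value(u) for u in events_users]
```

Because the hash depends only on the input value, `hashed_cases[1]` and `hashed_events[0]` are identical, which is the property that lets you compare tables after anonymization.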

Important: Setting anonymization options for attribute values affects the results of expressions or metrics in which the attributes are used. Also, be careful when anonymizing attributes that occur in join expressions.
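
The following Python sketch shows why join attributes need care. It is a simplified model, not Process Mining behavior: the `inner_join` and `anonymize` helpers, the table contents, and the attribute names are hypothetical. A join key that is anonymized in only one table, or set to NULL, no longer matches its counterpart, while a mapping applied consistently to both tables keeps the join intact.

```python
def inner_join(left, right, key):
    """Pair records whose key values are equal and not None."""
    return [(a, b) for a in left for b in right
            if a[key] is not None and a[key] == b[key]]


def anonymize(records, key, mapping):
    """Return copies of the records with the key attribute replaced via mapping."""
    return [{**rec, key: mapping[rec[key]]} for rec in records]


# Hypothetical input tables joined on case_id.
cases = [{"case_id": "C1", "amount": 1.0}, {"case_id": "C2", "amount": 2.0}]
events = [
    {"case_id": "C1", "activity": "Create"},
    {"case_id": "C1", "activity": "Approve"},
    {"case_id": "C2", "activity": "Create"},
]

shared = {"C1": "Case 1", "C2": "Case 2"}   # same replacement in every table
to_null = {"C1": None, "C2": None}          # models the NULL option

original = inner_join(cases, events, "case_id")
one_sided = inner_join(anonymize(cases, "case_id", shared), events, "case_id")
nulled = inner_join(anonymize(cases, "case_id", to_null), events, "case_id")
consistent = inner_join(
    anonymize(cases, "case_id", shared),
    anonymize(events, "case_id", shared),
    "case_id",
)
```

In this model, `original` and `consistent` both contain three matched pairs, while `one_sided` and `nulled` are empty.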

Examples

Below is an example of the result datasets when using the different options.

Original values	NULL	Shuffle	String+ID	Hash	Expression per value (* 8)	Expression per record (<number_attribute> * 3)
1,00	NULL	4,00	Amount 1	2jmj7l5rSw0yVb/vlWAYkK/YBwk=	8,00	8,00
1,00	NULL	4,00	Amount 1	2jmj7l5rSw0yVb/vlWAYkK/YBwk=	8,00	12,00
1,00	NULL	4,00	Amount 1	2jmj7l5rSw0yVb/vlWAYkK/YBwk=	8,00	3,00
2,00	NULL	1,00	Amount 2	vlWAYkKWAYkrSw0yVb/saAshZ	16,00	9,00
4,00	NULL	8,00	Amount 3	l5rSw0yVb/2jmj7vlWAYkK/YBwk=	32,00	6,00
8,00	NULL	2,00	Amount 4	Sw0WAYkWAYk l5rSw0yVb/zzZa	64,00	12,00
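
To make two of the columns above concrete, the sketch below reproduces the String+ID and Expression per value (* 8) columns from the Original values column. It is an illustration in Python, not the product's expression language. The helper names are hypothetical, and numbering the IDs in order of first appearance is an assumption that happens to match the table.

```python
original = [1.0, 1.0, 1.0, 2.0, 4.0, 8.0]


def string_plus_id(values, prefix):
    """Replace each unique value with the prefix followed by a running number."""
    ids = {}
    result = []
    for v in values:
        if v not in ids:
            ids[v] = len(ids) + 1  # number unique values by first appearance
        result.append(f"{prefix} {ids[v]}")
    return result


def expression_per_value(values, expression):
    """Evaluate the expression once per unique value and reuse the result."""
    results = {v: expression(v) for v in set(values)}
    return [results[v] for v in values]


def fmt(number):
    """Format a number with two decimals and a decimal comma, as in the table."""
    return f"{number:.2f}".replace(".", ",")


string_id_column = string_plus_id(original, "Amount")
per_value_column = [fmt(v) for v in expression_per_value(original, lambda v: v * 8)]
```

Equal original values always receive the same replacement, so the three records with value 1,00 all become `Amount 1` and `8,00`.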

Specifying anonymization settings

Follow these steps to define the anonymization settings for the datasource attributes.

Step	Action
1	Go to the Data tab in the developer interface.
2	Double-click on the datasource attribute for which you want to define anonymization settings.
3	Go to the Anonymization section of the Edit Datasource Attribute dialog.
4	Select the applicable type of anonymization for this datasource attribute from the Type drop-down list.
5	Repeat steps 1 to 4 for each datasource attribute of your input dataset that you want to anonymize.

Export the dataset

Follow these steps to export the anonymized dataset.

Step	Action
1	Click on the logo icon and select Advanced -> Export input dataset…. The Export Dataset dialog is displayed.
2	Select the Anonymize data option. Note: The dataset name will be extended with `Anonymized`.
3	Click on Download to download the anonymized dataset to your computer.
4	Distribute the .zip file.

Important:

Anonymization is only available for input tables (connection string tables and join tables). You cannot use it for system tables or persistent tables.

Anonymization is also not possible with tables that use live data.
