Process Mining
2021.10
Last updated Apr 2, 2024

Create an anonymized dataset

Introduction

In UiPath Process Mining, you can anonymize datasets so that they can be used for development, testing, or demo purposes.

You can create a production-like dataset that is still representative and useful, based on your input dataset. The data is anonymized to protect the privacy of individuals represented by the data.

In AppOne, anonymization options are set by default.

Important: It is strongly recommended that you check the anonymization options before exporting the dataset if you have strict rules for anonymization.

Creating an anonymized dataset

Before you create an anonymized dataset in UiPath Process Mining, determine which attributes of your input dataset need to be anonymized, and define how the values of these attributes must be displayed in the anonymized dataset.

Creating an anonymized dataset in UiPath Process Mining consists of two steps.

  1. Set the appropriate anonymization options for all datasource attributes of the input tables that need anonymization.
  2. Export the dataset to your computer and distribute it.

Anonymization options

For each datasource attribute of your input dataset you can define how the values must be visible in the resulting dataset.

Note: You must select an anonymization option for each datasource attribute of the input table that contains at least one datasource attribute that needs anonymization. If you do not want a specific attribute to be anonymized, select the Original values option.
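The rule in the note above can be sketched as a small check. This is a minimal illustration only, not part of UiPath Process Mining: the `find_unset_attributes` function, the settings dictionary, and the table and attribute names are all hypothetical. Only the option names are taken from this page.

```python
from collections import defaultdict

NOT_SET = "Not set"
ORIGINAL = "Original values"


def find_unset_attributes(options):
    """Return (table, attribute) pairs that still need an option.

    `options` maps (table, attribute) to the selected option name.
    An attribute is reported when it is left at "Not set" in a table
    where at least one other attribute is actually anonymized.
    """
    by_table = defaultdict(dict)
    for (table, attribute), option in options.items():
        by_table[table][attribute] = option

    missing = []
    for table, attrs in by_table.items():
        # A table counts as anonymized when any attribute uses an option
        # other than "Not set" or "Original values".
        anonymized = any(o not in (NOT_SET, ORIGINAL) for o in attrs.values())
        if anonymized:
            missing.extend((table, a) for a, o in attrs.items() if o == NOT_SET)
    return sorted(missing)


# Hypothetical settings for two input tables.
settings = {
    ("Cases", "Customer"): "Hash values (over complete application)",
    ("Cases", "Amount"): NOT_SET,
    ("Cases", "Status"): ORIGINAL,
    ("Calendar", "Weekday"): NOT_SET,
}
print(find_unset_attributes(settings))
```

In this sketch only `("Cases", "Amount")` is reported. The hypothetical `Calendar` table contains no anonymized attribute, so its attributes may stay at Not set.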

In the Edit Datasource Attribute dialog you can select the applicable type of anonymization for the datasource attribute. See illustration below.



The following table describes the available options for anonymization.

| Option | Description |
|---|---|
| Not set | The anonymization option is not set for this datasource attribute. |
| Original values | The original values of the datasource attribute will be displayed in the result dataset. You can use this option for attributes that do not need to be anonymized. |
| NULL | The values of the datasource attribute will be cleared in the result dataset, i.e. will be set to NULL. |
| Shuffle | The unique values of the datasource attribute will be randomly shuffled among the records in the result dataset. |
| String plus ID (over complete application) | The unique values of the datasource attribute will be replaced with the string entered in the Prefix field followed by a number. This option applies to all the tables in the dataset that have the same value. In the result dataset the corresponding values will have the same prefix in all the tables. |
| Hash values (over complete application) | The unique values in the datasource attribute will be replaced by a generated hash code. For example, a User ID can be replaced with a random hash code. This option applies to all the tables in the dataset that have the same value. In the result dataset the corresponding values will have the same hash values in all the tables, which enables you to compare the tables. |
| Use expression per value | The values of the result dataset attribute are set using an aggregate expression. |
| Use expression per record | The values of the result dataset attribute are set using an expression per record. |

Important: Setting anonymization options for an attribute affects the results of any expressions or metrics in which the attribute is used. Also, be careful when anonymizing attributes that occur in join expressions.
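To see why the "over complete application" options keep tables comparable, consider the following minimal sketch. It is an illustration under stated assumptions, not the product's implementation: the `SharedReplacer` class, the `hash_value` helper, and the sample tables are hypothetical, and the use of SHA-1 with Base64 encoding is an assumption chosen only to produce a deterministic hash.

```python
import base64
import hashlib


class SharedReplacer:
    """Replace each distinct value with "<prefix> <number>".

    The mapping is remembered, so the same input value receives the same
    replacement in every table that is passed through this object.
    """

    def __init__(self, prefix):
        self.prefix = prefix
        self.mapping = {}

    def __call__(self, value):
        if value not in self.mapping:
            self.mapping[value] = f"{self.prefix} {len(self.mapping) + 1}"
        return self.mapping[value]


def hash_value(value):
    # Assumption for illustration: SHA-1 digest, Base64 encoded.
    digest = hashlib.sha1(str(value).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# Two hypothetical input tables that are joined on the "user" attribute.
cases = [{"user": "alice", "amount": 10}, {"user": "bob", "amount": 20}]
events = [{"user": "bob"}, {"user": "alice"}, {"user": "bob"}]

replace_user = SharedReplacer("User")
anon_cases = [{**row, "user": replace_user(row["user"])} for row in cases]
anon_events = [{**row, "user": replace_user(row["user"])} for row in events]

print(anon_cases)
print(anon_events)
print(hash_value("alice") == hash_value("alice"))
```

Because both tables share one mapping, a join on `user` matches the same records before and after anonymization. Options that treat each record independently, such as NULL, do not preserve this property, which is why attributes used in join expressions need care.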

Examples

Below is an example of the result datasets when using the different options.

| Original values | NULL | Shuffle values | String+ID | Hash | Expression per value (* 8) | Expression per record (<number_attribute> * 3) |
|---|---|---|---|---|---|---|
| 1,00 | NULL | 4,00 | Amount 1 | 2jmj7l5rSw0yVb/vlWAYkK/YBwk= | 8,00 | 8,00 |
| 1,00 | NULL | 4,00 | Amount 1 | 2jmj7l5rSw0yVb/vlWAYkK/YBwk= | 8,00 | 12,00 |
| 1,00 | NULL | 4,00 | Amount 1 | 2jmj7l5rSw0yVb/vlWAYkK/YBwk= | 8,00 | 3,00 |
| 2,00 | NULL | 1,00 | Amount 2 | vlWAYkKWAYkrSw0yVb/saAshZ | 16,00 | 9,00 |
| 4,00 | NULL | 8,00 | Amount 3 | l5rSw0yVb/2jmj7vlWAYkK/YBwk= | 32,00 | 6,00 |
| 8,00 | NULL | 2,00 | Amount 4 | Sw0WAYkWAYk l5rSw0yVb/zzZa | 64,00 | 12,00 |
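The per-value columns of this example can be reproduced with a short sketch. It is only an illustration of how the values relate, not the product's algorithm. Numbering unique values in order of first appearance is an assumption that happens to match the example, and the shuffle mapping is copied from the example rather than generated randomly.

```python
# Values from the "Original values" column of the example.
original = [1, 1, 1, 2, 4, 8]

# String plus ID: each unique value gets the prefix followed by a number.
# Assumption: numbers are assigned in order of first appearance.
ids = {}
for value in original:
    ids.setdefault(value, len(ids) + 1)
string_plus_id = [f"Amount {ids[value]}" for value in original]

# Expression per value (* 8): equal inputs always give equal outputs.
per_value = [value * 8 for value in original]

# Shuffle: the unique values are permuted among themselves.
# This mapping is read off the example; in practice it is random.
shuffle_map = {1: 4, 2: 1, 4: 8, 8: 2}
shuffled = [shuffle_map[value] for value in original]

print(string_plus_id)
print(per_value)
print(shuffled)
```

Note that in all three columns, records that shared an original value still share a value afterwards, and the shuffled column contains exactly the same set of unique values as the original.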

Specifying anonymization settings

Follow these steps to define the anonymization settings for the datasource attributes.

  1. Go to the Data tab in the developer interface.
  2. Double-click the datasource attribute for which you want to define anonymization settings.
  3. Go to the Anonymization section of the Edit Datasource Attribute dialog.
  4. Select the applicable type of anonymization for this datasource attribute from the Type drop-down list.
  5. Repeat steps 1 to 4 for each datasource attribute of your input dataset that you want to anonymize.

Export the dataset

Follow these steps to export the anonymized dataset.

  1. Click the logo icon and select Advanced -> Export input dataset….
     The Export Dataset dialog is displayed.
  2. Select the Anonymize data option.
     Note: The dataset name will be expanded with Anonymized.
  3. Click Download to download the anonymized dataset to your computer.
  4. Distribute the .zip file.

Important: Anonymization is only available for input tables (connection string tables and join tables). You cannot use it for system tables or persistent tables.

Anonymization is also not possible with tables that use live data.

© 2005-2024 UiPath. All rights reserved.