Understanding the data structure and permissions
Within the platform, data is structured and stored hierarchically in three main components: data sources, datasets, and projects. If you are an Automation Cloud user, these three components are stored within your cloud tenant(s). Access to each of them is controlled by strict permissions.
Data Sources
These are collections of raw unannotated communications data of a similar type, e.g. all emails from a shared mailbox, or a collection of NPS survey responses (see here for more detail). Individual data sources can be associated with up to 10 different datasets.
Datasets
These comprise 1 to 20 data sources (of a similar type and with similar intended purposes) and the 'model' that you create when you train the platform to understand the data in those sources (see here for more detail).
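The two limits described above (a data source can be associated with up to 10 datasets, and a dataset is made up of 1 to 20 data sources) can be checked against a planned layout before you create anything. The following is a minimal sketch only: the function name, the dictionary layout and the example names are hypothetical and are not part of the platform or its API.

```python
MAX_DATASETS_PER_SOURCE = 10   # limit stated above for data sources
MIN_SOURCES_PER_DATASET = 1    # limits stated above for datasets
MAX_SOURCES_PER_DATASET = 20


def find_limit_violations(dataset_sources):
    """Check a {dataset name: set of source names} mapping against the limits.

    Returns a list of human-readable problems; an empty list means the
    planned layout respects both limits.
    """
    problems = []
    datasets_per_source = {}

    for dataset, sources in dataset_sources.items():
        if not MIN_SOURCES_PER_DATASET <= len(sources) <= MAX_SOURCES_PER_DATASET:
            problems.append(
                f"dataset '{dataset}' has {len(sources)} sources "
                f"(allowed: {MIN_SOURCES_PER_DATASET} to {MAX_SOURCES_PER_DATASET})"
            )
        # Record which datasets use each source so the per-source limit
        # can be checked once all datasets have been seen.
        for source in sources:
            datasets_per_source.setdefault(source, set()).add(dataset)

    for source, datasets in datasets_per_source.items():
        if len(datasets) > MAX_DATASETS_PER_SOURCE:
            problems.append(
                f"source '{source}' is used by {len(datasets)} datasets "
                f"(allowed: up to {MAX_DATASETS_PER_SOURCE})"
            )
    return problems


# Hypothetical plan: two datasets sharing one mailbox source.
plan = {
    "complaints": {"shared-mailbox", "nps-survey"},
    "triage": {"shared-mailbox"},
}
print(find_limit_violations(plan))  # prints []
```

An empty list means the plan stays within both limits. Each violation is reported as a separate message, so several problems can be fixed in one pass.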
Projects
A permissioned storage area within the platform. Each dataset and data source belongs to a specific project, which is designated when they are created (see Projects for more details).
Tenants (Automation Cloud users only)
These allow you to model your organization structure, separating your business flows and information just like in real-life organizations. They are containers where you can organize your services and manage them for a group of users.
For example, you can create tenants for each of your departments and decide what services you want to enable for each, based on their needs. In each tenant, you can have one instance of each of the cloud services.
It is important to note that you cannot promote Communications Mining™ models between different UiPath® Cloud tenants (e.g. promoting from DEV to PROD).
If your automations can only be deployed to PROD from within a PROD environment, enable Communications Mining™ in the PROD tenant. However, if you have the flexibility to deploy to PROD from another environment, your PROD automation(s) can call the platform in whichever tenant it sits in (e.g. QA or DEV).
Permissions
These are per-user and specific to each project that a user belongs to. They can provide access to sensitive data and, depending on the permission, allow users to perform a range of different actions in the platform (see here for more detail).
Overview
If you are an Automation Cloud user, your Communications Mining™ service will be enabled on a specific tenant. Tenants are where projects are stored.
Each dataset and data source is associated with a specific project, with users requiring permissions in those projects to be able to work with the data within them.
Datasets in one project can be made up of data sources from another project. In that case, users need permissions in both projects to view and annotate the data.
The illustration below shows the relationship between these components and permissions:
- In the example below with Tenant A, all of the data sources are associated with Project A1, whilst there are datasets associated with both Project A1 and Project A2.
- If a user wanted to access datasets in Project A1 (i.e. dataset 1, 2 or 3), they would require viewing permissions for Project A1 only.
- But if a user wanted to access datasets in Project A2 (i.e. dataset 4, 5 or 6), they would require viewing permissions for both Projects A1 and A2, because the data sources are all located in Project A1.
- To view Project A1 or A2, the user would require access to Tenant A. To view Project B1, the user would require access to Tenant B. User permissions do not transfer across tenants.
- The concept of multiple cloud tenants applies only to Automation Cloud users.
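The access rules in the example above can be expressed as a small check: a user can view a dataset only if they have permissions, in the dataset's tenant, for the dataset's own project and for the project of every data source it uses. The sketch below is illustrative only. The class names, the `user_access` mapping and the helper functions are assumptions made for this example and do not reflect how the platform stores or evaluates permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    project: str


@dataclass(frozen=True)
class Dataset:
    name: str
    project: str
    tenant: str
    sources: tuple  # tuple of Source objects


def required_projects(dataset):
    """Projects a user needs viewing permissions in to open the dataset."""
    return {dataset.project} | {source.project for source in dataset.sources}


def can_view(dataset, user_access):
    """user_access maps a tenant name to the set of projects the user can view.

    Permissions held in a different tenant are ignored, mirroring the rule
    that permissions do not transfer across tenants.
    """
    granted = user_access.get(dataset.tenant, set())
    return required_projects(dataset) <= granted


# All data sources live in Project A1, as in the example above.
mailbox = Source("shared-mailbox", project="Project A1")
dataset_1 = Dataset("dataset 1", "Project A1", "Tenant A", (mailbox,))
dataset_4 = Dataset("dataset 4", "Project A2", "Tenant A", (mailbox,))

a2_only = {"Tenant A": {"Project A2"}}
a1_and_a2 = {"Tenant A": {"Project A1", "Project A2"}}

print(can_view(dataset_1, {"Tenant A": {"Project A1"}}))  # prints True
print(can_view(dataset_4, a2_only))                       # prints False
print(can_view(dataset_4, a1_and_a2))                     # prints True
```

Modelling the requirement as a set of projects keeps the check to a single subset test, and it makes clear why dataset 4 needs Project A1 as well: its data source is located there.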