Communications Mining user guide

Last updated Nov 10, 2025

Core concepts

This section provides an overview of the core platform concepts.

To learn more about the platform from an end-user perspective, check the Communications Mining™ user guide.

Concept	Description	Example
Source	In Communications Mining™, data is organized in data sources, or sources. Typically a source corresponds to a channel. An email mailbox, the results of a survey or a set of customer reviews are all examples of data that can be uploaded to Communications Mining as a data source. Multiple sources can be combined to build a model, so it is best to opt for the side of multiple sources rather than a single monolithic source.	The diagram shows email data (Source A which contains individual emails) and customer review data (Sources B and C which contain individual customer reviews). The customer review data is split into two sources based on the data origin, but will be combined into a single dataset for the purposes of building a common model.
Comment	Within sources, each individual piece of text communication is represented as a comment. A comment will always have an ID, timestamp, and text body, and additional fields based on what type of data it represents. For example, emails will have the expected email fields such as From, To, Cc, and so on.	The diagram shows how the available comment fields are used by the various comment types. For example, in an email comment the From field contains the sender address, while in a customer review comment it contains the review author. The metadata fields, shown at the bottom of each comment, are user-defined. Note how we use the same set of fields for both customer review sources: since we want to combine them into a single dataset, the data should be consistent in order to ensure good model performance.
Dataset	A dataset allows you to annotate one or more sources in order to build a model. A source can be included in multiple datasets. The set of all labels in a dataset is called a taxonomy.	The diagram shows two datasets built on top of the support mailbox data, and one dataset combining the customer review data. Note that even though Dataset 1 and Dataset 2 are based on the same data, their label taxonomy is different, because their use-cases, that is, analytics and automation, call for different sets of labels.
Model	The model is continuously updated as users annotate more data. In order to receive consistent predictions, a model version number needs to be specified when querying the model.
Label	Labels are applied when training a model, and are returned when querying the model for predictions. When labels are returned as predictions, they have an associated confidence score that indicates how likely the model thinks the prediction applies. To convert the prediction into a Yes or No answer, the confidence score needs to be checked against a threshold, which is chosen to represent a suitable precision/recall tradeoff.	Labels are assigned by Communications Mining users when training the model. The Communications Mining user interface helps the user annotate the most relevant comments, ensure that labels are applied consistently, and that enough comments are annotated to produce a well-performing model.

Was this page helpful?

PREVIOUSAPI docs

NEXTComments

Support and Services

Get The Help You Need

UiPath Academy

Learning RPA - Automation Courses

UiPath Forum

UiPath Community Forum

Trust and Security

Cookies Policy