Balance
'Balance' is a term used to describe how well the training data for a model represents the dataset as a whole.
When the platform assesses how balanced a model is, it is essentially looking for annotating bias that can cause a mismatch between the training data and the wider dataset.
To do this, it uses an annotating bias model that compares the reviewed and unreviewed data to ensure that the annotated data is representative of the whole dataset. If the data is not representative, model performance measures can be misleading and potentially unreliable.
Annotating bias is typically the result of an imbalance in the training modes used to assign labels, particularly if too much text-based 'Search' is used and not enough 'Shuffle'.
The 'Rebalance' training mode shows messages that are under-represented in the reviewed set. Annotating examples in this mode will help to quickly address any imbalances in the dataset.
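As a rough illustration of the comparison described above, the sketch below trains a simple discriminator to tell reviewed messages apart from unreviewed ones: if the discriminator cannot separate the two sets, the reviewed data is broadly representative; if it can, there is likely annotating bias. This is an assumption-based sketch of the general idea, not the platform's actual annotating bias model, and the function and variable names are hypothetical.

```python
# Minimal sketch: estimate "balance" by checking how easily a classifier can
# distinguish reviewed (annotated) messages from unreviewed ones.
# AUC near 0.5 -> indistinguishable -> well balanced; AUC near 1.0 -> biased.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def balance_score(reviewed_texts, unreviewed_texts):
    texts = list(reviewed_texts) + list(unreviewed_texts)
    # 1 = reviewed, 0 = unreviewed
    labels = [1] * len(reviewed_texts) + [0] * len(unreviewed_texts)

    features = TfidfVectorizer(max_features=5000).fit_transform(texts)
    discriminator = LogisticRegression(max_iter=1000)

    # Cross-validated AUC of the reviewed-vs-unreviewed discriminator.
    auc = cross_val_score(
        discriminator, features, labels, scoring="roc_auc", cv=5
    ).mean()

    # Map AUC to a rough balance value: 1.0 = indistinguishable (balanced),
    # 0.0 = perfectly separable (heavily biased reviewed set).
    return max(0.0, 1.0 - 2 * (auc - 0.5))


# Hypothetical usage:
# score = balance_score(reviewed_msgs, unreviewed_msgs)
# print(f"Balance: {score:.2f}")
```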