- Understanding the data structure and permissions
- Create a data source in the GUI
- Uploading a CSV file into a source
- Create a new dataset
- Multilingual sources and datasets
- Enabling sentiment on a dataset
- Amend a dataset's settings
- Delete verbatims via the UI
- Delete a dataset
- Export a dataset
- Using Exchange Integrations
- Preparing Data for .CSV Upload
- Understanding labels, entities and metadata
- Label hierarchy and best practice
- Defining your taxonomy objectives
- Analytics vs. automation use cases
- Turning your objectives into labels
- Building your taxonomy structure
- Taxonomy design best practice
- Importing your taxonomy
- Overview of the model training process
- Generative Annotation (NEW)
- Understanding the status of your dataset
- Model training and labelling best practice
- Training with label sentiment analysis enabled
- Introduction to 'Refine'
- Precision and recall explained
- Precision and recall
- How does Validation work?
- Understanding and improving model performance
- Why might a label have low average precision?
- Training using 'Check label' and 'Missed label'
- Training using 'Teach label' (Refine)
- Understanding and increasing coverage
- Improving Balance and using 'Rebalance'
- When to stop training your model
Understanding the status of your dataset
Each time that you apply labels or review entities in your dataset, your model will retrain and a new model version is created. To understand more about using different model versions, see here.
When the model retrains, it takes the latest information it's been supplied with and recomputes all of its predictions across the dataset. This process begins when you start training and often when Communications Mining finishes applying the predictions for one model version, it is already recalculating the predictions for a newer model version. When you stop training after a period of time, Communications Mining will shortly catch up and apply the predictions that reflect the very latest training completed in the dataset.
This process can take some time, depending on the amount of training completed, the size of the dataset, and the number of labels in the taxonomy. Communications Mining has a helpful status feature to help users understand when their model is up to date, or if it is retraining and how long that is expected to take.
When you are in a dataset, one of these two icons at the top of the page will indicate its current status:
|This icon indicates that the dataset is up to date and the predictions from the latest model version have been applied.
|This indicates that the model is retraining and predictions may not be up to date.
If you hover over the icon with your mouse, you'll see more detail about the status as shown below: