Document Understanding - DocumentClassifier

Document Understanding classic user guide

DocumentClassifier - ML package

DocumentClassifier is a generic, retrainable model for classifying any type of structured or semi-structured document. The model is built from scratch, so it requires training before deployment.

This ML Package must be trained. If deployed without training first, deployment fails with an error stating that the model is not trained.

You can train the DocumentClassifier model by exporting your labeled data from Document Manager to an AI Center session.

Once the model is trained, you can use it with the following activities: Machine Learning Classifier and Machine Learning Classifier Trainer.

Important:

The Document Type Classifier does not support documents written in non-Latin alphabets, such as Hebrew, Chinese or Japanese. When document type classification is used with such documents, pipelines may fail or produce unexpected results, including encoding-related errors. Document extraction can still work with non-Latin languages if classification is not used.
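
One way to act on this limitation is to check the script of the document text before sending it to classification. The following is a minimal sketch in Python, not part of the UiPath product. It assumes you already have the document text as a string (for example, from an OCR step). The function names `non_latin_ratio` and `should_skip_classification` and the 0.3 threshold are hypothetical and chosen only for illustration.

```python
import unicodedata


def non_latin_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters that are not Latin letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    # A character counts as Latin when its Unicode name contains "LATIN",
    # which also covers accented letters such as "é".
    non_latin = sum(
        1 for ch in letters if "LATIN" not in unicodedata.name(ch, "")
    )
    return non_latin / len(letters)


def should_skip_classification(text: str, threshold: float = 0.3) -> bool:
    """Hypothetical pre-check: True when the text is mostly non-Latin.

    The threshold is an illustrative assumption, not a documented value.
    """
    return non_latin_ratio(text) > threshold
```

A workflow could call `should_skip_classification` on the extracted text and, when it returns `True`, route the document straight to extraction without the classification step.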
