Document Understanding User Guide

Last updated Apr 4, 2025

Deploy UiPathDocumentOCR

Create a UiPathDocumentOCR ML package in AI Center.

For online installation, the UiPathDocumentOCR model is already included in the Out of the box packages section. Go to ML Packages > Out of the box packages > UiPath Document Understanding > UiPathDocumentOCR, and click Submit.

For offline installation, go to the ML Packages tab in the left sidebar of AI Center and create a new package. Name the package and upload the package file that you downloaded from this page. Choose JSON as the input type and the corresponding Python language, then create the package.

Note: When creating a UiPathDocumentOCR ML Package in AI Center, it cannot be named ocr or OCR. Make sure to choose another name.

Go to ML Skills and create a new ML skill for the UiPathDocumentOCR package you created.

Use Advanced Infra Settings to update the replica count of the deployment (the number of replicas should ideally equal the number of nodes). If you are not using GPU machines, also maximize the CPU (at least 4) and RAM requests; otherwise UiPathDocumentOCR processing will be slow and may fail.

The OCR engine needs a GPU for optimal performance, and a GPU is recommended for production workloads. If no GPU is available, the engine can still run on CPU, but it requires more resources than the defaults. Adjust the Advanced Infra Settings as follows:

Replicas: increase this value if UiPathDocumentOCR is used concurrently. If you only use UiPathDocumentOCR to import documents into a single Data Labeling session at a time, and no other UiPath workflows use it, 1 replica is enough. Otherwise, increase the number of replicas. There is no "magic" number; finding the right value takes some trial and error. Do not use more than 2 replicas on a single-node installation. Ideally, the replica count should equal the number of nodes in the cluster (1 replica per node). If you need more parallelism than that, increasing the CPU helps.
CPU: set at least 4 for each replica, and make sure the cluster has the resources to cover it. There is no "magic" number here either, but more CPU means faster processing. Test with your own scenarios to find out what is enough.
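
The sizing guidance above can be written down as a small pre-deployment check. The following Python sketch is illustrative only: it does not call AI Center, and the names OcrSkillPlan and review_plan are hypothetical helpers invented for this example. It only encodes the rules of thumb stated in this section (at least 4 CPU per replica when no GPU is used, no more than 2 replicas on a single node, and ideally 1 replica per node).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class OcrSkillPlan:
    """Hypothetical description of a planned ML Skill deployment (not an AI Center type)."""
    replicas: int
    cpu_per_replica: int
    node_count: int
    uses_gpu: bool = False


def review_plan(plan: OcrSkillPlan) -> list[str]:
    """Return warnings where the plan departs from the guidance in this section."""
    warnings: list[str] = []

    if plan.replicas < 1:
        warnings.append("At least 1 replica is needed.")

    # CPU-only deployments need more than the default resources.
    if not plan.uses_gpu and plan.cpu_per_replica < 4:
        warnings.append("Without a GPU, use at least 4 CPU for each replica.")

    # Single-node installations should not run more than 2 replicas.
    if plan.node_count == 1 and plan.replicas > 2:
        warnings.append("Do not use more than 2 replicas on a single-node installation.")

    # On multi-node clusters, 1 replica per node is the suggested starting point.
    if plan.node_count > 1 and plan.replicas != plan.node_count:
        warnings.append("Ideally, the replica count equals the number of nodes.")

    return warnings


if __name__ == "__main__":
    plan = OcrSkillPlan(replicas=3, cpu_per_replica=2, node_count=1)
    for warning in review_plan(plan):
        print(warning)
```

An empty list means the plan matches the stated rules of thumb. It does not mean the deployment is fast enough; that still needs testing under your own workload.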

It can take up to 30 minutes for the ML Skill to be ready. You may need to refresh the AI Center page to see the status change. Once the ML Skill is available, double-click the ML Skill and go to Modify current deployment.