Document Understanding User Guide
OCR Services
OCR services are used for the following purposes:
- At data labeling time, when importing documents into Data Manager. The engines available for this step are UiPath Document OCR, Google Cloud Vision OCR, and Microsoft Read OCR.
- At run time, when calling models from RPA workflows. The engines available for this step are all the engines integrated with the UiPath RPA platform, including the above, plus Abbyy Finereader, Microsoft OCR (legacy), Microsoft Project Oxford OCR, and Tesseract.
In production, we recommend calling the OCR using the Digitize Document activity in your workflow and passing the Document Object Model as input to the activity calling the ML model. For this purpose, you need to use the Machine Learning Extractor activity (Official feed).
As a quick convenience for testing purposes, you can also configure the OCR directly in AI Center (Settings window), but this is not recommended for production deployments.
This section details the hardware and software requirements for installing OCR Engines.
- Machines Involved: VM in the Cloud/On-Prem Box/Laptop
- Operating Systems: Windows (Windows 10)/Linux (Ubuntu/RedHat)
- Computing Engines: CPU/GPU
- OCR: UiPath Document OCR CPU/UiPath Document OCR GPU
OCR Engine | CPU Cores | RAM (GB) | Video RAM (GB) | HDD (GB) |
---|---|---|---|---|
UiPath CPU | 4 | 4 | | 50 |
UiPath GPU | 1 | 4 | 8 | 50 |
Linux Operating System
If you install the product on a VM in the cloud, the following operating systems are supported:
Software | Versions |
---|---|
Ubuntu | 20.04 LTS, 18.04 LTS, 16.04 LTS |
RHEL | 7.x |
If you install the product on a machine in an on-premises data center, the following operating systems are supported:
Software | Versions |
---|---|
Ubuntu | 20.04 LTS, 18.04 LTS, 16.04 LTS |
RHEL | 7.x |
Windows Operating System
See the official Docker website for the list of Windows operating systems supported.
On Windows, your machine requires virtualization to be enabled. We strongly recommend doing this only on physical machines such as laptops or desktop workstations. We do not support running Docker on Windows in Virtual Machines (Cloud or Datacenter) using Nested Virtualization.
Browsers
Software | Versions |
---|---|
Google Chrome | 50+ |
- Data Manager needs access to the OCR engine at <IP>:<port_number>. The OCR engine might be UiPath Document OCR on-premises, Google Cloud Vision OCR, Microsoft Read Azure, or Microsoft Read on-premises.
- Robots need access to the OCR engine at <IP>:<port_number>. Same OCR options as above.
- OCR engines need access to the Licensing server hosted by UiPath in Azure, on port 443.
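The first two requirements above can be spot-checked from the Data Manager or Robot machine. The following is a minimal sketch, assuming `curl` is installed; the function name `can_connect` and the variables `OCR_HOST` and `OCR_PORT` are hypothetical and are not part of any UiPath tooling.

```shell
#!/bin/sh
# Sketch: check that an OCR engine endpoint accepts connections.
# can_connect, OCR_HOST and OCR_PORT are illustrative names (assumptions).

can_connect() {
    host="$1"
    port="$2"
    # --noproxy '*' forces a direct connection, which is what is being tested.
    # --connect-timeout and --max-time bound the wait.
    # Any HTTP answer (even an error page) makes curl exit with 0,
    # so the response body is discarded.
    curl -s -o /dev/null --noproxy '*' --connect-timeout 5 --max-time 10 \
        "http://${host}:${port}/"
}

# The check runs only when both variables are set, for example:
#   OCR_HOST=<IP> OCR_PORT=<port_number> sh check_ocr.sh
if [ -n "${OCR_HOST:-}" ] && [ -n "${OCR_PORT:-}" ]; then
    if can_connect "$OCR_HOST" "$OCR_PORT"; then
        echo "OCR engine reachable at ${OCR_HOST}:${OCR_PORT}"
    else
        echo "Cannot reach OCR engine at ${OCR_HOST}:${OCR_PORT}" >&2
    fi
fi
```

A non-zero exit status from `can_connect` only indicates that no HTTP response arrived within the time limits; it does not say whether a firewall, a wrong port, or a stopped container is the cause.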
If you only want to serve pre-trained out-of-the-box models, you can run an OCR engine on your Windows 10 laptop. Make sure Docker Desktop has 8 GB of RAM available.
If you want to try training a custom model as a demo on a small volume of data (under 100 documents), you can run the OCR Engine on an environment with a limit of 4GB of RAM. For small cases like this, a GPU for the OCR engine may not be necessary.
OCR Engines are containerized applications that run on top of Docker. You cannot run them on the same machine as AI Center on-premises. To run them on a separate machine, you can use the prerequisites installer commands below to set up Docker and, optionally, the NVIDIA drivers. Do not run these scripts on the machine where AI Center will be installed.
Run the following command and check the size of the partition that has / in the rightmost column:
df -h
If the size of that partition is smaller than the minimal storage requirements, then see the Configuring the Docker Data Folder section.
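The same check can be scripted. The following is a minimal sketch, assuming a POSIX `df` and `awk`; the function name `partition_size_gb` and the variable `REQUIRED_GB` are hypothetical, and the 50 GB default is taken from the HDD column of the hardware requirements table.

```shell
#!/bin/sh
# Sketch: compare the size of the partition mounted at / with a minimum.
# partition_size_gb and REQUIRED_GB are illustrative names (assumptions).

REQUIRED_GB="${REQUIRED_GB:-50}"

partition_size_gb() {
    # df -P prints one POSIX-format line per filesystem; -k reports 1 KiB blocks.
    # Column 2 of the second line is the total size of the partition.
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $2 / 1024 / 1024 }'
}

size="$(partition_size_gb /)"
if [ "$size" -lt "$REQUIRED_GB" ]; then
    echo "Partition / is ${size} GB; at least ${REQUIRED_GB} GB is required." >&2
else
    echo "Partition / is ${size} GB; the ${REQUIRED_GB} GB requirement is met."
fi
```

The value is rounded down to whole gigabytes, so a partition slightly under the threshold is reported as too small.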
Linux
Follow instructions in the official Docker documentation, or run this command:
curl -fsSL https://raw.githubusercontent.com/UiPath/Infrastructure/master/ML/du_prereq_installer.sh | sudo bash -s -- --env cpu
If this command fails, your Linux operating system is incompatible, and you need to ask your IT department to install Docker on the machine by following the instructions in the official Docker documentation.
Azure VMs
If you are installing on a VM in Azure, then use this command instead:
curl -fsSL https://raw.githubusercontent.com/UiPath/Infrastructure/master/ML/du_prereq_installer.sh | sudo bash -s -- --env cpu --cloud azure
Windows 10
Download and install Docker Desktop. On recently updated versions of Windows 10, you need WSL 2 installed, so when a dialog says "WSL 2 Installation is Incomplete", click the Restart button.
Create a folder on the host machine (for example, workdir for Data Manager) and include the path to it in the docker run command, after the -v flag. When doing this on Windows, Docker Desktop pops up a notification. Click Share it to proceed.
Fill in the path to the folder where you want Docker to hold its files, then run this command and then reboot:
curl -fsSL https://raw.githubusercontent.com/UiPath/Infrastructure/master/ML/du_prereq_installer.sh | sudo bash -s -- --change-mount </path/to/folder>
Docker helps ship software in Docker "images". A running instance of an image is called a container. A container can be stopped, removed, and started again as many times as needed, as long as the image is available.
Once the image is removed, it is lost. The only way to recover it is to pull it again from the registry it came from if it is still available there.
Containers access folders and ports on the host machine through the -v and -p arguments, respectively.
In the table below you can find a list of common commands for the Docker command line.
For the full list of base Docker commands, see the official Docker documentation.
Command | Description |
---|---|
 | Log in to a registry. |
 | Download an image from a registry. The tag latest is commonly used to refer to the latest version of an image. |
OR | Run an image in detached mode, while mapping port 80 from inside the container to port 5000 on the host machine, and <container folder> to <host folder>. Detached mode means the container does not block the terminal, so you can perform other operations on the same terminal. |
 | List images present on your system. |
 | List all containers (both running and stopped). The container id is used to refer to that container when one needs to stop it or remove it, for instance. |
 | Stop the container. This command does not remove the container, but it is required before removing it. |
docker rm <container id> | Remove the container. The container must be stopped beforehand. |
 | Display the logs of the container. |
 | Remove one or more images from the system. This helps save storage space, as images can take up a lot of space. |
 | Remove all stopped containers. |
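The operations described in the table above correspond to standard Docker CLI subcommands. The sketch below shows one common form of each, reordered slightly so that the sequence makes sense when run from top to bottom. It is an illustration, not the original table content: the registry, image, folder, and container names are hypothetical placeholders, and the `run` helper only prints each command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of common Docker CLI commands for the operations in the table.
# Every value below is a hypothetical placeholder (assumption).
REGISTRY="${REGISTRY:-registry.example.com}"
IMAGE="${IMAGE:-$REGISTRY/my-image:latest}"
HOST_DIR="${HOST_DIR:-/tmp/workdir}"
CONTAINER_DIR="${CONTAINER_DIR:-/data}"
CONTAINER_ID="${CONTAINER_ID:-my-container}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run docker login "$REGISTRY"      # log in to a registry (prompts for credentials)
run docker pull "$IMAGE"          # download an image from a registry
# Detached run: host port 5000 -> container port 80, host folder -> container folder
run docker run -d -p 5000:80 -v "$HOST_DIR:$CONTAINER_DIR" "$IMAGE"
run docker images                 # list images present on the system
run docker ps -a                  # list all containers, running and stopped
run docker logs "$CONTAINER_ID"   # display the logs of a container
run docker stop "$CONTAINER_ID"   # stop the container (required before removal)
run docker rm "$CONTAINER_ID"     # remove the stopped container
run docker rmi "$IMAGE"           # remove an image to free storage space
run docker container prune        # remove all stopped containers (asks to confirm)
```

In real use, the container id passed to `docker logs`, `docker stop`, and `docker rm` comes from the output of `docker ps -a`.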
Command | Description |
---|---|
 | Run a command as administrator. Try this whenever you get a Permission Denied error. |
 | Display information about the network interfaces in your system. Find the IP of your machine in the eth0 or docker0 sections. |
 | Display the path to the current folder. |
 | List the content of a directory. |
 | Go to a different folder. |
 | Create a new folder. |
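The operations in the table above map to standard Linux commands. The sketch below runs the harmless ones inside a temporary folder. It is an illustration rather than the original table content: the folder name `ocr-demo` is hypothetical, and `sudo` and the network commands are shown only as comments because their availability and output depend on the system.

```shell
#!/bin/sh
# Sketch: basic terminal commands, run in a scratch folder.
# "ocr-demo" is a hypothetical folder name (assumption).
scratch="$(mktemp -d)"   # temporary folder, so nothing else is touched
cd "$scratch"            # go to a different folder
pwd                      # display the path to the current folder
mkdir ocr-demo           # create a new folder
ls                       # list the content of the current directory

# Not executed here:
#   sudo <command>       run a command as administrator
#   ifconfig             show network interfaces (needs the net-tools package
#                        on many distributions; "ip addr" is a common alternative)
```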
Linux
To install the prerequisites on a GPU machine, run this command:
curl -fsSL https://raw.githubusercontent.com/UiPath/Infrastructure/master/ML/du_prereq_installer.sh | sudo bash -s -- --env gpu
On some systems, you might need to run the command twice or reboot the system to install all requirements.
Azure Specific: To use the NV-series virtual machines, you need to either install the NVIDIA driver before executing the above command, or use a Driver Extension from Azure to install the NVIDIA driver that matches the GPU model of that tier.
Azure VMs
If you are installing on a VM in Azure, then use this command instead:
curl -fsSL https://raw.githubusercontent.com/UiPath/Infrastructure/master/ML/du_prereq_installer.sh | sudo bash -s -- --env gpu --cloud azure
UiPath Document OCR is a proprietary OCR technology of UiPath, supporting characters used by the following Latin script languages: English, French, German, Italian, Portuguese, Romanian, and Spanish. Text in other languages is recognized, but without accents. For instance, "Ł" in Polish is recognized as "L". Pages processed using UiPath Document OCR are not counted towards the page quota purchased along with the Document Understanding Enterprise license, so UiPath Document OCR is free to use.
UiPath Document OCR is available with the following deployment types:
- cloud public URLs - more details on the Public Endpoints page
- on-premises (including air-gapped) using UiPath.DocumentUnderstanding.OCR.LocalServer activity package (does not require Internet access)
- on-premises regular standalone docker container (requires Internet access)
- on-premises air-gapped standalone docker container (does not require Internet access)
- on-premises as ML Skill in AI Center regular deployment (requires Internet access)
- on-premises as ML Skill in AI Center air-gapped deployment (does not require Internet access)
- To install UiPath Document OCR standalone docker container, run these commands:
docker login aiflprodweacr.azurecr.io -u *** -p **
docker pull aiflprodweacr.azurecr.io/uipath-ocr:latest
- Run using CPUs
docker run -d -p 5000:80 aiflprodweacr.azurecr.io/uipath-ocr:latest LicenseAgreement=accept
- Run using GPU
docker run -d -p 5000:80 --gpus all aiflprodweacr.azurecr.io/uipath-ocr:latest LicenseAgreement=accept
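The choice between the CPU and GPU variants can be scripted. The following is a minimal sketch that only prints the matching command from above and does not start a container. The function name `ocr_run_command` and the variable `HOST_PORT` are hypothetical, and using the presence of `nvidia-smi` as a GPU check is an assumption rather than a documented UiPath procedure.

```shell
#!/bin/sh
# Sketch: print the documented docker run command for CPU or GPU mode.
# ocr_run_command and HOST_PORT are illustrative names (assumptions).
OCR_IMAGE="aiflprodweacr.azurecr.io/uipath-ocr:latest"
HOST_PORT="${HOST_PORT:-5000}"

ocr_run_command() {
    # $1 is "gpu" or "cpu"; anything other than "gpu" falls back to CPU.
    if [ "$1" = "gpu" ]; then
        echo "docker run -d -p ${HOST_PORT}:80 --gpus all ${OCR_IMAGE} LicenseAgreement=accept"
    else
        echo "docker run -d -p ${HOST_PORT}:80 ${OCR_IMAGE} LicenseAgreement=accept"
    fi
}

# nvidia-smi ships with the NVIDIA driver; finding it on PATH is used here
# as a rough indication that a GPU is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    mode=gpu
else
    mode=cpu
fi

echo "Detected mode: ${mode}"
ocr_run_command "$mode"
```

Printing the command instead of executing it keeps the license acceptance (`LicenseAgreement=accept`) an explicit manual step.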
In AI Center, when creating a new ML Package, the optional OCR configuration section at the bottom of the screen lets you define the server-side OCR Engine type, the OCR URL, and the OCR Key. The OCR Key is the API Key you obtain from the Licenses section of your Automation Cloud account. The Machine Learning Extractor activity uses this OCR configuration if you check the "UseServerSideOCR" box. This box is unchecked by default, in which case the extractor uses the OCR in the Digitize Document activity.
Important: UiPath Document OCR container cannot run on the same machine as AI Center On-Premises.
The Google Cloud Vision OCR endpoint can be obtained from the Google Cloud Platform documentation. The ApiKey can be obtained from your Google Cloud Platform Console if you have a Google Cloud Vision service in your subscription.
The table below shows how to configure the six supported OCR engine types in both Data Manager and AI Center.
OCR Engine | OCR Method | OCR Key | OCR URL |
---|---|---|---|
UiPath | UiPath Document OCR | UiPath Automation Cloud Document Understanding API Key Enterprise Plan | |
 | Google Cloud Vision OCR | GCP Console API Key | |
Microsoft Read 2.0 On-Prem | Microsoft Read OCR | None | |
Microsoft Read 2.0 Azure | Microsoft Read OCR | API Key for your resource from Azure Portal | |
Microsoft Read 3.2 On-Prem | Microsoft Read OCR | None | |
Microsoft Read 3.2 Azure | Microsoft Read OCR | API Key for your resource from Azure Portal | |
- About OCR Services
- Requirements
- Hardware Requirements
- Software Requirements
- Network Configuration
- Minimal Trial or Proof-of-Concept Configuration
- Prerequisites
- Installing Docker
- Configuring the Docker Data Folder (Linux Only)
- Docker Cheat Sheet
- Linux Terminal Cheat Sheet
- (Optional) GPU Machine Install
- Installation
- UiPath Document OCR
- Google Cloud OCR
- Microsoft Read
- Configuring OCR Service in Data Manager and AI Center Document Understanding ML Packages