OCR engines are used for the following purposes:
- At data labeling time, when importing documents into Data Manager. The engines available for this step are UiPath Document OCR (free in cloud or on premises), Google Cloud OCR (cloud only), Microsoft Read OCR (cloud or on premises) and Omnipage (on premises only).
- At run time, when calling models from RPA workflows. The engines available for this step are all the engines integrated with the UiPath RPA platform, including the above plus Abbyy Finereader, Microsoft OCR (legacy), Microsoft Project Oxford OCR and Tesseract.
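The availability described in the two bullets above can be captured as a small lookup table, for example to validate a configuration before importing documents. This is only a sketch: the dictionary layout, the deployment labels (`"cloud"`, `"on-premises"`) and the function names are assumptions made for illustration and are not part of any UiPath API. The list above does not state a deployment mode for the run-time-only engines, so none is recorded for them.

```python
from typing import Dict, List, Set

# Engines available at data labeling time, with the deployment modes
# stated in the list above. Structure and labels are illustrative only.
LABELING_ENGINES: Dict[str, Set[str]] = {
    "UiPath Document OCR": {"cloud", "on-premises"},
    "Google Cloud OCR": {"cloud"},
    "Microsoft Read OCR": {"cloud", "on-premises"},
    "Omnipage": {"on-premises"},
}

# Engines that are additionally available at run time.
# Their deployment modes are not given in the text, so they are not modeled.
RUNTIME_ONLY_ENGINES: List[str] = [
    "Abbyy Finereader",
    "Microsoft OCR (legacy)",
    "Microsoft Project Oxford OCR",
    "Tesseract",
]


def labeling_engines_for(deployment: str) -> List[str]:
    """Return the labeling-time engines usable in the given deployment."""
    return sorted(
        name for name, modes in LABELING_ENGINES.items() if deployment in modes
    )


def runtime_engines() -> List[str]:
    """Return every engine usable at run time (labeling engines plus extras)."""
    return sorted(list(LABELING_ENGINES) + RUNTIME_ONLY_ENGINES)


def can_use_for_labeling(engine: str) -> bool:
    """True if the engine can be used when importing into Data Manager."""
    return engine in LABELING_ENGINES


if __name__ == "__main__":
    print(labeling_engines_for("cloud"))
    print(can_use_for_labeling("Tesseract"))
```

A check like `can_use_for_labeling("Tesseract")` makes the asymmetry explicit: an engine may be usable from a workflow at run time without being selectable when importing documents for labeling.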
In production, we recommend calling the OCR using the Digitize Document activity in your workflow and passing the Document Object Model as input to the activity calling the ML model.
For this purpose you may use the Machine Learning Extractor activity (Official feed) or the Extract Semi-Structured Document activity (Connect feed).
As a quick convenience for testing purposes, you can also configure the OCR directly in AI Fabric (Settings window), but this is not recommended for production deployments.
UiPath OCR is a proprietary OCR technology of UiPath, supporting characters used by the following Latin-script languages: English, French, German, Italian, Portuguese, Romanian and Spanish. Text in other languages is recognized, but without accents; for instance, “Ł” in Polish is recognized as “L”. Pages processed using UiPath OCR are not counted towards the page quota purchased along with the Document Understanding Enterprise license, so UiPath OCR is free to use.
UiPath OCR is available both on premises as a Docker container and in the cloud as a service API at the URL: https://du.uipath.com/ocr. See the full description of the available URLs on the Licenses page.