Document Understanding Activities
Last updated Nov 26, 2024

OmniPage OCR

UiPath.OmniPage.Activities

Important: Handwriting recognition works only for hand-printed text, where the characters are not connected to each other. The ideal character size is between 25 and 45 pixels.

Description

Extracts a string and its information from an indicated UI element or image using the OmniPage OCR engine. OmniPage OCR can be used as an alternative to the other OCR engines in all activities that require an OCR engine. Examples of activities that can be used together with OmniPage OCR: Click OCR Text, Hover OCR Text, Double Click OCR Text, Get OCR Text, Find OCR Text Position, Digitize Document, CV Screen Scope, CV Get Text.

Note: Starting with v1.9.0, the UiPath.OmniPage.Activities package targets .NET 5. If you run the package in a workflow that is not Windows-Legacy, .NET 5 must be installed on your machine.
Note: The OmniPage OCR activity is compatible with the UiPath.IntelligentOCR.Activities package, v2.0.0 or higher and can be used in any OCR context.

Project compatibility

Windows-Legacy | Windows

Configuration

Properties panel

Common

  • DisplayName - The display name of the activity.

Input

  • Image - The image that you want to process. This field supports only Image variables.

Misc

  • Private - If selected, the values of variables and arguments are no longer logged at Verbose level.

Options

  • EnginePack - Specifies which embedded engine must be used for image processing. The following options are available:
    • Basic - supports a wide range of languages;
    • Extended - contains extra support for Asian, Arabic, Thai, Hebrew, and Vietnamese languages.
    Check the list of all available languages for the Basic pack at the end of this page.
    Note: In order to use the Extended engine, you must manually install the UiPath.OmniPage.Bundle.Extended package in the current project from the Package Manager.
  • ExtractWords - If selected, extracts the on-screen position of all detected words.
  • Language - The language used by the OCR engine. The default option is auto, meaning that the language is detected automatically. To use multiple languages, separate them with commas.
    Note:

    You can set multiple languages at the same time. For example, set "eng,fra" to process images that contain both English and French content.

    Note that Japanese, Korean, and Chinese language settings call up a dedicated recognition engine. Only one of these languages should be selected at a time and not combined with any non-Asian language.

    Short embedded texts in English can be recognized without English being selected as a recognition language.

  • Profile - Choose a pre-processing profile for the specified image or UI element to achieve a better OCR read. The following options are available:
    • None - does not apply a pre-processing profile, this is the default option;
    • Screen - pre-processing suitable for remote desktop applications;
    • Scan - pre-processing suitable for scanned files;
    • Legacy - uses the engine's default settings for pre-processing images.
  • Scale - The scaling factor of the selected UI element or image. The higher the number is, the more you enlarge the image. This can provide a better OCR read and it is recommended with small images. If you want to scale down, values between 0 and 1 are also accepted. By default, the value is 1.
    Note: If you want to use this OCR activity from package UiPath.OmniPage.Activities v1.8.0 in Studio v19.10, install the UiPath.CoreIPC package, version 2.0.1 or higher.
    Important: Large-size images may result in an error when the scaling factor is higher than 1.
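The rules for the Language property can be checked before a value is passed to the activity. The following Python sketch is illustrative only and is not part of the UiPath package: the function names are hypothetical, the codes are taken from the Extended Pack table on this page, and the check is stricter than the note above because it flags a Japanese, Korean, or Chinese code combined with any other code.

```python
# Codes that, according to the Language note, call up a dedicated
# recognition engine (Japanese, Korean, Simplified and Traditional Chinese).
CJK_CODES = {"jpn", "kor", "qcs", "qct"}


def parse_language_setting(setting: str) -> list[str]:
    """Split a comma-separated Language value into lowercase codes."""
    parts = [part.strip().lower() for part in setting.split(",")]
    return [part for part in parts if part]


def check_language_setting(setting: str) -> list[str]:
    """Return a list of problems; an empty list means no rule was violated."""
    codes = parse_language_setting(setting)
    if not codes:
        return ["no language given; use 'auto' for automatic detection"]

    problems = []
    cjk = [code for code in codes if code in CJK_CODES]
    if len(cjk) > 1:
        problems.append("select only one of Japanese, Korean, or Chinese")
    if cjk and len(cjk) < len(codes):
        # Assumption: any additional code is treated as a conflict.
        problems.append("do not combine Japanese, Korean, or Chinese with other languages")
    return problems
```

For example, `check_language_setting("eng,fra")` returns an empty list, while `check_language_setting("jpn,eng")` reports one problem.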

Output

  • Result - The text extracted by the OCR engine along with its on-screen position, stored in a KeyValuePair<Rectangle,String>. This field supports only KeyValuePair<Rectangle,String> variables.
  • Text - The text extracted by the OCR engine, stored in a String variable. This field supports only String variables.
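
To illustrate how position data of this shape might be consumed, here is a minimal Python sketch. It is a stand-in, not the activity's API: the `Rect` class mimics a .NET Rectangle, the helper names and sample values are hypothetical, and it assumes that one rectangle and string pair is available per detected word.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Python stand-in for a .NET Rectangle: top-left corner plus size, in pixels."""
    x: int
    y: int
    width: int
    height: int


def find_word(words: list[tuple[Rect, str]], target: str) -> Optional[Rect]:
    """Return the rectangle of the first word equal to target, ignoring case."""
    wanted = target.casefold()
    for rect, text in words:
        if text.casefold() == wanted:
            return rect
    return None


def center(rect: Rect) -> tuple[int, int]:
    """Return the center point of a rectangle, for example as a click target."""
    return (rect.x + rect.width // 2, rect.y + rect.height // 2)


# Hypothetical sample data; real values would come from the Result output.
sample_words = [
    (Rect(10, 20, 60, 18), "Invoice"),
    (Rect(80, 20, 50, 18), "Total"),
]
```

With this sample, `find_word(sample_words, "total")` returns the second rectangle, and `center` turns it into a single point.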

Supported Languages

The following tables list all the languages supported by OmniPage OCR, together with their corresponding language codes.

Table 1. Supported languages and language codes for OmniPage OCR - Basic Pack Languages
 

Language | Code
Afrikaans | AFR
Albanian | SQI
Aymara | AYM
Basque | EUS
Bemba | BEM
Blackfoot | BLA
Brazilian | QBP
Breton | BRE
Bugotu | BGT
Bulgarian | BUL
Belorussian | BEL
Catalan | CAT
Chamorro | CHA
Chechen | CHE
Corsican | COS
Croatian | HRV
Crow | CRO
Czech | CES
Danish | DAN
Dutch | NLD
English | ENG
Eskimo (Inuit) | QES
Esperanto | EPO
Estonian | EST
Faroese | FAO
Fijian | FIJ
Finnish | FIN
French | FRA
Frisian | FRY
Friulian | FUR
Gaelic (Irish) | GLE
Gaelic (Scottish) | GLA
Galician | GLG
Ganda | LUG
German | DEU
Greek | ELL
Guarani | GRN
Hani * | HNI
Hawaiian | HAW
Hungarian | HUN
Icelandic | ISL
Ido | IDO
Indonesian | IND
Interlingua | INA
Italian | ITA
Kasub | CSB
Kawa * | WBM
Kikuyu | KIK
Kongo | KON
Kpelle | KPE
Kurdish * | KUR
Latin | LAT
Latvian | LAV
Lithuanian | LIT
Luba | LUA
Luxembourgish | LTZ
Macedonian | MKD
Malagasy | MLG
Malay | MSA
Malinke | MLQ
Maltese | MLT
Maori | MRI
Mayan | MYN
Miao * | HMN
Minangkabau | MIN
Mohawk | MOH
Moldavian | MOL
Nahuatl | NAH
Norwegian | NOR
Nyanja | NYA
Occidental | OCC
Ojibway | OJI
Papiamento | PAP
Pidgin English | TPI
Polish | POL
Portuguese | POR
Provençal | PRV
Quechua | QUE
Rhaetic | ROH
Romanian | RON
Romany | ROM
Rwanda | KIN
Rundi | RUN
Russian | RUS
Sami | SMI
Sami, Lule | SMJ
Sami, Northern | SME
Sami, Southern | SMA
Samoan | SMO
Sardinian | SRD
Serbian | SRP
Serbian, Latinic | QSL
Shona | SNA
Sioux | DAK
Slovak | SLK
Slovenian | SLV
Somali | SOM
Sorbian (Wend) | WEN
Sotho | SOT
Spanish | SPA
Sundanese | SUN
Swahili | SWA
Swazi | SSW
Swedish | SWE
Tagalog | TGL
Tahitian | TAH
Pirez | QTI
Tongan | TON
Tswana (Chuana) | TSN
Tun * | TUG
Turkish | TUR
Ukrainian | UKR
Visayan | QIS
Welsh | CYM
Wolof | WOL
Xhosa | XHO
Zapotec | ZAP
Zulu | ZUL

* This language can be handled only if it is written using the Latin alphabet.

 
Table 2. Supported languages and language codes for OmniPage OCR - Extended Pack Languages
 

Language | Code
All languages included in the Basic Pack | See Table 1
Japanese | JPN
Simplified Chinese | QCS
Traditional Chinese | QCT
Korean | KOR
Thai | THA
Arabic | ARA
Hebrew | HEB
Vietnamese (Latin) | VIE

Note: The Extended Pack can be used with the ISO/DIS 639-3 language codes mentioned above, with the ISO 639-1 and ISO 639-2 language codes, or with the actual name of the language.