Document Understanding User Guide for Modern Experience
Last updated Apr 26, 2024
OCR
Supports Taxonomy Manager, Digitization, Keyword Based Classifier, Intelligent Keyword
Classifier, ML Classifier, RegEx Based Extractor, Form Extractor, ML Extractor,
Validation Station.
Tip: Choosing the right OCR engine for your documents is simple. By default, use
UiPath Document OCR, which receives regular updates and improvements. If it doesn't
support your document's language, or it performs poorly on your documents, switch to
one of the other OCR engines, such as UiPath Extended Languages OCR.
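The selection rule in the tip above can be sketched in a few lines of Python. This is only an illustration, not a UiPath API: the function name `choose_ocr_engine`, its parameters, and the placeholder language codes `XX` and `YY` are assumptions made for the example. The real per-engine language support should be taken from the table below.

```python
from typing import Mapping, Optional, Sequence, Set

# Engine name taken from the tip above; everything else here is hypothetical.
DEFAULT_ENGINE = "UiPath Document OCR"


def choose_ocr_engine(
    language_code: str,
    supported: Mapping[str, Set[str]],
    fallback_order: Sequence[str],
    default_performs_well: bool = True,
) -> Optional[str]:
    """Return the default engine when it fits, else the first fitting fallback.

    `supported` maps an engine name to the set of upper-case language codes
    that the caller knows the engine supports.
    """
    code = language_code.upper()

    # Prefer the default engine if it supports the language and works well.
    if default_performs_well and code in supported.get(DEFAULT_ENGINE, set()):
        return DEFAULT_ENGINE

    # Otherwise try the other engines in the caller's preferred order.
    for engine in fallback_order:
        if engine != DEFAULT_ENGINE and code in supported.get(engine, set()):
            return engine

    # No known engine supports this language code.
    return None


# Placeholder data for illustration only; XX and YY are not real language codes.
example_support = {
    "UiPath Document OCR": {"XX"},
    "UiPath Extended Languages OCR": {"XX", "YY"},
}
fallbacks = ["UiPath Extended Languages OCR"]

print(choose_ocr_engine("xx", example_support, fallbacks))
print(choose_ocr_engine("yy", example_support, fallbacks))
```

Passing `default_performs_well=False` models the second case in the tip, where the default engine supports the language but does not give good results on the documents at hand.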
Language (Language Code) | UiPath Document OCR | UiPath Document OCR CPU | UiPath Extended Languages OCR (Preview) | Chinese, Japanese, Korean OCR |
---|---|---|---|---|
Adyghe (ADY) | ||||
Afar (AA) | ||||
Afrikaans (AFR) | ||||
Akan (AK) | ||||
Albanian (SQI) | ||||
Algonquin (ALQ) | ||||
Angika (Devanagari) (ANP) | ||||
Arabic (ARA) | (Preview) | (Preview) | ||
Asturian (AST) | ||||
Asu (ASA) | ||||
Avaric (AV) | ||||
Awadhi-Hindi (Devanagari) (AWA) | ||||
Aymara (AYM) | ||||
Azerbaijani (Latin) (AZ) | ||||
Bafia (KSF) | ||||
Bagheli (BFY) | ||||
Bambara (BM) | ||||
Bashkir (BA) | ||||
Basque (EU) | ||||
Belarusian (Cyrillic) (BE, BE-CYRL) | ||||
Belarusian (Latin) (BE, BE-LATN) | ||||
Bemba (BEM) | ||||
Bena (BEZ) | ||||
Bhojpuri-Hindi (Devanagari) (BHO) | ||||
Bikol (BIK) | ||||
Bislama (BI) | ||||
Bodo (Devanagari) (BRX) | ||||
Bosnian (Latin) (BS) | ||||
Brajbha (BRA) | ||||
Breton (BR) | ||||
Bulgarian (BG) | ||||
Bundeli (BNS) | ||||
Buryat (Cyrillic) (BUA) | ||||
Catalan (CA) | ||||
Cebuano (CEB) | ||||
Chamling (RAB) | ||||
Chamorro (CH) | ||||
Chechen (CE) | ||||
Chhattisgarhi (Devanagari) (HNE) | ||||
Chiga (CGG) | ||||
Chinese - Simplified (ZH-Hans) | ||||
Chinese - Traditional (ZH-Hant) | ||||
Choctaw (CHO) | ||||
Chukot (CKT) | ||||
Chuvash (CV) | ||||
Cornish (KW) | ||||
Corsican (CO) | ||||
Cree (CR) | ||||
Creek (MUS) | ||||
Crimean Tatar (Latin) (CRH) | ||||
Croatian (HR) | ||||
Crow (CRO) | ||||
Czech (CS) | ||||
Danish (DA) | ||||
Dargwa (DAR) | ||||
Dari (PRS) | ||||
Dhimal (Devanagari) (DHI) | ||||
Dogri (Devanagari) (DOI) | ||||
Duala (DUA) | ||||
Dungan (DNG) | ||||
Dutch (NL) | ||||
Efik (EFI) | ||||
English (EN) | ||||
Erzya (Cyrillic) (MYV) | ||||
Estonian (ET) | ||||
Faroese (FO) | ||||
Fijian (FJ) | ||||
Filipino (FIL) | ||||
Finnish (FI) | ||||
Fon (FON) | ||||
French (FR) | ||||
Friulian (FUR) | ||||
Ga (GAA) | ||||
Gaelic - Irish (GA) | ||||
Gaelic - Scottish (GD) | ||||
Gagauz (Latin) (GAG) | ||||
Galician (GL) | ||||
Ganda (LG) | ||||
Gayo (GAY) | ||||
German (DE) | ||||
Gilbertese (GIL) | ||||
Gondi (Devanagari) (GON) | ||||
Greek (EL) | ||||
Greenlandic (KL) | ||||
Guarani (GN) | ||||
Gurung (Devanagari) | ||||
Gusii (GUZ) | ||||
Haitian Creole (HT) | ||||
Halbi (Devanagari) (HLB) | ||||
Hani (HNI) | ||||
Haryanvi (BGC) | ||||
Hawaiian (HAW) | ||||
Hebrew (HE) | ||||
Herero (HZ) | ||||
Hiligaynon (HIL) | ||||
Hindi (HI) | ||||
Hmong Daw (Latin) (MWW) | ||||
Ho (Devanagari) (HOC) | ||||
Hungarian (HU) | ||||
Iban (IBA) | ||||
Icelandic (IS) | ||||
Igbo (IG) | ||||
Iloko (ILO) | ||||
Inari Sami (SMN) | ||||
Indonesian (ID) | ||||
Ingush (INH) | ||||
Interlingua (IA) | ||||
Inuktitut (Latin) (IU) | ||||
Italian (IT) | ||||
Japanese (JA) | ||||
Jaunsari (Devanagari) (JNS) | ||||
Javanese (JV) | ||||
Jola-Fonyi (DYO) | ||||
Kabardian (KBD) | ||||
Kabuverdianu (KEA) | ||||
Kachin (Latin) (KAC) | ||||
Kalenjin (KLN) | ||||
Kalmyk (XAL) | ||||
Kangri (Devanagari) (XNR) | ||||
Kanuri (KR) | ||||
Karachay-Balkar (KRC) | ||||
Kara-Kalpak (Cyrillic) (KAA-CYR) | ||||
Kara-Kalpak (Latin) (KAA) | ||||
Kashubian (CSB) | ||||
Kazakh (Cyrillic) (KK-CYR) | ||||
Kazakh (Latin) (KK-LATN) | ||||
Khakas (KJH) | ||||
Khaling (KLR) | ||||
Khasi (KHA) | ||||
K'iche' (QUC) | ||||
Kikuyu (KI) | ||||
Kildin Sami (SJD) | ||||
Kinyarwanda (RW) | ||||
Komi (KV) | ||||
Kongo (KN) | ||||
Korean (KO) | ||||
Korku (KFQ) | ||||
Koryak (KPY) | ||||
Kosraean (KOS) | ||||
Kpelle (KPE) | ||||
Kuanyama (KJ) | ||||
Kumyk (Cyrillic) (KUM) | ||||
Kurdish (Arabic) (KU-ARAB) | ||||
Kurdish (Latin) (KU-LATN) | ||||
Kurukh (Devanagari) (KRU) | ||||
Kyrgyz (Cyrillic) (KY) | ||||
Lak (LBE) | ||||
Lakota (LKT) | ||||
Latin (LA) | ||||
Latvian (LV) | ||||
Lezghian (LEX) | ||||
Lingala (LN) | ||||
Lithuanian (LT) | ||||
Lower Sorbian (DSB) | ||||
Lozi (LOZ) | ||||
Lule Sami (SMJ) | ||||
Luo (Kenya and Tanzania) (LUO) | ||||
Luxembourgish (LB) | ||||
Luyia (LUY) | ||||
Macedonian (MK) | ||||
Machame (JMC) | ||||
Madurese (MAD) | ||||
Mahasu Pahari (Devanagari) (BFZ) | ||||
Makhuwa-Meetto (MGH) | ||||
Makonde (KDE) | ||||
Malagasy (MG) | ||||
Malay (Latin) (MS) | ||||
Maltese (MT) | ||||
Malto (Devanagari) (KMJ) | ||||
Mandinka (MNK) | ||||
Manx (GV) | ||||
Maori (MI) | ||||
Mapudungun (ARN) | ||||
Marathi (MR) | ||||
Mari (Russia) (CHM) | ||||
Masai (MAS) | ||||
Mende (Sierra Leone) (MEN) | ||||
Meru (MER) | ||||
Meta' (MGO) | ||||
Minangkabau (MIN) | ||||
Mohawk (MOH) | ||||
Mongolian (Cyrillic) (MN) | ||||
Mongondow (MOG) | ||||
Montenegrin (Cyrillic) (CNR-CYRL) | ||||
Montenegrin (Latin) (CNR-LATN) | ||||
Morisyen (MFE) | ||||
Mundang (MUA) | ||||
Nahuatl (NAH) | ||||
Navajo (NV) | ||||
Ndonga (NG) | ||||
Neapolitan (NAP) | ||||
Nepali (NE) | ||||
Ngomba (JGO) | ||||
Niuean (NIU) | ||||
Nogay (NOG) | ||||
North Ndebele (ND) | ||||
Northern Sami (Latin) (SME) | ||||
Norwegian (NO) | ||||
Nyanja (NY) | ||||
Nyankole (NYN) | ||||
Nzima (NZI) | ||||
Occitan (OC) | ||||
Ojibway (OJ) | ||||
Oromo (OM) | ||||
Ossetic (OS) | ||||
Pampanga (PAM) | ||||
Pangasinan (PAG) | ||||
Papiamento (PAP) | ||||
Pashto (PS) | ||||
Pedi (NSO) | ||||
Persian (FA) | ||||
Polish (PL) | ||||
Portuguese (PT) | ||||
Punjabi (Arabic) (PA) | ||||
Quechua (QU) | ||||
Ripuarian (KSH) | ||||
Romanian (RO) | ||||
Romansh (RM) | ||||
Rundi (RN) | ||||
Russian (RU) | ||||
Rwa (RWK) | ||||
Sadri (Devanagari) (SCK) | ||||
Sakha (SAH) | ||||
Samburu (SAQ) | ||||
Samoan (Latin) (SM) | ||||
Sango (SG) | ||||
Sangu (Gabon) | ||||
Sanskrit (Devanagari) (SA) | ||||
Santali (Devanagari) (SAT) | ||||
Scots (SCO) | ||||
Sena (SEH) | ||||
Serbian (Cyrillic) (SR-CYRL) | ||||
Serbian (Latin) (SR, SR-LATN) | ||||
Shambala (KSB) | ||||
Shona (SN) | ||||
Siksika (BLA) | ||||
Sirmauri (Devanagari) (SRX) | ||||
Skolt Sami (SMS) | ||||
Slovak (SK) | ||||
Slovenian (SL) | ||||
Soga (XOG) | ||||
Somali (Arabic) (SO) | ||||
Somali (Latin) (SO-LATN) | ||||
Songhai (SON) | ||||
South Ndebele (NR) | ||||
Southern Altai (ALT) | ||||
Southern Sami (SMA) | ||||
Southern Sotho (ST) | ||||
Spanish (ES) | ||||
Sundanese (SU) | ||||
Swahili (Latin) (SW) | ||||
Swati (SS) | ||||
Swedish (SV) | ||||
Tabassaran (TAB) | ||||
Tachelhit (SHI) | ||||
Tahitian (TY) | ||||
Taita (DAV) | ||||
Tajik (Cyrillic) (TG) | ||||
Tamil (TA) | ||||
Tatar (Cyrillic) (TT-CYRL) | ||||
Tatar (Latin) (TT) | ||||
Teso (TEO) | ||||
Tetum (TET) | ||||
Thai (TH) | ||||
Thangmi (THF) | ||||
Tok Pisin (TPI) | ||||
Tongan (TO) | ||||
Tsonga (TS) | ||||
Tswana (TN) | ||||
Turkish (TR) | ||||
Turkmen (Latin) (TK) | ||||
Tuvan (TYV) | ||||
Udmurt (UDM) | ||||
Uighur (Cyrillic) (UG-CYRL) | ||||
Ukrainian (UK) | ||||
Upper Sorbian (HSB) | ||||
Urdu (UR) | ||||
Uyghur (Arabic) (UG) | ||||
Uzbek (Arabic) (UZ-ARAB) | ||||
Uzbek (Cyrillic) (UZ-CYRL) | ||||
Uzbek (Latin) (UZ) | ||||
Vietnamese (VI) | ||||
Volapuk (VO) | ||||
Vunjo (VUN) | ||||
Walser (WAE) | ||||
Welsh (CY) | ||||
Western Frisian (FY) | ||||
Wolof (WO) | ||||
Xhosa (XH) | ||||
Yucatec Maya (YUA) | ||||
Zapotec (ZAP) | ||||
Zarma (DJE) | ||||
Zhuang (ZA) | ||||
Zulu (ZU) |
Arabic characters | 'ا','ب','ة','ت','ث','ج','ح','خ','د','ذ','ر','ز','س','ش','ص','ض','ط','ظ','ع','غ','ـ','ف','ق','ك','ل','م','ن','ه','و','ى','ي','ٓ','ٔ','ٕ','٠','١','٢','٣','٤','٥','٦','٧','٨','٩','٪','٫','٬','٭','ٱ','۔','ً','ٌ','ٍ','َ','ُ','ِ','ّ','ْ','ٰ','ۥ','ۦ','آ','،','؛','؟','ء','أ','ؤ','إ','ئ' |
Supported OCR characters | ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ £ ¥ § © ® ° ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý ß à á â ã ä å æ ç è é ê ë ì í î ï ñ ò ó ô õ ö ø ù ú û ü ý Ā ā Ă ă Ą ą Ć ć Ċ ċ Č č Ď ď Đ đ Ē ē Ė ė Ę ę Ě ě Ğ ğ Ġ ġ Ħ ħ Ī ī Ĭ ĭ Į į İ ı Ĺ ĺ Ľ ľ Ł ł Ń ń Ň ň Ŋ ŋ Ō ō Ő ő Œ œ Ŕ ŕ Ř ř Ś ś Š š Ť ť Ŧ ŧ Ū ū Ŭ ŭ Ů ů Ų ų Ź ź Ż ż Ž ž Ə Ǵ ǵ Ș ș Ț ț ə μ א ב ג ד ה ו ז ח ט י ך כ ל ם מ ן נ ס ע ף פ ץ צ ק ר ש ת ₪ € ≤ ≥ |