The supported languages for different Document Understanding components can be found in the table below.
| Component | Supported languages |
|---|---|
| Taxonomy Manager, Keyword Based Classifier, Intelligent Keyword Classifier, Machine Learning Classifier, RegEx Based Extractor, Machine Learning Extractor | The left-to-right languages supported by the OCR engine of choice. UiPath Document OCR: see its supported languages page. Omnipage OCR: see its supported languages page. Other 3rd party vendors (Google, Abbyy, Microsoft): please check the vendor's website for the most up-to-date information. Right-to-left languages are not supported even if the OCR engine supports them. |
| Intelligent Form Extractor | Same as above, except for Handwriting Recognition, which supports only English. |
| Pre-built ML models | Please refer to ML Packages Supported Languages. |
For the supported languages, retraining may be required to get the expected accuracy if the documents are considerably different from the original model training dataset.
For languages that are not in this list, you can create a custom ML model that can extract any left-to-right language, provided the OCR engine supports that language as well.
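Because only left-to-right languages are supported, it can help to check extracted text for right-to-left characters before sending a document down the pipeline. The following is a minimal sketch in Python, not part of any UiPath API. The function name `contains_rtl` is hypothetical, and the check relies only on the Unicode bidirectional classes exposed by the standard `unicodedata` module.

```python
import unicodedata

# Unicode bidirectional classes for strongly right-to-left characters:
# "R" covers scripts such as Hebrew, "AL" covers Arabic letters.
_RTL_CLASSES = {"R", "AL"}


def contains_rtl(text: str) -> bool:
    """Return True if any character in `text` is strongly right-to-left.

    Hypothetical helper for illustration; it inspects characters only and
    says nothing about which languages a given OCR engine supports.
    """
    return any(unicodedata.bidirectional(ch) in _RTL_CLASSES for ch in text)
```

A document for which `contains_rtl` returns `True` on the OCR output is likely to fall outside the supported left-to-right languages and could be routed to manual handling instead.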
Automatic reformatting of dates into the standard yyyy-mm-dd format for Asian languages is currently supported only for Japanese. For documents in other Asian languages, you can extract the dates as the String content type and format them in the RPA workflow.
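The manual formatting step could look roughly like the sketch below. It is written in Python to illustrate the logic only; in an actual RPA workflow the same steps would be expressed with the workflow's own activities or expressions. The function name `normalize_cjk_date` and the date pattern are assumptions: the pattern covers numeric year-month-day dates written with Chinese or Korean date markers, such as `2023年5月7日` or `2023년 5월 7일`, and nothing else.

```python
import datetime
import re
from typing import Optional

# Assumed input shape: a 4-digit year, then month and day, each followed by
# a Chinese (年 月 日) or Korean (년 월 일) date marker. The day marker is optional.
_CJK_DATE = re.compile(
    r"(\d{4})\s*[年년]\s*(\d{1,2})\s*[月월]\s*(\d{1,2})\s*[日일]?"
)


def normalize_cjk_date(raw: str) -> Optional[str]:
    """Convert a date extracted as a string to yyyy-mm-dd.

    Returns None when no date matching the assumed pattern is found or when
    the numbers do not form a valid calendar date.
    """
    match = _CJK_DATE.search(raw)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    try:
        # datetime.date validates the calendar date (e.g. rejects month 13).
        return datetime.date(year, month, day).isoformat()
    except ValueError:
        return None
```

Returning `None` instead of raising keeps the decision about unparseable values, for example sending them to human validation, in the calling workflow.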