Enables data extraction from documents using machine learning models provided by UiPath. This activity can be used only within the Data Extraction Scope activity.
The UiPath.DocumentUnderstanding.ML.Activities package is an improved version of the obsolete UiPath.MachineLearningExtractor.Activities package. To avoid errors caused by the name difference, remove any prior installation of the obsolete package and replace it with the new one.
Some limitations are in place for the Community package versions:
- The size of the documents is limited to 2 pages and 4 MB in total.
- Community endpoints are rate-limited per IP address at 50 requests per hour. If the rate limit is reached, a 429 - Too Many Requests error is displayed, and the IP address is blocked for 1 hour.
For more information, see Document Understanding Endpoints.
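The Community limits listed above can be checked on the client side before a document is sent. The following is a minimal sketch, not part of the UiPath package: the names `within_community_limits` and `HourlyBudget` are hypothetical, "4MB" is assumed to mean 4 * 1024 * 1024 bytes, and the rate limit is assumed to behave like a sliding one-hour window, which the documentation does not confirm.

```python
from collections import deque

# Limits quoted in the documentation for Community package versions.
MAX_PAGES = 2
MAX_BYTES = 4 * 1024 * 1024      # assumption: "4MB" read as 4 MiB
MAX_REQUESTS_PER_HOUR = 50
WINDOW_SECONDS = 3600            # assumption: sliding one-hour window


def within_community_limits(page_count: int, size_bytes: int) -> bool:
    """Return True if a document fits both the page and the size limit."""
    return page_count <= MAX_PAGES and size_bytes <= MAX_BYTES


class HourlyBudget:
    """Hypothetical client-side counter of requests sent from one IP address."""

    def __init__(self, limit: int = MAX_REQUESTS_PER_HOUR,
                 window: float = WINDOW_SECONDS) -> None:
        self.limit = limit
        self.window = window
        self._sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self, now: float) -> bool:
        """Record a request at time `now` (in seconds) if the budget allows it."""
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) >= self.limit:
            return False
        self._sent.append(now)
        return True
```

Because reaching the limit blocks the IP address for 1 hour, refusing a request locally is likely cheaper than receiving the 429 response from the endpoint.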
- DisplayName - The display name of the activity.
- Private - If selected, the values of variables and arguments are no longer logged at Verbose level.
- ApiKey - The API key used to provide you access to the Machine Learning Extractor.
- Endpoint - The endpoint hosting the machine learning model.
- RetryOnFailure - If selected, the activity automatically retries the machine learning model execution when it fails, to work around transient network errors.
- MLSkill - Provides the list of ML Skills available in the AI Fabric service.
- UseServerSideOCR - If not selected, the machine learning service uses the OCR results received from digitization. If selected, the document is reprocessed using the internally configured OCR. The default value is
The UseServerSideOCR option is available starting with UiPath.DocumentUnderstanding.ML.Activities v1.1.0. Previous package versions execute OCR server-side.
This activity works with either the Endpoint or the ML Skill option set, but not with both. If both options are set, an error is displayed either in the configuration window or in the wizard.
You can set the ML Skill option for your activity by selecting one from the list of available ML Skills.
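The Endpoint and ML Skill rule can be pictured as a simple exclusive choice. The sketch below is illustrative only and does not reproduce the activity's own validation: `select_model_source` is a hypothetical helper, and treating "neither option set" as an error is an assumption, since the documentation only describes the case where both are set.

```python
from typing import Optional


def select_model_source(endpoint: Optional[str], ml_skill: Optional[str]) -> str:
    """Return which option is in use, rejecting ambiguous configurations."""
    has_endpoint = bool(endpoint and endpoint.strip())
    has_skill = bool(ml_skill and ml_skill.strip())

    if has_endpoint and has_skill:
        # Documented case: both options set results in an error.
        raise ValueError("Set either Endpoint or ML Skill, not both.")
    if not has_endpoint and not has_skill:
        # Assumption: the activity needs one of the two to reach a model.
        raise ValueError("Set one of Endpoint or ML Skill.")
    return "endpoint" if has_endpoint else "ml_skill"
```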
Images with a resolution lower than 50 x 50 pixels cannot be processed and generate an error.
Follow the steps below to use the Machine Learning Extractor activity.
- Use the Taxonomy Manager Wizard to define your document type, with the fields you are targeting for data extraction.
- Drag a Machine Learning Extractor activity into a Data Extraction Scope activity.
- In the Machine Learning Extractor wizard that automatically opens, add the Endpoint information.
- Select the Update activity arguments check box if you also want to use the entered values as input arguments for the activity, more precisely for the Endpoint.
- Click the Get Capabilities button. The wizard closes after this operation.
- Select the Configure Extractors option of the Data Extraction Scope. A wizard is displayed.
- The Machine Learning Extractor is now ready for configuration. Expand the document type that you want to apply it to, then select the check box next to each field you want to extract and, from the drop-down list, choose the data you want to map to that field. The drop-down list contains all the fields that the Machine Learning Extractor declares as extraction capabilities for the endpoint entered in the Machine Learning Extractor wizard.
- Selecting one of the options from a drop-down list automatically confirms that field.
- Select the Save button once all fields are configured properly.
You cannot choose the same option for two distinct fields.
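The rule that one option cannot be mapped to two distinct fields amounts to a one-to-one mapping between taxonomy fields and extractor capabilities. As a rough sketch under that reading, the hypothetical helper `duplicated_capabilities` below lists every capability chosen more than once; the field and capability names in the usage example are made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def duplicated_capabilities(mapping: Dict[str, str]) -> List[str]:
    """Return the capabilities that are mapped to more than one field."""
    counts = Counter(mapping.values())
    return sorted(cap for cap, n in counts.items() if n > 1)


# Hypothetical taxonomy fields mapped to hypothetical extractor capabilities.
mapping = {
    "InvoiceNumber": "invoice-no",
    "TotalAmount": "total",
    "NetAmount": "total",      # same option chosen twice, which is not allowed
}
conflicts = duplicated_capabilities(mapping)
```

An empty result means the mapping satisfies the rule; here `conflicts` contains `"total"`.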