Enables the collection of data that has been processed through Validation Station so that it can be imported into Data Manager. This activity can be used only within the Train Extractors Scope activity.
- DisplayName - The display name of the activity.
- Private - If selected, the values of variables and arguments are no longer logged at Verbose level.
- Endpoint - The endpoint hosting the machine learning model. For more information, see Document Understanding Public Endpoints.
- MLSkill - Provides the MLSkills list available in the AI Fabric service.
- OutputFolder - The directory in which the collected data is stored. The folder must then be compressed into a zip file and imported into Data Manager.
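Compressing the OutputFolder can be scripted. The following is a minimal Python sketch, not part of the product: the function name `zip_output_folder` and the example paths are hypothetical. It also assumes that Data Manager accepts an archive with the folder contents at the archive root, so check the Data Manager import requirements before relying on this layout.

```python
import zipfile
from pathlib import Path


def zip_output_folder(output_folder: str, zip_path: str) -> int:
    """Compress every file under output_folder into zip_path; return the file count."""
    source = Path(output_folder)
    if not source.is_dir():
        raise NotADirectoryError(f"Not a directory: {source}")

    target = Path(zip_path).resolve()
    count = 0
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(source.rglob("*")):
            # Skip directories, and skip the archive itself if it lives inside the folder.
            if file.is_file() and file.resolve() != target:
                # Store paths relative to the folder (assumed layout, see note above).
                archive.write(file, arcname=file.relative_to(source))
                count += 1
    return count
```

For example, `zip_output_folder("C:/TrainingData", "C:/TrainingData.zip")` would archive a hypothetical output folder, after which the zip file can be imported into Data Manager.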
The same rule that applies to the Machine Learning Extractor also applies to the Machine Learning Extractor Trainer. See the Machine Learning Extractor documentation.
Follow these steps to use the Machine Learning Extractor Trainer activity.
- Use the Taxonomy Manager Wizard to define your document types and fields.
- Drag a Machine Learning Extractor Trainer into a Train Extractors Scope activity.
- In the Machine Learning Extractor wizard that automatically opens, add the Endpoint information.
- Select the Update activity arguments checkbox if you also want to use the entered values as input arguments for the activity, specifically for the Endpoint.
- Click the Get Capabilities button. The wizard closes after this operation.
- Enter a value for Output Folder.
- Select the Configure Extractors option of the Train Extractors Scope. A wizard is displayed.
- The Machine Learning Extractor Trainer is now ready for configuration. Expand the document type that you want to apply it to, and select the checkboxes next to the fields you want to train.
- Fill in the textboxes either manually or by selecting, from the available drop-down list, the correct data you wish to map to each field. The drop-down list contains all fields that the Machine Learning Extractor Trainer, using the endpoint entered in the Machine Learning Extractor wizard, declares as extraction capability.
If you select the checkbox but leave the textbox empty, the textbox is automatically filled in with the Document Type ID from the local taxonomy. The changes apply after saving. To avoid using a long string for the field ID, we recommend entering a value manually if you do not have access to the internal taxonomy of the extractor.
- To check whether you are using the latest capabilities of the extractor, click Get or refresh extractor capabilities, which opens the Machine Learning Extractor wizard.
- Selecting one of the options from a drop-down list automatically confirms that field.
- To train an extractor based on its extraction result, set the Framework Alias field to the exact alphanumeric value previously used for that extractor.
- Select the Save button once all fields are configured properly.
You cannot choose the same option for two distinct fields.
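If you keep a record of your field mappings outside the wizard, you can check for this kind of conflict before configuring it. The following is a small Python sketch. The function `find_duplicate_mappings`, the dictionary shape, and all field names are hypothetical and do not correspond to any UiPath API.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_mappings(mapping: Dict[str, str]) -> Dict[str, List[str]]:
    """Return each extractor option that is assigned to more than one taxonomy field."""
    fields_by_option: Dict[str, List[str]] = defaultdict(list)
    for taxonomy_field, extractor_option in mapping.items():
        fields_by_option[extractor_option].append(taxonomy_field)
    # Keep only the options that were selected for two or more fields.
    return {
        option: fields
        for option, fields in fields_by_option.items()
        if len(fields) > 1
    }


# Hypothetical mapping: taxonomy field -> option chosen from the drop-down list.
planned = {
    "Invoice Number": "invoice-no",
    "Invoice Date": "date",
    "Due Date": "date",  # conflicts with "Invoice Date"
}
conflicts = find_duplicate_mappings(planned)
```

Here `conflicts` holds the `date` option together with the two fields that share it, which indicates that one of them needs a different option.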