Document Understanding Activities
Release notes
Release date: 3 October 2024
You can now use the Generative Classifier and Generative Extractor activities within a Classify Document Scope and Data Extraction Scope, even if the robot is connected to a local Orchestrator.
We've added the RuntimeTenantURL and RuntimeCredentialsAsset properties to the Generative Classifier and Extractor activities. With these properties, you can now directly use credentials from external applications, stored in Orchestrator, to access Document Understanding resources at runtime. To achieve this, ensure that your selected tenant has Document Understanding enabled and AI Units allocated.
Increased prompt size from 500 to 1000 characters per question for enhanced clarity in your instructions.
Release date: 13 August 2024
We've upgraded some internal dependencies for enhanced performance.
Release date: 20 June 2024
We are constantly working to improve your UiPath Document Understanding experience. Although this release contains no major changes, it brings minor improvements and accessibility fixes to the product.
Release date: 5 June 2024
We've improved product stability by revising certain dependencies.
Release date: 27 May 2024
Increased prompt size from 500 to 1000 characters per question for enhanced clarity in your instructions. Also, if you reach the prompt size limit of 1000 characters per question, you will receive a "Limit exceeded" error.
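As a minimal sketch of how you might check prompts against this limit before running an automation, the hypothetical helper below flags any question longer than 1000 characters. The function and variable names are illustrative and are not part of any UiPath API, and it assumes the limit is counted the way Python's len counts characters.

```python
MAX_PROMPT_CHARS = 1000  # per-question limit stated in the release notes


def find_oversized_prompts(prompts: dict[str, str], limit: int = MAX_PROMPT_CHARS) -> dict[str, int]:
    """Return {question_name: length} for every prompt longer than `limit`.

    Hypothetical helper: the activity's own character counting may differ.
    """
    return {name: len(text) for name, text in prompts.items() if len(text) > limit}


if __name__ == "__main__":
    # Illustrative prompts; the names are made up for this example.
    prompts = {
        "invoice_number": "What is the invoice number?",
        "long_question": "x" * 1001,
    }
    for name, length in find_oversized_prompts(prompts).items():
        print(f"{name}: {length} characters exceeds the {MAX_PROMPT_CHARS}-character limit")
```

A prompt of exactly 1000 characters is treated as valid here, on the assumption that the error is raised only when the limit is exceeded.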
When a Content Filtered exception occurs, the activity does not generate any results, as if the content were missing. The following warning message is shown in the robot logs:
GPT refused to handle the request because of content filtering policy. Returning empty result.
This message is also displayed in Studio when an automation is started from there.
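Because a filtered request returns an empty result rather than an error, it can be useful to look for the warning in the logs. The sketch below counts log lines that contain the warning text quoted above. It assumes the logs are available as plain-text lines; the function name and the sample log lines are hypothetical and do not reflect the actual robot log format.

```python
# Warning text quoted in the release notes; matched as a substring.
CONTENT_FILTER_WARNING = "GPT refused to handle the request because of content filtering policy"


def count_content_filter_warnings(log_lines):
    """Count log lines that contain the content-filtering warning text."""
    return sum(1 for line in log_lines if CONTENT_FILTER_WARNING in line)


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample_log = [
        "Info: Generative Extractor started",
        "Warning: GPT refused to handle the request because of content filtering policy. Returning empty result.",
        "Info: Generative Extractor finished",
    ]
    print(count_content_filter_warnings(sample_log))
```

A non-zero count suggests that at least one empty result came from content filtering rather than from missing content in the document.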
Release date: 1 November 2023
- Generative Classifier activity
- Generative Extractor activity
- A new property, Output Folder, is available for the Machine Learning Classifier Trainer activity. This property allows you to save files locally.
- Support for multi-page fields is now available. This feature is useful when an Address, for example, has the street on one page and the state and zip code on the following page. Due to a known issue, table rows are currently not working in this case. Follow our release notes for updates in the future.
Release date: 7 June 2023
We are constantly working to improve your UiPath Document Understanding experience. Although this release contains no major changes, it brings minor improvements and accessibility fixes to the product.
Release date: 26 April 2023
Release date: 27 March 2023
Release date: 15 December 2022
- The UiPath Studio user interface is now available in Traditional Chinese.
- You can now benefit from the API Key field being pre-populated for the following activities included in the UiPath.DocumentUnderstanding.ML.Activities package: Machine Learning Classifier and Machine Learning Extractor.
Release date: 24 October 2022
- The Machine Learning Extractor Trainer activity can now support multivalued fields.
- The UseServerSideOCR option is scheduled to be deprecated in December 2022. We recommend using the default behavior. More details about the deprecation can be found here.
- The Document Understanding Process Studio template has been upgraded to a new version. The UiPath.DocumentUnderstanding.ML.Activities package is a dependency for this template.
- Fixed a bug that caused extraction errors when the Digitizer was used. The fix upgrades the PDF library and uses hybrid OCR features.
- Fixed a bug that caused inconsistent input to be sent to ML Extractor when both image and DOM are required.
- The ProxySettings were not used in the GetCapabilities call received from Machine Learning Extractor. The bug is now fixed and works as expected.
Release date: 9 May 2022
- The UiPath.DocumentUnderstanding.ML.Activities package has been upgraded to .NET5 portable, allowing you to run its activities on Linux robots.
- The Machine Learning Extractor Trainer and the Machine Learning Classifier Trainer activities have received new parameters grouped under the name of Public Datasets, allowing you to use public datasets instead of private ones.
- The Machine Learning Extractor activity has been updated and now the extraction algorithm can also be used from Forms AI, not only from the ML Models list.
- The Machine Learning Extractor activity can now be used with a public endpoint in airgapped scenarios.
Release date: 5 October 2021
- This release updates the telemetry client to version 1.5.3.
- The UiPath.DocumentUnderstanding.ML.Activities package has been upgraded to .NET5. While both .NET versions continue to be supported, the .NET5 projects can only work on 64-bit architectures.
- Added the Dataset and Project parameters to the Machine Learning Extractor Trainer activity, which allow you to select where to upload your training data in your AI Center tenant. As a result, the Endpoint and MLSkill parameters were removed from the activity.
- Added the Endpoint parameter to the Machine Learning Classifier activity which provides the ability to use the activity with public ML Skills.
- The Machine Learning Extractor can now be integrated with Forms AI. The only requirement is that the UseServerSideOCR option is disabled.
Release date: 29 March 2021
- Released the Machine Learning Classifier and Machine Learning Classifier Trainer activities as part of the Machine Learning Document Classification functionality, which helps you classify documents using a custom trained ML model. The Machine Learning Classifier is particularly useful in scenarios with high diversity in document sets. To train the classifier and improve its results over time with the aid of human validation, you can use its sister activity, Machine Learning Classifier Trainer.
- Improved processing of PDF files.
Release date: 11 November 2020
Release date: 5 October 2020
- Released the new Machine Learning Extractor Trainer activity, which can prepare data for ML model re-training based on human validation results.
- Added the Get or refresh extractor capabilities functionality to Machine Learning Extractor Trainer that can be used to easily map your taxonomy fields with the available extractor fields.
- A new parameter has been included in the Machine Learning Extractor activity, named Timeout (milliseconds). The parameter can be used for specifying the amount of time to wait for a response from the server before an error is thrown.
- Changed the tooltip text on the UseServerSideOCR property for Machine Learning Extractor to indicate it is incompatible with Machine Learning Extractor Trainer.
Release date: 24 August 2020
- Fixed an issue that in some cases was returning a 407ProxyAuthenticationRequired error message for Kerberos or NTLM authentication requests. This applies to Machine Learning Extractor.
- Fixed an issue that was causing the Get Capabilities functionality of Machine Learning Extractor not to work if a certain endpoint was provided.
- Fixed an issue that was causing the Machine Learning Extractor to throw an error when no robot is connected.
Release date: 4 May 2020
UseServerSideOCR: This option allows you to use the OCR results received from digitization.
The Machine Learning Extractor now declares its internal taxonomy, allowing you to easily map the fields it can extract to the fields you have defined in your taxonomy, in the Configure Extractors wizard of the Data Extraction Scope.