Document Understanding Activities
Release notes
Release date: 3 October 2024
We are constantly working to improve your UiPath Document Understanding experience. Although this release contains no significant changes, we made minor improvements to the product.
Release date: 29 April 2024
These release notes contain all the updates made between November 2023 and March 2024.
Two new methods are available for the ExtractionResult Class. These methods help you access validator notes more easily.
Release date: 2 May 2023
We fixed a bug causing the Data Extraction Scope activity to crash when the extraction is completed on all but the first sub-document. This was happening when a classifier was used to perform document splitting and multiple classification results were returned from Classify Document Scope.
Release date: 26 April 2023
- The ResultsValue Class received a new property, TextType, that can provide the text origin of the extracted value.
- We've added a new overload for the .Serialize method that accepts a SerializationSettings object. You can now configure whether serialization uses the PascalCase or camelCase convention. The default is PascalCase. This applies to the following classes: Document, DocumentTaxonomy, DocumentSplittingResult, ClassificationResult, and ExtractionResult. The existing methods continue to use PascalCase, so everything remains backwards compatible. Deserialization accepts objects serialized in either PascalCase or camelCase.
- New classes have been added to the UiPath.DocumentProcessing.Contracts package. They contain information about the actions created in Action Center.
- A new helper method, IsTextTypeInDocument, is available for detecting the presence of a type of text (handwritten or checkboxes) in a document by using a TextType parameter.
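The casing behavior described above (PascalCase by default, camelCase on request, and deserialization that accepts either) can be pictured with a small language-neutral sketch. This is not the package's .NET implementation: the functions `serialize` and `deserialize`, the `camel_case` flag, and the sample keys are all hypothetical names chosen for illustration.

```python
import json


def to_camel(name: str) -> str:
    """Lower-case the first character: 'DocumentId' -> 'documentId'."""
    return name[:1].lower() + name[1:]


def to_pascal(name: str) -> str:
    """Upper-case the first character: 'documentId' -> 'DocumentId'."""
    return name[:1].upper() + name[1:]


def convert_keys(value, convert):
    """Recursively rename dictionary keys; lists and scalars pass through."""
    if isinstance(value, dict):
        return {convert(k): convert_keys(v, convert) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_keys(v, convert) for v in value]
    return value


def serialize(obj: dict, camel_case: bool = False) -> str:
    """Hypothetical serializer: PascalCase unless camel_case is requested."""
    convert = to_camel if camel_case else to_pascal
    return json.dumps(convert_keys(obj, convert))


def deserialize(text: str) -> dict:
    """Accept either convention by normalizing every key to PascalCase."""
    return convert_keys(json.loads(text), to_pascal)


# Illustrative payload; the keys are assumptions, not the real schema.
result = {"DocumentId": "invoice-001", "ResultsDocument": {"Language": "en"}}
pascal_json = serialize(result)
camel_json = serialize(result, camel_case=True)
```

In this sketch, both `pascal_json` and `camel_json` deserialize back to the same dictionary, which is the round-trip property the release note describes.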
We've fixed a bug causing the TextType property to be displayed in both Values and Tokens sections. The property is now displayed only in the Values section.
Release date: 24 October 2022
- We have refactored the Data Extraction Results class to enable its usage within the workflow. We've implemented a new table structure and table helper methods and, for the moment, both old and new structures are available. The new format is supported in Validation Station and other components (trainers, extractors, etc.), and going forward it is the recommended way of accessing and manipulating table data in the results object.
- Every word from the Document Object Model can now be set as type Text, Handwriting, or Checkbox.
- The ResultsValue.Components property, the ResultsValue constructors which have ResultsDataPoint[] components as parameters, the ResultsValue.CreateTableValue methods, and the ExtractionResult.FlattenFields method are now marked as obsolete.
- There is a new ProcessingSource value, named PdfAndOcr, that reflects a PDF page which was processed with both native PDF processing and OCR processing.
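As a rough picture of the per-word typing mentioned in this release, the sketch below tags each word as Text, Handwriting, or Checkbox and scans a word list for a given type. The `Word` and `WordType` definitions and the `has_word_type` function are hypothetical stand-ins written for this example; they do not reproduce the classes in the contracts package.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class WordType(Enum):
    """The three word types named in the release note."""
    TEXT = "Text"
    HANDWRITING = "Handwriting"
    CHECKBOX = "Checkbox"


@dataclass
class Word:
    """Hypothetical stand-in for a word in the document object model."""
    text: str
    word_type: WordType = WordType.TEXT


def has_word_type(words: Iterable[Word], wanted: WordType) -> bool:
    """Return True if at least one word carries the requested type."""
    return any(word.word_type is wanted for word in words)


page_words = [
    Word("Total"),
    Word("42.00"),
    Word("J. Smith", WordType.HANDWRITING),
]

contains_handwriting = has_word_type(page_words, WordType.HANDWRITING)
contains_checkbox = has_word_type(page_words, WordType.CHECKBOX)
```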
Release Date: 9 May 2022
- The UiPath.DocumentProcessing.Contracts package has been upgraded to .NET5 portable, allowing you to run it on Linux robots.
- Some of the classes included in the UiPath.DocumentProcessing.Contracts package have been updated. Among the updated ones are the Simplified Value, Results Value, and Extraction Result classes.
Release Date: 1 October 2021
- The ClassificationResult class has received new methods, Serialize and Deserialize, meant to help you serialize and deserialize the classification output.
- New methods, FlattenFields and GetFields, have also been added to the ExtractionResult class to help you filter fields based on a given condition.
- The UiPath.DocumentProcessing.Contracts package has been upgraded to .NET5. While both .NET versions continue to be supported, the .NET5 projects can only work on 64-bit architectures.
Release Date: 23 March 2021
A new method, ExtractionResult.AsDataSet(bool includeConfidence, bool includeOcrConfidence), can be used to export an ExtractionResult to a DataSet while optionally including the OCR confidence of the values.
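The idea of a tabular export with optional confidence columns can be sketched as follows. This is only an illustration of the two boolean switches: the function `fields_to_rows`, the column names, and the input field layout are assumptions, and the actual DataSet layout produced by the package may differ.

```python
from typing import Dict, List


def fields_to_rows(
    fields: List[Dict],
    include_confidence: bool = False,
    include_ocr_confidence: bool = False,
) -> List[Dict]:
    """Build one row per field, adding confidence columns only on request."""
    rows = []
    for field in fields:
        row = {"Field": field["name"], "Value": field["value"]}
        if include_confidence:
            row["Confidence"] = field["confidence"]
        if include_ocr_confidence:
            row["OcrConfidence"] = field["ocr_confidence"]
        rows.append(row)
    return rows


# Illustrative extracted field; names and numbers are made up.
fields = [
    {"name": "InvoiceNumber", "value": "INV-7",
     "confidence": 0.98, "ocr_confidence": 0.91},
]

rows_plain = fields_to_rows(fields)
rows_full = fields_to_rows(
    fields, include_confidence=True, include_ocr_confidence=True
)
```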
Release Date: 2 October 2020
- Null (empty) value extraction is allowed in the following scenarios:
  - Data extraction fields with no reference;
  - Document classifications with no reference;
  - Values created by the user in Validation Station with no reference.
- Added support for custom extractor payload in ExtractionResults.
- Added support for field-level metadata in Taxonomy.
Release Date: 4 May 2020
- A new property named VisualLineNumber has been added to the public class Word, indicating the visual line on which the word is placed. This can be found under the UiPath.DocumentProcessing.Contracts.Dom namespace.
- A new extension method named GetVisualTextProjection has been added to the Document class, allowing you to access a visual arrangement of the words. This can be found under the UiPath.DocumentProcessing.Contracts.Dom namespace.
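One way to picture a visual arrangement of words is to group them by their visual line number and order each line from left to right. The sketch below does that on plain dictionaries. The keys `visual_line_number` and `left` and the function `visual_lines` are hypothetical, and the real GetVisualTextProjection method may arrange words differently.

```python
from collections import defaultdict
from typing import Dict, List


def visual_lines(words: List[Dict]) -> List[str]:
    """Group words by visual line, then join each line left to right."""
    lines = defaultdict(list)
    for word in words:
        lines[word["visual_line_number"]].append(word)
    return [
        " ".join(w["text"] for w in sorted(lines[n], key=lambda w: w["left"]))
        for n in sorted(lines)
    ]


# Words arrive in arbitrary order; 'left' is a horizontal position.
words = [
    {"text": "2020", "visual_line_number": 1, "left": 120},
    {"text": "Invoice", "visual_line_number": 0, "left": 10},
    {"text": "May", "visual_line_number": 1, "left": 60},
    {"text": "Date:", "visual_line_number": 1, "left": 10},
]

projected = visual_lines(words)
```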
Release Date: 19 June 2019
Release Date: 21 May 2019
Release Date: 22 April 2019
- Reference the contracts provided in this package.
- Implement your own classifiers and extractors into your workflows.
- v1.27.1
- v1.26.0
- New Feature & Improvements
- v1.23.1
- New Features & Improvements
- v1.21.1
- Bug Fixes
- v1.21.0
- New features & Improvements
- Bug Fixes
- Deprecation Timeline
- v1.18.0
- New features & Improvements
- v1.17.1
- New features & Improvements
- v1.14.0
- New Features and Improvements
- v1.11.0
- New Features and Improvements
- Bug Fixes
- v1.10.1
- Improvements
- Bug Fixes
- v1.9.1
- Bug Fixes
- v1.9.0
- New Features and Improvements
- v1.6.1
- New Features and Improvements
- v1.4.0
- New Features and Improvements
- v1.3.0
- New Features and Improvements
- v1.2.0
- New Features and Improvements
- v1.1.0
- New Features and Improvements
- v1.0.0
- New Features and Improvements