Using the Classification Station
The Classification Station enables you to perform, review, and correct document classification information. Once opened, it presents any classification information along with the file being processed. You can organize your documents by using the Split document option. More information about this feature can be found in the Other options section of this page.
The right area of the Classification Station contains an interactive version of the original document, in which text or document sections can be selected, and words can be clicked based on the output of the digitization process. This area also contains options for zooming in and out, selecting and rotating pages, searching through the document, or switching to text view.
The following table describes the options available in the right area of the Classification Station screen, which is the area where you interact with the document and select parts of it.
| Option | Description |
|---|---|
| Keyboard shortcuts | Displays all the keyboard shortcuts supported by the Classification Station. |
| Text view / image view | Toggles between the text view and the image view of the document. |
| Selection mode (text view) | Sets the selection mode while in text view. Note: Active only when the Text only view option is active. |
| Selection mode (image view) | Sets the selection mode while in image view. |
| Rotate | Rotates the current page clockwise. |
| Search | Initiates a search within the document displayed in the Classification Station. |
| Reset zoom | Resets the zoom level on the document. This option is enabled only if the document was previously zoomed in or out. |
| Zoom in | Zooms in on the document. |
| Zoom out | Zooms out on the document. Note: To zoom in or out, you can also use the CTRL + mouse scroll wheel combination: CTRL + scroll up to view a specific section of the document; CTRL + scroll down to view a larger section of the document. |
This section describes how to use the available options for interacting with documents in the Classification Station.
- Ensure that Image view is selected.
- Select Tokens and then select Custom area.
- Select the desired area in your document.
- Go to the document's more options on the left side, and choose whether to Change reference or Remove reference.
You can select a part of the document in text view the same way as with the Custom area option in image view. The only difference is that Text view must be selected instead of Image view.
Keyboard shortcuts can speed up your work in the Classification Station, and we encourage you to use them as much as possible. You can review them in the Keyboard Shortcuts pop-up.
To start using keyboard shortcuts, go to More options, select Keyboard shortcuts, and then select Toggle keyboard shortcuts.
The following list shows the available keyboard shortcuts and their corresponding descriptions:
- Classification:
  - n: Moves to the next field;
  - p: Moves to the previous field;
  - s: Splits after the selected page;
  - h: Highlights the group reference;
  - a: Adds/changes a reference;
  - DEL: Removes a reference;
  - m + upward arrow key (↑): Moves all pages above;
  - m + downward arrow key (↓): Moves all pages below;
  - left, right arrow keys (←→) + upward, downward arrow keys (↑↓): Navigate through pages.
- Document:
  - d +: Zooms in;
  - d -: Zooms out;
  - d 0: Resets zoom;
  - d r: Rotates the page clockwise;
  - d t: Toggles the text mode;
  - d s: Changes selection mode;
  - d a: Clears the drawn anchor selection;
  - /: Initiates a search.
- Accessibility:
  - left, right arrow keys (←→) + upward, downward arrow keys (↑↓): Navigate through words, and create or move the area selection;
  - Shift + upward, downward, left, right arrow keys: Resize the area selection;
  - Enter: Confirms the area selection;
  - Page Down, Page Up: Next or previous page;
  - ESC: Deselects all;
  - Alt p: Toggles PDF Viewer focus.
- General:
  - ?: Displays the Keyboard Shortcuts pop-up;
  - !: Reports the document as an exception;
  - CTRL ENTER: Saves the classification;
  - CTRL DEL: Discards all current changes.
The classification fields are determined by the Taxonomy, and you can encounter three possible situations:
- If the classification information for a given part of the document is provided and is correct, then no action is required for this field.
- If there is no classification information provided for a given part of the document, you can either leave it as Not Classified or select the right document type for it.
- If the page range provided for a given part of the document (whether classified or not, correctly or not) is inaccurate because pages are missing or extra pages are included, you can correct it by moving pages to the part above or below.
On the left side of the screen you can see all classifications. You can select the desired document type (that has been previously defined in Taxonomy) for any given page range of the document, from the document type dropdown list. Hover over the page, select Options, and then select the document type from the dropdown list.
Select Options for the document type to view the dropdown menu with the following options available:
- Add reference - A reference can be added as support to the document type selection performed by the user.
  Note: A reference is a token or collection of tokens in a document that can be used as keywords to identify the class of the document. References selected by the user are added to the Keyword Learning file through the Train Classifiers Scope.
  Figure 6. The action of adding a reference and highlighting the reference
- Remove reference - Removes a reference that was previously added for the given document type section.
  Figure 7. The action of removing a reference
- Change reference - Changes the reference to a new one, in case a reference already existed. Select Change reference, and then select another part of the document.
  Figure 8. The action of changing a reference
- Move all pages up - Moves the entire section of pages up, to the previously defined document type. This option is active on all sections except for the first one. Using this option deletes the section you are acting upon and merges its page range with the previous one.
  Figure 9. The action of moving all pages up
- Move all pages down - Moves the entire section of pages down, to the next document type defined. This option is active on all sections except for the last one. Using this option deletes the section you are acting upon and merges its page range with the next one.
  Figure 10. The action of moving all pages down
- Split document - Marks the beginning of a new document type, from where the selection is done, and moves it under a new document type section.
  Figure 11. The action of splitting a document
- Drag and Drop - Allows the rearrangement of pages between sections. Pages can be rearranged with the Drag and Drop option only if the order within a document is kept (page numbers should be consecutive). Drag and drop pages in the document to rearrange them.
  Figure 12. The action of rearranging the pages of a document
- Remove reference - Removes a reference from a specific page. Select More Options on the document page and then Remove reference.
  Figure 13. The action of removing a reference
- Highlight reference - Highlights the reference from a specific page. Select More Options on the document page and then Highlight reference.
  Figure 14. The action of highlighting a reference
- Discard changes: Discards all changes done by the user and reverts to the initial state of the validation task.
- Save: Saves the confirmed, corrected data.
- Exception: Reports the document as being an exception.
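The page-range options described on this page (Split document, Move all pages up, Move all pages down, and the consecutive-page rule for Drag and Drop) can be pictured as operations on an ordered list of sections. The following is a minimal Python sketch under stated assumptions, not the Classification Station's implementation: the `Section` class, the function names, and the choice to label a newly split section Not Classified are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Section:
    """Hypothetical stand-in for one classified page range."""
    doc_type: str
    pages: List[int]


def pages_are_consecutive(pages: List[int]) -> bool:
    """Check the Drag and Drop rule: page numbers must stay consecutive."""
    return all(b - a == 1 for a, b in zip(pages, pages[1:]))


def split_document(sections: List[Section], index: int, page: int) -> List[Section]:
    """Start a new section at `page`; the pages after it move along with it.

    Assumption: the new section starts out as "Not Classified".
    """
    target = sections[index]
    cut = target.pages.index(page)  # raises ValueError if the page is not in the section
    if cut == 0:
        raise ValueError("the section already starts at this page")
    head = Section(target.doc_type, target.pages[:cut])
    tail = Section("Not Classified", target.pages[cut:])
    return sections[:index] + [head, tail] + sections[index + 1:]


def move_all_pages_up(sections: List[Section], index: int) -> List[Section]:
    """Merge a section into the previous one; the acted-upon section disappears."""
    if index == 0:
        raise ValueError("not available for the first section")
    previous, current = sections[index - 1], sections[index]
    merged = Section(previous.doc_type, previous.pages + current.pages)
    return sections[:index - 1] + [merged] + sections[index + 1:]


def move_all_pages_down(sections: List[Section], index: int) -> List[Section]:
    """Merge a section into the next one; the acted-upon section disappears."""
    if index == len(sections) - 1:
        raise ValueError("not available for the last section")
    current, following = sections[index], sections[index + 1]
    merged = Section(following.doc_type, current.pages + following.pages)
    return sections[:index] + [merged] + sections[index + 2:]


# Example: split a four-page invoice at page 3, then undo the split.
sections = [Section("Invoice", [1, 2, 3, 4])]
after_split = split_document(sections, 0, 3)
after_merge = move_all_pages_up(after_split, 1)
print(after_split)
print(after_merge)
```

In this sketch, splitting at page 3 yields two sections (pages 1-2 and pages 3-4), and moving all pages up from the second section restores the single four-page section. Each function returns a new list instead of modifying its input, which makes it easy to model Discard changes as simply keeping the original list.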