Document Understanding User Guide

Last updated Dec 18, 2024

Export documents

The Export files dialog box enables you to easily export data for training ML models.

To open it, click the Export button in the management bar.

The dialog box contains three tabs:

Export now

The Export now tab allows you to:

  • Download to Excel - Download the data locally in an Excel format.
  • Download - Download the data locally.
  • Export to AI Center - Export the data to AI Center. The exported folders can be found in AI Center under the export folder (Datasets > dataset_name > export).

Note: Download to Excel is unavailable when the Schema option or the Backwards-compatible export checkbox is selected.

If no schema is defined, all export options are disabled.

If a schema is defined, you must enter a name for your export; otherwise, the Download and Export buttons are disabled. A valid name can have up to 24 characters and must not contain special characters.
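
The rules above can be sketched as a small check. This is only an illustration, not Document Manager's actual logic: the function names are hypothetical, "special characters" is assumed to mean anything other than ASCII letters, digits, spaces, hyphens, and underscores, and Download to Excel is assumed to be disabled together with the other buttons when the name is missing or invalid.

```python
import re

MAX_NAME_LENGTH = 24

# Assumption: only ASCII letters, digits, spaces, hyphens and underscores
# count as non-special characters. The product may accept a different set.
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9 _-]+")


def is_valid_export_name(name: str) -> bool:
    """Return True if the name is non-empty, at most 24 characters long,
    and contains only characters from the assumed allowed set."""
    return 0 < len(name) <= MAX_NAME_LENGTH and _ALLOWED_NAME.fullmatch(name) is not None


def enabled_actions(schema_defined: bool, name: str, scope: str,
                    backwards_compatible: bool) -> set:
    """Return the export actions that would be enabled (hypothetical helper).

    scope is one of "Current search results", "All labelled", "Schema", "All".
    """
    # No schema, or no valid export name: nothing can be exported.
    if not schema_defined or not is_valid_export_name(name):
        return set()
    actions = {"Download", "Export to AI Center"}
    # Download to Excel is unavailable for Schema or backwards-compatible exports.
    if scope != "Schema" and not backwards_compatible:
        actions.add("Download to Excel")
    return actions
```

For example, `enabled_actions(True, "invoices_v1", "Schema", False)` leaves out Download to Excel, while the same call with the scope `"All"` includes it.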

You can export or download a schema even if it includes multivalued fields.

You can choose to export one of the following options:

  • Current search results - the labeled documents filtered by a predefined keyword/named batch or by a text query. If no filter is applied, all labeled documents in the current view are exported.
  • All labelled - all documents with at least one labeled field of any kind, that is, the documents returned by the labelled filter.
  • Schema - a zip file containing the fields and their configurations, which can be imported into a different Document Manager session.
  • All - all documents, whether or not labels are applied.

The Backwards-compatible export checkbox applies the legacy export behavior, which exports each page as a separate document. Try this option if a model trained on the default export performs below expectations. Leave it unchecked to export the documents in their original multi-page form.

Export validation

To export a dataset, every field must be labeled on at least 10 different pages. Otherwise, the export fails with an error message.

For Classification fields, there is an additional requirement: each option must be labeled in at least one document. Otherwise, the export fails with an error message.

When exporting only Evaluation set data, all validations are disabled.
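
A minimal sketch of these validation rules, useful for checking a labeling plan before starting an export. The data layout (a list of `(document_id, page_number, field_name, value)` tuples), the function name, and the error wording are all hypothetical; they are not the messages or data structures that Document Manager uses.

```python
from collections import defaultdict

MIN_LABELED_PAGES = 10


def find_validation_errors(fields, classification_options, labels,
                           evaluation_only=False):
    """Return a list of human-readable problems (empty if the export would pass).

    fields: names of all fields in the schema, Classification fields included.
    classification_options: dict mapping each Classification field to its options.
    labels: iterable of (document_id, page_number, field_name, value) tuples.
    """
    # Exporting only Evaluation set data skips every validation.
    if evaluation_only:
        return []

    pages_per_field = defaultdict(set)   # field -> {(document, page), ...}
    docs_per_option = defaultdict(set)   # (field, option) -> {document, ...}
    for doc_id, page, field, value in labels:
        pages_per_field[field].add((doc_id, page))
        if field in classification_options:
            docs_per_option[(field, value)].add(doc_id)

    errors = []
    # Rule 1: every field is labeled on at least 10 different pages.
    for field in fields:
        count = len(pages_per_field[field])
        if count < MIN_LABELED_PAGES:
            errors.append(
                f"field '{field}' is labeled on {count} page(s); "
                f"at least {MIN_LABELED_PAGES} are required"
            )
    # Rule 2: every Classification option is labeled in at least one document.
    for field, options in classification_options.items():
        for option in options:
            if not docs_per_option[(field, option)]:
                errors.append(
                    f"option '{option}' of field '{field}' "
                    "is not labeled in any document"
                )
    return errors
```

Pages are counted as distinct `(document_id, page_number)` pairs, so labeling the same field twice on one page counts once.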

Dataset format

The exported dataset is a folder produced by Document Manager. It includes:

  • schema.json: a file containing the fields to be extracted and their types
  • split.csv: a file that assigns each document to either the TRAIN or the VALIDATE split used by the Training Pipeline
  • images: a folder containing images of all the labeled pages
  • latest: a folder containing .json files with the labeled data from each page
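
As a quick sanity check after downloading, the layout listed above can be verified with a few lines of Python. This sketch only checks that the four entries exist; it makes no assumption about the contents of schema.json or split.csv, and the function name is hypothetical.

```python
from pathlib import Path

# Entries listed in the dataset format description.
EXPECTED_FILES = ("schema.json", "split.csv")
EXPECTED_DIRS = ("images", "latest")


def missing_dataset_entries(dataset_dir):
    """Return the names of expected files or folders that are absent
    from an exported dataset folder (empty list if all are present)."""
    root = Path(dataset_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

Calling `missing_dataset_entries("path/to/export")` on an unzipped export returns an empty list when the folder has the expected layout.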


Logs

The Logs tab displays the log of the most recent export.

If the export succeeds, the log shows the number of processed documents and the export duration.

If a schema export succeeds, the log shows the export duration.

While a file export is running, you can check its status here. This is particularly useful for large exports.

Error messages are also displayed in the Logs tab.

After a successful auto-retraining, the import logs from the fine-tune folder of the dataset are displayed as well.

© 2005-2024 UiPath. All rights reserved.