Document Understanding User Guide
Checkboxes and Signatures
Multiple-choice fields that use checkboxes come in a few kinds. Some are mutually exclusive, where only one option can be selected. Others are not mutually exclusive, and more than one option may be selected. Another important aspect is the number of choices available for a given multiple-choice field. In some cases there is a single option, where the checkbox is either checked or not, while in other cases there may be 10, 20, or more options arranged in a grid or table, as on many health forms.
There are four major ways in which you may label these kinds of multiple choice fields.
Let's take an example to understand how you can label the options. Suppose a form includes the options Project and Policy. In the first option, you have only one field and you label only the selected word: label the word Project if the checkbox next to it is checked, or the word Policy if the checkbox next to it is checked. If neither is checked, you label neither. Both being checked is not a valid state, so any such documents should be deleted from the training set.
This approach has the advantage that you have a single field, which requires less data. It also has the advantage that it does not rely on a successful detection of checkboxes. If a checkbox is detected as a letter X, the model can still learn to recognize that it means the option next to it is selected.
The disadvantage is that you need to make sure both options are roughly equally represented, which is not always the case. Potentially, in your training set, 90% of the documents might have Project checked. In this case, the model cannot perform well and this approach fails. The problem gets worse when you have more options because some of them are almost always rare. In these cases you may need to create fake documents with the rare options checked to balance things out.
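Before training with a single field, it can help to measure how balanced the labeled options actually are. The following is a minimal sketch, not part of Document Understanding: the function name `option_balance`, the `min_share` cutoff of 0.2, and the input format (one labeled option per document, or `None` when nothing is checked) are all assumptions made for illustration.

```python
from collections import Counter


def option_balance(labels, options, min_share=0.2):
    """Return each option's share of the labeled documents and the rare options.

    labels:    one entry per training document, holding the labeled option
               (for example "Project"), or None if no checkbox was checked.
    options:   every option the field can take, so that options which never
               appear in the data are reported with a share of 0.
    min_share: hypothetical cutoff below which an option counts as rare.
    """
    counts = Counter(label for label in labels if label is not None)
    total = sum(counts[option] for option in options)
    shares = {
        option: (counts[option] / total if total else 0.0)
        for option in options
    }
    rare = sorted(option for option in options if shares[option] < min_share)
    return shares, rare


# The 90% case described above: Project is checked in 9 of 10 documents.
labels = ["Project"] * 9 + ["Policy"]
shares, rare = option_balance(labels, ["Project", "Policy"])
print(rare)  # ['Policy']
```

Options reported as rare are the candidates for the additional, artificially created documents mentioned above.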
In the second option, using the same Project and Policy example, you have one field called Project, where you always label the checkbox for Project, and one field called Policy, where you always label the checkbox for Policy, whether or not they are checked. The advantage is that balance matters a lot less: even if one of the options is checked 90% of the time, the model still learns to recognize the checkboxes because they are always in the same place.
The downside is that you have two fields instead of one. When there are two options this may not be a big deal, but when there are 10-20 options, having 10-20 fields instead of one makes it a lot more difficult to label, and the model is harder to train, requiring more training data.
Another downside is that sometimes the checkbox might not be detected correctly, and you may need to add more complex logic in the workflow to handle the characters that can be returned in its place, such as X, V, or K. In some cases the OCR might even merge the checkbox with the word next to it, as in XProject, which requires even more complex RPA logic.
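The kind of post-processing this implies can be sketched as follows. This is an illustrative Python sketch rather than a UiPath workflow or API: the function name `parse_checkbox_token` and the `CHECKED_MARKS` set are assumptions, and the set contains only the three characters mentioned above, so it should be extended after inspecting real OCR output.

```python
import re

# Assumption: characters the OCR may return in place of a checked box.
CHECKED_MARKS = "XVK"


def parse_checkbox_token(token, option_names):
    """Interpret one OCR token extracted for a checkbox field.

    Returns (is_checked, option), where option is the matched option name,
    or None if the token does not contain a known option word.
    """
    text = token.strip()

    # Case 1: the token is only the mark, for example "X" or "v".
    if len(text) == 1 and text.upper() in CHECKED_MARKS:
        return True, None

    # Case 2: the mark was merged with the option word, for example "XProject".
    for name in option_names:
        pattern = rf"[{CHECKED_MARKS}]\s*{re.escape(name)}"
        if re.fullmatch(pattern, text, flags=re.IGNORECASE):
            return True, name

    # Case 3: the bare option word without a mark, treated as unchecked.
    for name in option_names:
        if text.lower() == name.lower():
            return False, name

    # Anything else (including an empty string) is treated as unchecked.
    return False, None


options = ["Project", "Policy"]
print(parse_checkbox_token("XProject", options))  # (True, 'Project')
print(parse_checkbox_token("Policy", options))    # (False, 'Policy')
```

Note that this sketch cannot distinguish a merged mark from an option whose name itself starts with one of the mark characters followed by another option name, so ambiguous option names would need extra handling.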
The third option uses multivalued fields, which are part of the 2022.10 release of Document Understanding. It is easier to label, it is not affected by unbalanced choices being checked, and it is not affected by a large number of options. However, it still relies on accurate checkbox detection and carries the risk that checkboxes might be merged with the options next to them. OCR errors are very hard to defend against.
The fourth option also makes labeling easier and is less sensitive to checkbox detection errors, but it might be more sensitive to unbalanced options, just like the first option.
In our experience, any of these options may be appropriate in some situations. We initially preferred the first option. However, as the accuracy of checkbox detection in UiPath Document OCR has improved, we have gravitated more towards options two and three. These two options also have another major advantage: they are compatible between Forms AI and our AI Center based ML packages. You can start with Forms AI and then, if the accuracy is lower than expected, move the dataset to a Document Manager session and train an ML model directly without any other changes. This path has become particularly interesting as our ML packages have become more powerful and require less training data.
Starting with the 2022.4 LTS Enterprise release, signatures can be detected using UiPath Document OCR, so Machine Learning Models can detect signatures directly.
Label a signature the same way you label any other field in your document. Once the signature is detected by UiPath Document OCR, the Machine Learning Model learns to recognize the field as a signature.