Checkboxes & Signatures
In the context of an ML Extraction model, a checkbox is not an actual value, but a way of selecting a certain piece of text.
Because of this, the point of focus is the word next to the checkbox, not the checkbox itself. This is precisely the purpose of a checkbox: to act as an anchor for a specific word.
Consequently, to train an ML model, you need to label the word, not the checkbox.
In some cases, the checkbox is not detected. For instance, the OCR could read it as an X, or maybe it is just some handwritten mark that is not picked up at all. The ML model can learn and associate all of these situations with the word next to the mark.
So, it is more robust to train a model to recognize a word, regardless of how it is selected: with a checkbox, with an X, or with a handwritten mark (circled, underlined, etc.).
For the above example, you can create three fields in Data Manager as follows:
- condition-employment (label the YES word);
- condition-auto-accident (label the YES word);
- condition-other-accident (label the NO word).
The ML model learns to recognize those words, whether they are marked by checkboxes, X’s, or just circled in pen. To do so, you could use UiPath Document OCR, which can also recognize checkboxes.
In some cases, no label is associated with a checkbox, for instance when checkboxes are part of tables.
Here is a typical example:
In this case, it is necessary to label the boxes. The extractor will return the string value of the checkbox, which is one of these two characters:
- ☒ (checked)
- ☐ (unchecked)
Moreover, the IntelligentOCR framework knows how to recognize these, especially if a field is defined as Boolean:
- if the extractor returns ☒, this corresponds to YES;
- if the extractor returns ☐, this corresponds to NO.
In cases where an unchecked box is returned as O or D, or a checked box is returned as X, V, K, or R, these characters can also be handled in the RPA workflow logic to make the workflow more robust when these kinds of OCR errors occur.
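The mapping described above can be sketched in a few lines of Python. This is only an illustration of the logic, not part of the IntelligentOCR framework: the function name `checkbox_to_bool` and the two lookup sets are hypothetical, and the case-insensitive matching of the OCR variants is an added assumption.

```python
from typing import Optional

# Characters treated as a checked box: the real glyph plus the OCR
# misreads listed above (X, V, K, R).
CHECKED = {"☒", "X", "V", "K", "R"}
# Characters treated as an unchecked box: the real glyph plus O and D.
UNCHECKED = {"☐", "O", "D"}


def checkbox_to_bool(raw: Optional[str]) -> Optional[bool]:
    """Map an extracted checkbox string to True (YES) or False (NO).

    Returns None when the value is missing or not a recognized character,
    so the caller can route the document to manual validation.
    """
    if raw is None:
        return None
    # Assumption: surrounding whitespace and letter case are irrelevant.
    value = raw.strip().upper()
    if value in CHECKED:
        return True
    if value in UNCHECKED:
        return False
    return None
```

Returning `None` for anything unrecognized, rather than defaulting to NO, keeps an unexpected OCR result from being silently read as an unchecked box.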
Signatures are visual features that are not detected by any OCR engine, so an ML model cannot detect them directly.
However, UiPath ML Models learn by looking at both words and pixels on the image. It is possible to do signature detection by making use of this.
Take the form below as an example.
At the end of the page, next to the signature, there is the text Signature of U.S. person. It does not matter what the text is, as long as it is close enough to the signature (whenever the signature exists). Dealing with a signature is similar to dealing with a checkbox - see the Checkboxes section above.
You can create a text field called signature and when the document has a signature, you label the words Signature of U.S. person as the signature field. When the document does not have a signature, you leave the field empty.
Then make sure that approximately half of the documents in your training set have a signature and half do not. A 60/40 split is also acceptable, but not 80/20 or 90/10. You also need at least 20-30 samples of each kind for the model to be able to learn this.
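A quick sanity check of these proportions before training could look like the sketch below. The function `check_signature_balance` and its parameter names are hypothetical. The default thresholds (a majority share of at most 60% and at least 20 samples per class) simply take the 60/40 split as the limit and the lower end of the 20-30 range as the minimum.

```python
from typing import List


def check_signature_balance(
    with_signature: int,
    without_signature: int,
    min_per_class: int = 20,
    max_majority_share: float = 0.60,
) -> List[str]:
    """Return a list of problems with the training set; empty means OK."""
    total = with_signature + without_signature
    if total == 0:
        return ["training set is empty"]

    issues = []
    # Each class needs enough samples for the model to learn from.
    if with_signature < min_per_class:
        issues.append("too few documents with a signature")
    if without_signature < min_per_class:
        issues.append("too few documents without a signature")

    # The larger class must not dominate the set (60/40 is fine, 80/20 is not).
    majority_share = max(with_signature, without_signature) / total
    if majority_share > max_majority_share:
        issues.append("signed and unsigned documents are too unbalanced")
    return issues
```

For example, `check_signature_balance(60, 40)` reports no issues, while `check_signature_balance(80, 20)` flags the imbalance.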
In this way, you can use the ML model to do signature detection.
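Once the model is trained this way, a downstream step can treat a non-empty signature field as "signature present". The sketch below assumes the extraction results have already been collected into a plain dictionary of field names and text values. That dictionary shape and the helper `has_signature` are illustrative assumptions, not a UiPath API.

```python
from typing import Mapping, Optional


def has_signature(
    fields: Mapping[str, Optional[str]],
    field_name: str = "signature",
) -> bool:
    """Return True when the signature field was extracted with some text.

    An absent, None, or blank value is read as "no signature", mirroring
    how unsigned documents are labeled (the field is left empty).
    """
    value = fields.get(field_name)
    return bool(value and value.strip())
```

For example, `has_signature({"signature": "Signature of U.S. person"})` is `True`, and `has_signature({"signature": ""})` is `False`.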