The UIAutomation activities package contains all the basic activities used for creating automation projects.
Starting with UiPath.UIAutomation.Activities v19.11, all Abbyy-related activities have been moved to a separate package. Install the UiPath.Abbyy.Activities package if you want to use its activities for OCR, Cloud OCR, classification, and data extraction.
These activities enable the robots to:
- Simulate human interaction, such as performing mouse and keyboard commands or typing and extracting text, for basic UI automation.
- Use technologies such as OCR or image recognition to perform Image and Text Automation.
- Create triggers based on UI behavior, thus enabling the Robot to execute certain actions when specific events occur on a machine.
- Perform browser interaction and window manipulation.
The UiPath.Vision dependency package includes third-party libraries. These external dependencies are used only to implement specific activities in the UiPath.UIAutomation.Activities package.
Here are some examples:
- AbbyyOnlineSdk.dll - used exclusively in the Abbyy Cloud OCR activity, at run time, as a wrapper over the Abbyy online service calls.
- Interop.FREngine.v11.dll - used exclusively in the Abbyy OCR activity, at run time, as a wrapper over the Abbyy FineReader Engine calls.
- Interop.MODI.dll - used exclusively in the Microsoft OCR activity, at run time, when executed on a Windows 7 or Windows Server machine.
As of v2018.3, the UiPath.Core.Activities package was split into the UIAutomation and System packs. Find out more about the Core Activities Split.
Particular scenarios might require strict management of UIAutomation dependency versions. For example, a language for the Tesseract OCR engine must be manually installed for each UiPath.Vision version. This means that processes using that language must use the corresponding version of the UIAutomation activities package. You can find out more on this page.
The UIAutomation activities package contains the following internally-developed dependencies:
- UiPath.Vision - enables the functionality of OCR and Computer Vision engines.
- UiPath - an essential library for UIAutomation activities.
The table below lists the dependencies shipped with each version of the UiPath.UIAutomation.Activities package:
Because the Computer Vision activities moved to the UIAutomation pack in 19.10, installing the UIAutomation v19.10.1 pack in a project that already contains a version of the Computer Vision pack throws an error.
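The conflict described above can be checked for before installing the package. The following is a minimal Python sketch, not part of any UiPath tooling. It assumes the project's dependencies are available as a plain name-to-version dictionary, that versions are purely numeric, and that the standalone pack is named `UiPath.ComputerVision.Activities` (an assumed package ID). The 19.10 threshold comes from the paragraph above.

```python
# Illustrative only: the dictionary layout and the Computer Vision
# package ID below are assumptions, not a documented UiPath format.
UI_AUTOMATION = "UiPath.UIAutomation.Activities"
COMPUTER_VISION = "UiPath.ComputerVision.Activities"  # assumed ID


def parse_version(text):
    """Turn a purely numeric version such as '19.10.1' into (19, 10, 1)."""
    return tuple(int(part) for part in text.split("."))


def has_cv_conflict(dependencies):
    """Return True when UIAutomation 19.10 or later would sit next to
    a standalone Computer Vision pack in the same project."""
    ui_version = dependencies.get(UI_AUTOMATION)
    if ui_version is None or COMPUTER_VISION not in dependencies:
        return False
    return parse_version(ui_version) >= (19, 10)
```

For example, a dictionary holding UIAutomation `19.10.1` alongside any version of the assumed Computer Vision pack is reported as a conflict, while UIAutomation `19.9.2` with the same pack is not.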
The AI Computer Vision pack contains refactored fundamental UIAutomation activities such as Click, Type Into, or Get Text. The main difference between the CV activities and their classic counterparts is that they use the Computer Vision neural network developed in-house by our Machine Learning department. The neural network can identify UI elements such as buttons, text input fields, or check boxes without the use of selectors.
Created mainly for automation in virtual desktop environments, such as Citrix machines, these activities bypass the issue of nonexistent or unreliable selectors. They send images of the window you are automating to the neural network, where the images are analyzed and all UI elements are identified and labeled according to what they are. Smart anchors are then used to pinpoint the exact location of the UI element you are interacting with, ensuring that the action you intend to perform is successful.
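The idea of locating a target through an anchor can be illustrated with a small Python sketch. This is not the algorithm used by the Computer Vision activities, whose internals are not described here. It assumes the analysis returns a flat list of labeled elements with bounding boxes, and it picks the element of the requested kind that lies closest to a uniquely matching text anchor. The `Element` type and `resolve_target` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Element:
    """Hypothetical shape of one detected UI element."""
    kind: str      # for example "text", "input", or "button"
    text: str      # visible text, empty for unlabeled controls
    x: float
    y: float
    width: float
    height: float

    @property
    def center(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


def resolve_target(elements, target_kind, anchor_text):
    """Return the element of target_kind nearest to the unique text anchor."""
    anchors = [e for e in elements if e.kind == "text" and e.text == anchor_text]
    if len(anchors) != 1:
        # Zero or several matches: the anchor cannot pinpoint the target.
        raise LookupError(f"anchor {anchor_text!r} matched {len(anchors)} elements")
    candidates = [e for e in elements if e.kind == target_kind]
    if not candidates:
        raise LookupError(f"no element of kind {target_kind!r} found")
    ax, ay = anchors[0].center
    return min(candidates, key=lambda e: hypot(e.center[0] - ax, e.center[1] - ay))
```

With two input fields that look identical, asking for the `"input"` nearest to the `"Password"` text picks the field next to that label rather than the one next to `"Username"`. A non-unique anchor raises an error instead of guessing.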
To use the Computer Vision activities in the current project, you need an ApiKey, which can be obtained from Automation Cloud as detailed here.
The ApiKey must then be inserted in the ApiKey field in the Computer Vision Project Settings property category. You can see the Project Settings page for more information.
The settings regarding the server connection are project-wide, and are reflected in all subsequent CV Screen Scope activities.
All of the activities in this pack only function when inside a CV Screen Scope activity, which establishes the actual connection to the neural network server, thus enabling you to analyze the UI of the apps you want to automate. Any workflow using the Computer Vision activities must begin with dragging a CV Screen Scope activity to the Designer panel. Once this is done, the Indicate on screen button in the body of the scope activity can be used to select the area of the screen that you want to work in.
Double-clicking the informative screenshot displays the image that has been captured and highlights in purple all of the UI elements that have been identified by the neural network and OCR engine.
Area selection can also be used to indicate only a portion of the UI of the application you want to automate. This is especially useful in situations where there are multiple text fields that have the same label and cannot be properly identified.
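To see why restricting the area helps with repeated labels, consider the following Python sketch. It is purely illustrative and makes its own assumptions: detected elements are represented as `(label, (x, y, width, height))` pairs, and the selected area is a rectangle in the same coordinates. Both function names are hypothetical.

```python
def inside(box, region):
    """True when box lies fully within region; both are (x, y, width, height)."""
    x, y, w, h = box
    rx, ry, rw, rh = region
    return rx <= x and ry <= y and x + w <= rx + rw and y + h <= ry + rh


def elements_in_region(elements, region):
    """Keep only the (label, box) pairs whose box falls inside the region."""
    return [(label, box) for label, box in elements if inside(box, region)]


def count_label(elements, label):
    """Count how many detected elements carry the given label."""
    return sum(1 for name, _ in elements if name == label)
```

If a window contains two fields labeled `"Name"`, one in the upper half and one in the lower half, counting over the whole window gives two matches. Counting over only the elements inside the lower half gives one, so the label identifies a single field again.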
Once a CV Screen Scope activity is properly configured, you can start using all of the other activities in the pack to build your automation.
The activities that perform actions on UI elements can be configured at design time by using the Indicate On Screen button present in the body of the activities. The activities that have this feature are:
Clicking the Indicate On Screen (hotkey: I) button opens the helper wizard.
The CV Click, CV Hover, and CV Type Into activities also feature a Relative To button in the helper wizard, which enables you to configure the target as being relative to an element.
The Indicate field specifies what you are indicating at the moment. When the helper is opened for the first time, the Target needs to be indicated. For each possible target, the wizard automatically selects an anchor, if one is available.
After successfully indicating the Target, the wizard closes and the activity is configured with the target you selected.
If no unique anchor is automatically identified, the Indicate field informs you of this fact, enabling you to indicate additional Anchors, which make the target easier to find.
The Show Elements (hotkey: s) button in the wizard highlights all UI elements that have been identified by the Computer Vision analysis, making it easier for you to choose what to interact with.
The Refresh Scope (hotkey: F5) button can be used at design time, in case something changes in the target app, enabling you to send a new picture to the CV server to be analyzed again.
The Refresh After Delay (hotkey: F2) button performs a refresh of the target app after waiting 3 seconds.
Please remember that whenever you choose to submit errors in the behavior of the neural network, you are helping it learn and indirectly helping us give you a better product. Submit as many issues as you can, as this gives us the opportunity to acknowledge and fix them.