
AI Computer Vision user guide

Last updated May 8, 2026

Introduction

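AI Computer Vision is a machine-learning-based method for visually identifying all UI elements on a computer screen and interacting with them through UiPath robots, simulating human interaction. It neither requires nor uses the underlying properties of the application; it needs only the appearance of the on-screen elements and the relationships between them.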

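AI Computer Vision does not rely on selectors. Instead, it uses AI (object detection, OCR, fuzzy text matching, icon image matching) and an anchoring system that ties all of these together. More precisely, to locate elements visually on the screen, AI Computer Vision performs element detection and text (OCR) detection on the machine learning server and combines the two to form a comprehensive understanding of the UI. The relationships between the elements detected by these two methods are then encoded into a multi-anchor descriptor that uniquely identifies the target element.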
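The idea of combining detected elements, OCR text, fuzzy matching, and anchors can be illustrated with a toy sketch. This is not UiPath's implementation or API: the `Box` type, the `find_anchor` and `locate_target` helpers, the pixel offsets, and the 0.7 similarity threshold are all hypothetical, chosen only to show how several text anchors can jointly single out one target element even when OCR misreads a label.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Box:
    """A detected item: a UI element (object detection) or a text run (OCR)."""
    kind: str   # hypothetical labels such as "input", "button", "text"
    text: str   # OCR text; empty for non-text elements
    x: float    # center x, in pixels
    y: float    # center y, in pixels


def text_similarity(a: str, b: str) -> float:
    """Case-insensitive fuzzy match in [0, 1], tolerant of small OCR errors."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_anchor(boxes, label, threshold=0.7):
    """Return the text box that best matches `label`, or None if no match is close enough."""
    texts = [b for b in boxes if b.kind == "text"]
    best = max(texts, key=lambda b: text_similarity(b.text, label), default=None)
    if best is None or text_similarity(best.text, label) < threshold:
        return None
    return best


def locate_target(boxes, target_kind, anchors):
    """Pick the element of `target_kind` whose position best fits every anchor.

    `anchors` maps an anchor label to the expected (dx, dy) offset
    from that anchor's center to the target's center.
    """
    resolved = []
    for label, offset in anchors.items():
        anchor = find_anchor(boxes, label)
        if anchor is None:
            return None  # an anchor is missing, so the descriptor does not match
        resolved.append((anchor, offset))

    def cost(candidate):
        # Sum of absolute deviations from the expected offset of each anchor.
        return sum(
            abs(candidate.x - anchor.x - dx) + abs(candidate.y - anchor.y - dy)
            for anchor, (dx, dy) in resolved
        )

    candidates = [b for b in boxes if b.kind == target_kind]
    return min(candidates, key=cost, default=None)


# Simulated detections for a login form; "Usernarne" imitates an OCR misread.
screen = [
    Box("text", "Usernarne", 100, 50),
    Box("input", "", 250, 50),
    Box("text", "Password", 100, 90),
    Box("input", "", 250, 90),
    Box("button", "", 250, 130),
]

target = locate_target(
    screen, "input", {"Username": (150, 0), "Password": (150, -40)}
)
print(target)
```

Using two anchors rather than one is what disambiguates the two visually identical input fields: only the first input sits at the expected offset from both labels.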

AI Computer Vision consists of a set of activities that are part of the UI Automation activity package, plus a server (cloud, on-premises, or local) hosting an AI model, which performs the actual analysis of the UI you're automating. By default, the UiPath cloud server is used, and it is also the recommended option for all AI Computer Vision and UI Automation activities. You can use the AI Computer Vision cloud regardless of your deployment type. For instance, whether you run Orchestrator on-premises or in the cloud, you can use the Computer Vision cloud with no special configuration.

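Alternatively, you can host and manage your own on-premises AI Computer Vision server and use it to run AI Computer Vision activities. With this type of server, you need your own hardware infrastructure (GPU) or cloud environment. You also need to deploy, update, and maintain the environment yourself. Compared with the UiPath cloud server, you may also encounter backward compatibility issues when upgrading the AI model.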

A local server is another option. It runs on the local CPU and is the most portable version. However, it is slower and has slightly lower detection accuracy.

Benefits

Here are some of the AI Computer Vision capabilities you can benefit from:

  • Automation beyond selectors - Enables robots to recognize and interact with more on-screen fields and components, including Flash, Silverlight, PDFs, and images.
  • Reliable on VDIs and desktops - Relieves issues with failure-prone image automation techniques and with selector-based targeting on desktops. Start by creating automations within Citrix, VMware, or Microsoft Remote Desktop.
  • Broad range of interface types - Includes VDI environments (Citrix, VMware, Microsoft RDP, VNC, and others) for desktop and web applications. Save time by having UI elements identified and added to the Object Repository for you.
  • Intelligent, intuitive capabilities - Provides details, validation, and notifications about on-screen selections via an on-screen wizard. Use the recorder to easily generate full vision-based automations.
  • Run-time auto-scroll support - Easily automate scrollable content in webpages or apps using AI Computer Vision activities.
  • Cross-platform capabilities - Automate for Windows, Linux, Android and other operating systems through remote desktops.
  • Automation between VDI & non-VDI - Simplifies VDI-to-desktop automation by reducing necessary modifications.
  • Multiple deployment options - Deploys via SaaS; available on-premises for Linux and Windows, or right from your desktop.
  • Dynamic UI elements - Enables automations that include tables, drop-down lists, and checkbox elements. This increases the resilience of your automations, enabling them to adapt to small changes to the UI and interact with these dynamic elements.
  • Available in UI Automation as part of Unified Target - Reduces the complexity of building UI-based automations when you need both selectors and AI Computer Vision descriptors.

Deployment options

For a side-by-side comparison of the existing AI Computer Vision deployment options, see the AI Computer Vision differences section in the Overview guide.
