Automation Suite

2023.4

Automation Suite on Linux Installation Guide

Last updated Apr 24, 2024

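Using the monitoring stack

The monitoring stack for Automation Suite clusters includes Prometheus, Grafana, and Alertmanager, which are integrated within the Rancher Cluster Explorer UI.

Note:

Node failures can cause Kubernetes to shut down, which would interrupt Prometheus alerts. To prevent this, we recommend setting up a separate alert on the RKE2 server.

This page describes a series of monitoring scenarios. For more details, see the official Rancher documentation on using Rancher Monitoring.

Important:

When you use collectors to export metrics to third-party tools, enabling application monitoring can disrupt the functionality of Automation Suite.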

Accessing the monitoring tools

Overview

The monitoring stack for Automation Suite clusters includes Prometheus, Grafana, Alertmanager, and the Longhorn dashboard.

You can access the Automation Suite monitoring tools individually using the following URLs:

Application	Tool	URL	Example
Metrics	Prometheus	`https://monitoring.fqdn/metrics`	`https://monitoring.automationsuite.mycompany.com/metrics`
Dashboard	Grafana	`https://monitoring.fqdn/dashboard`	`https://monitoring.automationsuite.mycompany.com/dashboard`
Alert management	Alertmanager	`https://monitoring.fqdn/alertmanager`	`https://monitoring.automationsuite.mycompany.com/alertmanager`
Persistent block storage	Longhorn dashboard	`https://monitoring.fqdn`	`https://monitoring.automationsuite.mycompany.com`

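Authentication

The first time you access the monitoring tools, log in as an admin with the following default credentials:

Username: admin
Password: to retrieve the password, run the following command: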
```
kubectl get secrets/dex-static-credential -n uipath-auth -o "jsonpath={.data['password']}" | base64 -d
```

To update the default password used to access the monitoring tools, take the following steps:

Run the following command, replacing newpassword with your new password:

password="newpassword"
password=$(echo -n "$password" | base64)
kubectl patch secret dex-static-credential -n uipath-auth --type='json' -p="[{'op': 'replace', 'path': '/data/password', 'value': '$password'}]"
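
Because the secret stores the password base64-encoded, it can help to confirm the encoding before patching. The following is a minimal sketch, not part of the official procedure: it encodes the placeholder value newpassword, decodes it again, and compares the two. The variable names are illustrative only.

```shell
#!/usr/bin/env sh
# Sketch: encode a new password for the "data" field of a Kubernetes secret
# and confirm that it decodes back to the same value.
# "newpassword" is a placeholder, as in the command above.
password='newpassword'

# printf '%s' writes the value without a trailing newline; tr removes any
# line wrapping that base64 may add for long inputs.
encoded=$(printf '%s' "$password" | base64 | tr -d '\n')

# Decode again to verify the round trip before patching the secret.
decoded=$(printf '%s' "$encoded" | base64 --decode)

if [ "$decoded" = "$password" ]; then
  echo "round trip ok: $encoded"
else
  echo "round trip failed" >&2
fi
```

If the round trip fails, the usual cause is a stray newline or an unquoted special character in the original value.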

Run the following command, replacing <cluster_config.json> with the path to your configuration file:

/opt/UiPathAutomationSuite/UiPath_Installer/install-uipath.sh -i <cluster_config.json> -f -o output.json --accept-license-agreement

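Checking currently firing alerts

To view alerts, navigate to Prometheus at https://monitoring.fqdn/metrics, and then click the Alerts tab. Here you can see all the alerts configured in Automation Suite.

To view active alerts, select the Firing and Show annotations checkboxes at the top to filter by alert state. Here you can see all currently firing alerts and their corresponding messages.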

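Silencing alerts

If an alert is too noisy, you can silence it. To do so, take the following steps:

Click the Alertmanager tile in the upper-left corner of the monitoring dashboard.
Find the alert in question, and select Silence.
Fill in the Creator and Comment details, and then click Create. The alert should no longer appear on the monitoring dashboard or be reported to any configured receiver.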

Configuring alerts

Note:

You can find uipathctl in the Automation Suite installation folder: .../UiPathAutomationSuite/UiPath_Installer/bin.

Adding a new email configuration

To add a new email configuration after installation, run the following command:

./uipathctl config alerts add-email \
  --name test \
  --to "admin@example.com" \
  --from "admin@example.com" \
  --smtp server.mycompany.com \
  --username admin \
  --password somesecret \
  --require-tls \
  --ca-file <path_to_ca_file> \
  --cert-file <path_to_cert_file> \
  --key-file <path_to_key_file> \
  --send-resolved

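Flag	Description	Example
`name`	Name of the email configuration	`testconfig`
`to`	Email address of the recipient	`admin@example.com`
`from`	Email address of the sender	`admin@example.com`
`smtp`	URL or IP address and port number of the SMTP server	`server.mycompany.com:567`
`username`	Authentication username	`admin`
`password`	Authentication password	`securepassword`
`require-tls`	Boolean flag indicating that TLS is enabled on the SMTP server.	N/A
`ca-file`	Path to the file containing the CA certificate of the SMTP server. This is optional if the CA is private.	`./ca-file.crt`
`cert-file`	Path to the file containing the certificate of the SMTP server. This is optional if the certificate is private.	`./cert-file.crt`
`key-file`	Path to the file containing the private key of the SMTP server certificate. This is required if the certificate is private.	`./key-file.crt`
`send-resolved`	Boolean flag to send an email once the alert is resolved.	N/A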
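
To illustrate how the value flags and the boolean flags combine, the following sketch assembles an add-email command from variables and prints it for review instead of running it. It uses only flags from the table above; the variable names and the USE_TLS toggle are assumptions made for illustration, and the credential and certificate flags are left out for brevity.

```shell
#!/usr/bin/env sh
# Sketch: build the add-email flags from variables and print the command.
# All values are the placeholder examples from the table above.
NAME="testconfig"
TO="admin@example.com"
FROM="admin@example.com"
SMTP="server.mycompany.com:567"
USE_TLS="true"   # hypothetical toggle, not a uipathctl setting

# Value flags are always passed as "--flag value" pairs.
set -- config alerts add-email --name "$NAME" --to "$TO" --from "$FROM" --smtp "$SMTP"

# require-tls and send-resolved are boolean flags, so they take no value.
if [ "$USE_TLS" = "true" ]; then
  set -- "$@" --require-tls
fi
set -- "$@" --send-resolved

cmd="./uipathctl $*"
echo "$cmd"
```

Printing the command first makes it easy to check the flags before running it against the cluster.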

Removing an email configuration

To remove an email configuration, run the following command. Make sure you pass the name of the email configuration you want to remove.

./uipathctl config alerts remove-email --name test

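Updating an email configuration

To update an email configuration, run the following command. Make sure you pass the name of the email configuration you want to update, along with any optional parameters you want to edit. These parameters are the same as those used to add a new email configuration. You can pass one or more flags at the same time.

./uipathctl config alerts update-email --name test [additional_flags]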

Accessing the Grafana dashboard

To access the Grafana dashboards, you must retrieve the credentials and use them to log in:

Username:

kubectl -n cattle-monitoring-system get secrets/rancher-monitoring-grafana -o "jsonpath={.data.admin-user}" | base64 -d; echo

Password:

kubectl -n cattle-monitoring-system get secrets/rancher-monitoring-grafana -o "jsonpath={.data.admin-password}" | base64 -d; echo

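Monitoring the service mesh

You can monitor the Istio service mesh through the following Grafana dashboards: Istio Mesh and Istio Workload.

Istio Mesh dashboard

This dashboard shows the overall request volume and the 400 and 500 error rates across the entire service mesh for the time period selected in the upper-right corner of the window. See the four charts at the top for this information.

It also shows the instantaneous success rate over the past minute for each individual service. Note that a success rate of NaN indicates that the service is currently not serving traffic.

Istio Workload dashboard

This dashboard shows traffic metrics over the time range selected in the upper-right corner of the window.

Use the selectors at the top of the dashboard to drill into specific workloads. Of particular interest is the uipath namespace.

The top section shows overall metrics, the Inbound Workloads section breaks down traffic by origin, and the Outbound Services section breaks down traffic by destination.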

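Monitoring persistent volumes

You can monitor persistent volumes through the Kubernetes / Persistent Volumes dashboard. You can track the free and used space for each volume.

You can also check the status of each volume by clicking the PersistentVolumes item in the Storage menu of the Cluster Explorer.

Monitoring hardware utilization

To check the hardware utilization of each node, you can use the Nodes dashboard. Data on CPU, memory, disk, and network is available.

You can monitor the hardware utilization of specific workloads using the Kubernetes / Compute Resources / Namespace (Workloads) dashboard. Select the uipath namespace to get the needed data.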

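Creating a shareable visual snapshot of a Grafana chart

Click the downward arrow next to the chart title, and then select Share.
Click the Snapshot tab, and set the Snapshot name, Expire, and Timeout.
Click Publish to snapshot.raintank.io.

For more details, see the Grafana documentation on sharing dashboards.

Note: Anyone on the public internet who has the link can view this snapshot.

Creating custom persistent Grafana dashboards

For details on how to create custom persistent Grafana dashboards, see the Rancher documentation.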

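Admin access to Grafana

Admin access to Grafana is not typically needed in Automation Suite clusters, because anonymous users have read access to dashboards by default, and custom persistent dashboards must be created using the Kubernetes-native instructions linked above in this document.

Nonetheless, admin access to Grafana is possible with the following instructions.

The default username and password for Grafana admin access can be retrieved as follows: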

kubectl get secret -n cattle-monitoring-system rancher-monitoring-grafana -o jsonpath='{.data.admin-user}' | base64 -d && echo
kubectl get secret -n cattle-monitoring-system rancher-monitoring-grafana -o jsonpath='{.data.admin-password}' | base64 -d && echo

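Note that highly available Automation Suite clusters run multiple Grafana pods to provide uninterrupted read access if a node fails and to handle a higher volume of read queries. This is incompatible with admin access, because the pods do not share session state, and logging in requires it. To work around this, you must temporarily scale the number of Grafana replicas down to 1 while admin access is needed. The following commands show how to scale the number of Grafana replicas: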

# scale down
kubectl scale -n cattle-monitoring-system deployment/rancher-monitoring-grafana --replicas=1
# scale up
kubectl scale -n cattle-monitoring-system deployment/rancher-monitoring-grafana --replicas=2
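
The two scale commands can be wrapped in small helper functions so that scaling back up is not forgotten after the admin session. This is only a sketch under assumptions: the function names and the KUBECTL override are hypothetical, and the restore count of 2 mirrors the command above rather than a value read from the cluster.

```shell
#!/usr/bin/env sh
# Sketch: helper functions around the two scale commands shown above.
# KUBECTL is a hypothetical override; set it to "echo kubectl" for a dry run.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="cattle-monitoring-system"
DEPLOYMENT="deployment/rancher-monitoring-grafana"

# Scale the Grafana deployment to the replica count given as $1.
scale_grafana() {
  $KUBECTL scale -n "$NAMESPACE" "$DEPLOYMENT" --replicas="$1"
}

# Scale down to one replica, wait until the operator confirms that the admin
# session is finished, then scale back up to the count in $1 (default 2).
grafana_admin_window() {
  restore_to="${1:-2}"
  scale_grafana 1
  printf 'Press Enter when the admin session is finished... '
  read -r _answer
  scale_grafana "$restore_to"
}
```

For a dry run that only prints the commands, set KUBECTL="echo kubectl" before calling grafana_admin_window 2.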

Querying Prometheus

Click Prometheus Graph on the monitoring dashboard. A new window opens.

Documentation for the available metrics is listed below:

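Creating custom alerts

You can create custom alerts using a Prometheus query with a boolean expression.

To do so, click Prometheus Rules in the Advanced menu of the monitoring dashboard.
Click Create in the upper-right corner of the window to create a new alert, and follow the Rancher documentation: Prometheus Rules.
When the alert fires, it should appear on the monitoring dashboard. It is also routed to any configured receiver.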
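
As an illustration of a rule built on a boolean expression, the sketch below writes a PrometheusRule manifest to a local file for review. The rule name, alert name, threshold, and file name are hypothetical, and the example assumes that the kube-state-metrics metric kube_pod_container_status_restarts_total is available and that rules in the cattle-monitoring-system namespace are picked up by the bundled Prometheus.

```shell
#!/usr/bin/env sh
# Sketch: write an example PrometheusRule manifest to a local file.
# All names and thresholds below are hypothetical.
cat > custom-alert-rule.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: custom-pod-restart-rule
  namespace: cattle-monitoring-system
spec:
  groups:
    - name: custom.rules
      rules:
        - alert: UiPathPodRestartingOften
          # Boolean expression: true when a container in the uipath
          # namespace restarted more than 3 times in the last 15 minutes.
          expr: increase(kube_pod_container_status_restarts_total{namespace="uipath"}[15m]) > 3
          # The condition must hold for 5 minutes before the alert fires.
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Container {{ $labels.container }} in pod {{ $labels.pod }} is restarting frequently."
EOF

echo "Wrote custom-alert-rule.yaml"
# To apply the rule (requires cluster access):
#   kubectl apply -f custom-alert-rule.yaml
```

The same fields (alert name, expression, duration, labels, annotations) correspond to what the Create form in the UI asks for.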

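Monitoring Kubernetes resource status

To see the status of pods, deployments, statefulsets, and so on, you can use the Cluster Explorer UI. This is the same landing page you reach after logging in to the rancher-server endpoint. The home page shows a summary, with drill-downs into the details of each resource type on the left. Note the namespace selector at the top of the page. This dashboard can also be replaced with the Lens tool.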

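Exporting Prometheus metrics to an external system

Prometheus can export the metrics it collects to external systems through its remote write feature.

Note: UiPath™ does not support or maintain remote write endpoint integrations. However, such endpoints are compatible with the Prometheus instance provided in Automation Suite.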

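To configure remote_write on an Automation Suite cluster:

Connect to ArgoCD.
Click Applications.
Navigate to fabric-installer.
Open the App Details panel and disable Self Heal.
Navigate to the rancher-monitoring application.
Open the App Details panel > Manifest tab.
Click Edit, and navigate to the values > prometheus > prometheusSpec section.
Add the needed remoteWrite configuration.

Explore the available configurations for the remote write feature.
Save the new configuration. The rancher-monitoring application shows Out of sync until the new configuration is applied.

Note: Prometheus does not need to be restarted for the new remote write configuration to apply.
Test the required remote write integration. To add further configurations, return to the step where you add the remoteWrite configuration.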
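
For orientation, the sketch below writes an example of what the remoteWrite entry under prometheusSpec might look like to a local file. The endpoint URL, the secret name, and the relabel filter are hypothetical placeholders; the field names follow the Prometheus Operator remote write spec, and the referenced secret is assumed to exist in the same namespace as Prometheus.

```shell
#!/usr/bin/env sh
# Sketch: write an example remoteWrite values fragment to a local file.
# The URL, secret name, and filter are hypothetical placeholders.
cat > remote-write-values.yaml <<'EOF'
prometheus:
  prometheusSpec:
    remoteWrite:
      - url: https://metrics.example.com/api/v1/write
        basicAuth:
          username:
            name: remote-write-credentials
            key: username
          password:
            name: remote-write-credentials
            key: password
        writeRelabelConfigs:
          # Forward only series from the uipath namespace (assumed filter).
          - sourceLabels: [namespace]
            regex: uipath
            action: keep
EOF

echo "Wrote remote-write-values.yaml"
```

The fragment is meant to be pasted into the values section in the ArgoCD manifest editor, not applied with kubectl directly.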