- 概述
- 要求
- 推荐:部署模板
- 手动:准备安装
- 手动:准备安装
- 步骤 1:为离线安装配置符合 OCI 的注册表
- 步骤 2:配置外部对象存储
- 步骤 3:配置 High Availability Add-on
- 步骤 4:配置 Microsoft SQL Server
- 步骤 5:配置负载均衡器
- 步骤 6:配置 DNS
- 步骤 7:配置内核和操作系统级别设置
- Step 8: Configuring the disks
- 步骤 9:配置节点端口
- 步骤 10:应用其他设置
- 步骤 12:验证并安装所需的 RPM 包
- 步骤 13:生成 cluster_config.json
- 证书配置
- 数据库配置
- 外部对象存储配置
- 预签名 URL 配置
- 符合 OCI 的外部注册表配置
- Disaster Recovery:主动/被动和主动/主动配置
- High Availability Add-on 配置
- 特定于 Orchestrator 的配置
- Insights 特定配置
- Process Mining 特定配置
- Document Understanding 特定配置
- Automation Suite Robot 特定配置
- 监控配置
- 可选:配置代理服务器
- 可选:在多节点 HA 就绪生产集群中启用区域故障恢复
- 可选:传递自定义 resolv.conf
- 可选:提高容错能力
- install-uipath.sh 参数
- 添加具有 GPU 支持的专用代理节点
- 为 Task Mining 添加专用代理节点
- 连接 Task Mining 应用程序
- 为 Automation Suite Robot 添加专用代理节点
- Step 15: Configuring the temporary Docker registry for offline installations
- Step 16: Validating the prerequisites for the installation
- 手动:执行安装
- 安装后
- 集群管理
- 监控和警示
- 迁移和升级
- 特定于产品的配置
- 最佳实践和维护
- 故障排除
Kubernetes 资源警示
The Kube State Metrics collector is not able to collect metrics from the cluster without errors. This means important alerts may not fire. Contact UiPath® Support.
另请参阅:发布时的 Kube 状态指标。
kubectl describe
, and logs with kubectl logs
to see details on possible crashes. If the issue persists, contact UiPath® Support.
kubectl logs
检查 Pod 日志,以查看是否有任何进度指示。如果问题仍然存在,请联系 UiPath® 支持团队。
There has been an attempted update to a deployment or statefulset, but it has failed, and a rollback has not yet occurred. Contact UiPath® Support.
In high availability clusters with multiple replicas, this alert fires when the number of replicas is not optimal. This may occur when there are not enough resources in the cluster to schedule. Check resource utilization, and add capacity as necessary. Otherwise contact UiPath® Support.
An update to a statefulset has failed. Contact UiPath® Support.
另请参阅:有状态副本集。
守护程序集推出失败。请联系 UiPath® 支持团队。
另请参阅:守护程序集。
kubectl describe
of the pod for more information. The most common cause of waiting containers is a failure to pull the image. For air-gapped clusters, this could mean that the local registry is not available. If the issue persists, contact UiPath® Support.
这可能表明其中一个节点存在问题,检查每个节点的运行状况,并修复任何已知问题。否则,请联系 UiPath® 支持团队。
A job takes more than 12 hours to complete. This is not expected. Contact UiPath® Support.
A job has failed; however, most jobs are retried automatically. If the issue persists, contact UiPath® Support.
The autoscaler cannot scale the targeted resource as configured. If desired is higher than actual, then there may be a lack of resources. If desired is lower than actual, pods may be stuck while shutting down. If the issue persists, contact UiPath® Support.
另请参阅:水平 Pod 自动调节
给定服务的副本数量已达到最大值。当对集群发出的请求数量非常多时,就会发生这种情况。如果预计会有暂时的高流量,您可以静默此警示。但是,此警示表示集群已满,无法处理更多流量。如果集群上有更多资源容量可用,您可以按照以下说明增加服务的最大副本数:
# Find the horizontal autoscaler that controls the replicas of the desired resource
kubectl get hpa -A
# Increase the number of max replicas of the desired resource, replacing <namespace> <resource> and <maxReplicas>
kubectl -n <namespace> patch hpa <resource> --patch '{"spec":{"maxReplicas":<maxReplicas>}}'
# Find the horizontal autoscaler that controls the replicas of the desired resource
kubectl get hpa -A
# Increase the number of max replicas of the desired resource, replacing <namespace> <resource> and <maxReplicas>
kubectl -n <namespace> patch hpa <resource> --patch '{"spec":{"maxReplicas":<maxReplicas>}}'
另请参阅:水平 Pod 自动调节。
这些警告表明集群不能容忍节点故障。对于单节点评估集群,这是已知的,并且系统可能会静默这些警示。对于多节点 HA 就绪生产设置,当太多节点运行状况不佳而无法支持高可用性时,将触发这些警示,并指示应将节点恢复正常状态或进行更换。
KubeCPUQuotaOvercommit, KubeMemoryQuotaOvercommit, KubeQuotaAlmostFull, KubeQuotaFullyUsed, KubeQuotaExceeded
这些警示与通过自定义添加的命名空间资源配额有关,这些配额仅存在于集群中。命名空间资源配额不会作为 Automation Suite 安装的一部分添加。
另请参阅:资源配额。
Indicates problems with the Kubernetes control plane. Check the health of master nodes, resolve any outstanding issues, and contact UiPath® Support if the issues persist.
另请参阅:
此警示表示 etcd 集群的成员数量不足。 请注意,集群必须具有奇数个成员。 此警示的严重性非常严重。
确保集群中有奇数个服务器节点,并且所有节点都正常运行。
- k8s. rules、kube-apiserver-availability. rules、kube-apiserver-slos
- KubeAPIErrorBudgetBurn
- kube-state-metrics
- KubeStateMetricsListErrors, KubeStateMetricsWatchErrors
- KubernetesMemoryPressure
- kubernetes-apps
- KubePodCrashLooping
- KubePodNotReady
- KubeDeploymentGenerationMismatch, KubeStatefulSetGenerationMismatch
- KubeDeploymentReplicasMismatch, KubeStatefulSetReplicasMismatch
- KubeStatefulSetUpdateNotRolledOut
- KubeDaemonSetRolloutStuck
- KubeContainerWaiting
- KubeDaemonSetNotScheduled, KubeDaemonSetMisScheduled
- KubeJobCompletion
- KubeJobFailed
- KubeHpaReplicasMismatch
- KubeHpaMaxedOut
- kubernetes-resources
- KubeCPUOvercommit, KubeMemoryOvercommit
- KubeCPUQuotaOvercommit, KubeMemoryQuotaOvercommit, KubeQuotaAlmostFull, KubeQuotaFullyUsed, KubeQuotaExceeded
- AggregatedAPIErrors, AggregatedAPIDown, KubeAPIDown, KubeAPITerminatedRequests
- kubernetes-system-kubelet
- KubeNodeNotReady, KubeNodeUnreachable, KubeNodeReadinessFlapping, KubeletPlegDurationHigh, KubeletPodStartUpLatencyHigh, KubeletDown
- KubeletTooManyPods
- kubernetes-system
- KubeVersionMismatch
- KubeClientErrors
- etdc 警示
- EtcdInsufficientMembers
- EtcdNoLeader
- EtcdHighNumberOfLeaderChanges
- EtcdHighNumberOfFailedGrpcRequests
- EtcdGrpcRequestsSlow
- EtcdHighNumberOfFailedHttpRequests
- EtcdHttpRequestsSlow
- EtcdMemberCommunicationSlow
- EtcdHighNumberOfFailedProposals
- EtcdHighFsyncDurations
- EtcdHighCommitDurations
- kube-api
- KubernetesApiServerErrors