Automation Suite on EKS/AKS Installation Guide
Last updated Sep 12, 2024
Troubleshooting
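After you install Automation Suite on AKS, checking the health of the Automation Suite Robots pod returns an unhealthy status: "[POD_UNHEALTY] Pod asrobots-miigrations-cvzfn in namespace uipath is in Failed status".
In rare cases, the database migrations for Orchestrator and Automation Suite Robots can run at the same time. When that happens, the Automation Suite Robots database migration fails. In Argo CD, you can see two migration pods: one healthy and one unhealthy.
You can resolve this issue by taking the following steps: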
- Create a file named velerosecrets.txt with the following content:
    AZURE_CLIENT_SECRET=<secretforserviceprincipal>
    AZURE_CLIENT_ID=<clientidforserviceprincipal>
    AZURE_TENANT_ID=<tenantidforserviceprincipal>
    AZURE_SUBSCRIPTION_ID=<subscriptionidforserviceprincipal>
    AZURE_CLOUD_NAME=AzureUSGovernmentCloud
    AZURE_RESOURCE_GROUP=<infraresourcegroupoftheakscluster>
- Encode the data in the velerosecrets.txt file as Base64:
    export b64velerodata=$(cat velerosecrets.txt | base64)
- Update the velero-azure secret in the velero namespace, as shown in the following example:
    apiVersion: v1
    kind: Secret
    metadata:
      name: velero-azure
      namespace: velero
    data:
      cloud: <insert the $b64velerodata value here>
- Restart the velero deployment:
    kubectl rollout restart deploy -n velero
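The file creation, encoding, and manifest steps above can be scripted. The following is a minimal sketch, assuming a POSIX shell. The output file name velero-azure-secret.yaml is an illustrative choice and does not come from the product. The script strips newlines from the Base64 output because some base64 implementations wrap long lines, which would split the single-line cloud value.

```shell
#!/bin/sh
# Sketch only: the file name velero-azure-secret.yaml is an assumption.
set -eu

# Placeholder credentials file; replace the <...> values before real use.
cat > velerosecrets.txt <<'EOF'
AZURE_CLIENT_SECRET=<secretforserviceprincipal>
AZURE_CLIENT_ID=<clientidforserviceprincipal>
AZURE_TENANT_ID=<tenantidforserviceprincipal>
AZURE_SUBSCRIPTION_ID=<subscriptionidforserviceprincipal>
AZURE_CLOUD_NAME=AzureUSGovernmentCloud
AZURE_RESOURCE_GROUP=<infraresourcegroupoftheakscluster>
EOF

# Encode as a single line; tr removes any line wrapping added by base64.
b64velerodata=$(base64 < velerosecrets.txt | tr -d '\n')

# Render the Secret manifest with the encoded value filled in.
cat > velero-azure-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: velero-azure
  namespace: velero
data:
  cloud: ${b64velerodata}
EOF

echo "wrote velero-azure-secret.yaml"
```

The rendered file can then be applied with kubectl apply -f velero-azure-secret.yaml before restarting the velero deployment.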
To resolve this issue, create a network policy that allows traffic from the cluster CIDR or 0.0.0.0/0 into the admctl webhook.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-all-ingress-to-admctl
  namespace: uipath
spec:
  podSelector:
    matchLabels:
      app: admctl-webhook
  ingress:
    - from:
        - ipBlock:
            # Use the cluster pod CIDR, or "0.0.0.0/0" to allow all sources.
            cidr: <cluster-pod-cidr>
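The policy above takes a single variable, the source CIDR. The following is a minimal sketch of rendering it from a shell function; render_admctl_policy, the output file name, and the example CIDR 10.244.0.0/16 are hypothetical and only illustrate the substitution.

```shell
#!/bin/sh
# Sketch only: render_admctl_policy is a hypothetical helper name.
set -eu

# Print the NetworkPolicy with the given source CIDR filled in.
render_admctl_policy() {
  cidr=$1
  # Basic shape check (a.b.c.d/n); this does not validate octet ranges.
  if ! printf '%s\n' "$cidr" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}/[0-9]{1,2}$'; then
    echo "not an IPv4 CIDR: $cidr" >&2
    return 1
  fi
  cat <<EOF
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-all-ingress-to-admctl
  namespace: uipath
spec:
  podSelector:
    matchLabels:
      app: admctl-webhook
  ingress:
    - from:
        - ipBlock:
            cidr: ${cidr}
EOF
}

# 10.244.0.0/16 is an example value; use your cluster pod CIDR or 0.0.0.0/0.
render_admctl_policy "10.244.0.0/16" > admctl-networkpolicy.yaml
```

The resulting file can be applied with kubectl apply -f admctl-networkpolicy.yaml.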
Pods cannot communicate with the FQDN in a proxy environment, and the following error is displayed:
System.Net.Http.HttpRequestException: The proxy tunnel request to proxy 'http://<proxyFQDN>:8080/' failed with status code '404'.
To resolve this issue, you must create a ServiceEntry, as shown in the following example:
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: proxy
  namespace: uipath
spec:
  hosts:
    - <proxy-host>
  addresses:
    - <proxy-ip>/32
  ports:
    - number: <proxy-port>
      name: tcp
      protocol: TCP
  location: MESH_EXTERNAL
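The three placeholders in the ServiceEntry correspond to the proxy host, IP address, and port. The following is a minimal sketch that derives the host and port from a proxy URL and takes the IP address as a second argument. The helper name render_proxy_service_entry, the output file name, and the example values are hypothetical.

```shell
#!/bin/sh
# Sketch only: render_proxy_service_entry is a hypothetical helper name.
set -eu

# Usage: render_proxy_service_entry <proxy-url> <proxy-ip>
render_proxy_service_entry() {
  url=$1
  ip=$2
  hostport=${url#*://}        # drop the scheme, e.g. "http://"
  hostport=${hostport%%/*}    # drop any trailing path
  host=${hostport%%:*}        # text before the first colon
  port=${hostport##*:}        # text after the last colon
  if [ "$host" = "$hostport" ]; then
    echo "proxy URL must include a port: $url" >&2
    return 1
  fi
  cat <<EOF
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: proxy
  namespace: uipath
spec:
  hosts:
    - ${host}
  addresses:
    - ${ip}/32
  ports:
    - number: ${port}
      name: tcp
      protocol: TCP
  location: MESH_EXTERNAL
EOF
}

# Example values only; substitute your proxy FQDN, port, and IP address.
render_proxy_service_entry "http://proxy.example.com:8080/" "10.0.0.5" > proxy-serviceentry.yaml
```

The resulting file can be applied with kubectl apply -f proxy-serviceentry.yaml.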
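When you use Azure Files with the NFS protocol, the failure occurs mainly on FIPS-enabled nodes.
During the installation of Automation Suite on AKS, creating the asrobots-pvc-package-cache PVC for Automation Suite Robots fails.
This happens because the AKS cluster cannot connect to Azure Files.
For example, the following error message may be displayed: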
failed to provision volume with StorageClass "azurefile-csi-nfs": rpc error: code = Internal desc = update service endpoints failed with error: failed to get the subnet ci-asaks4421698 under vnet ci-asaks4421698: &{false 403 0001-01-01 00:00:00 +0000 UTC {"error":{"code":"AuthorizationFailed","message":"The client '4c200854-2a79-4893-9432-3111795beea0' with object id '4c200854-2a79-4893-9432-3111795beea0' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope '/subscriptions/64fdac10-935b-40e6-bf28-f7dc093f7f76/resourceGroups/ci-asaks4421698/providers/Microsoft.Network/virtualNetworks/ci-asaks4421698/subnets/ci-asaks4421698' or the scope is invalid. If access was recently granted, please refresh your credentials."}}}
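The error message names the identity that was denied, the action it attempted, and the resource scope. The following is a minimal sketch for pulling those three fields out of such a message so they are easier to read. The helper name summarize_authz_error is hypothetical, and the sketch assumes the message keeps the wording shown above.

```shell
#!/bin/sh
# Sketch only: summarize_authz_error is a hypothetical helper name.
set -eu

# Read a provisioning error on stdin and print the client ID, the denied
# action, and the scope, one per line.
summarize_authz_error() {
  msg=$(cat)
  printf 'client=%s\n' "$(printf '%s\n' "$msg" | sed -n "s/.*The client '\([^']*\)'.*/\1/p")"
  printf 'action=%s\n' "$(printf '%s\n' "$msg" | sed -n "s/.*perform action '\([^']*\)'.*/\1/p")"
  printf 'scope=%s\n' "$(printf '%s\n' "$msg" | sed -n "s/.*over scope '\([^']*\)'.*/\1/p")"
}

# Excerpt of the message shown above, used here as sample input.
summarize_authz_error <<'EOF'
The client '4c200854-2a79-4893-9432-3111795beea0' with object id '4c200854-2a79-4893-9432-3111795beea0' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope '/subscriptions/64fdac10-935b-40e6-bf28-f7dc093f7f76/resourceGroups/ci-asaks4421698/providers/Microsoft.Network/virtualNetworks/ci-asaks4421698/subnets/ci-asaks4421698' or the scope is invalid.
EOF
```

In this example, the denied action is a read on the subnet of the cluster's virtual network, performed by the client ID listed first.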
When upgrading from 2023.4.3 to 2023.10, you run into issues provisioning AI Center.
The system displays the following exception, and tenant creation fails:
"exception":"sun.security.pkcs11.wrapper.PKCS11Exception: CKR_KEY_SIZE_RANGE
You can resolve this issue by taking the following steps:
- Capture the existing coredns config map from the running cluster:
    kubectl get configmap -n kube-system coredns -o yaml > coredns-config.yaml
- Edit the coredns-config.yaml file to append the fqdn rewrite to the configuration:
  - Rename the config map to coredns-custom.
  - Add the following code block to the coredns-config.yaml file. Make sure the code block comes before the kubernetes cluster.local in-addr.arpa ip6.arpa line.
      rewrite stop {
        name exact <cluster-fqdn> istio-ingressgateway.istio-system.svc.cluster.local
      }
  - Replace <cluster-fqdn> with the actual value.
  After these edits, the file should resemble the following example:
    apiVersion: v1
    data:
      Corefile: |
        .:53 {
            errors
            log
            health
            rewrite stop {
              name exact mycluster.autosuite.com istio-ingressgateway.istio-system.svc.cluster.local
            }
            kubernetes cluster.local in-addr.arpa ip6.arpa {
              pods insecure
              fallthrough in-addr.arpa ip6.arpa
            }
            prometheus :9153
            forward . /etc/resolv.conf
            cache 30
            loop
            reload
            loadbalance
        }
    kind: ConfigMap
    metadata:
      name: coredns-custom
      namespace: kube-system
- Create the coredns-custom config map:
    kubectl apply -f coredns-config.yaml
- In the coredns deployment in the kube-system namespace, replace the volume reference to coredns with coredns-custom:
    volumes:
      - emptyDir: {}
        name: tmp
      - configMap:
          defaultMode: 420
          items:
            - key: Corefile
              path: Corefile
          name: coredns-custom
        name: config-volume
- Restart the coredns deployment and make sure the coredns pods are up and running:
    kubectl rollout restart deployment -n kube-system coredns
You should now be able to launch Automation Hub and Apps.
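The rewrite block has to land before the kubernetes cluster.local line, which is easy to get wrong by hand. The following is a minimal sketch that inserts it with awk and reuses the indentation of the kubernetes line. The helper name add_fqdn_rewrite and the file names Corefile.sample and Corefile.new are hypothetical, and the sample Corefile is shortened for illustration.

```shell
#!/bin/sh
# Sketch only: add_fqdn_rewrite is a hypothetical helper name.
set -eu

# Usage: add_fqdn_rewrite <cluster-fqdn> < input > output
# Inserts the rewrite block before the first "kubernetes cluster.local"
# line, reusing that line's indentation.
add_fqdn_rewrite() {
  awk -v fqdn="$1" '
    !done && /^[ \t]*kubernetes cluster\.local/ {
      indent = $0
      sub(/kubernetes.*/, "", indent)
      print indent "rewrite stop {"
      print indent "  name exact " fqdn " istio-ingressgateway.istio-system.svc.cluster.local"
      print indent "}"
      done = 1
    }
    { print }
  '
}

# Shortened sample Corefile used only to demonstrate the insertion.
cat > Corefile.sample <<'EOF'
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
      pods insecure
      fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
}
EOF

add_fqdn_rewrite "mycluster.autosuite.com" < Corefile.sample > Corefile.new
cat Corefile.new
```

Because the match tolerates leading whitespace, the same function can also be run against coredns-config.yaml, where the Corefile is embedded as an indented block. The config map still has to be renamed to coredns-custom separately.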
To resolve this issue, take the following steps:
- Make sure Helm 3.14 is running on the jumpbox or laptop used to install Automation Suite.
- Extract the configuration values of the failed Helm chart, in this case Velero:
    helm -n velero get values velero > customvals.yaml
- Add the missing image pull secret to the customvals.yaml file, under the .image.imagePullSecrets path:
    image:
      imagePullSecrets:
        - uipathpullsecret
- If Velero is already installed, uninstall it:
    helm uninstall -n velero velero
- Create a new file named velerosecrets.txt. Populate it with your specific information, as shown in the following example:
    AZURE_CLIENT_SECRET=<secretforserviceprincipal>
    AZURE_CLIENT_ID=<clientidforserviceprincipal>
    AZURE_TENANT_ID=<tenantidforserviceprincipal>
    AZURE_SUBSCRIPTION_ID=<subscriptionidforserviceprincipal>
    AZURE_CLOUD_NAME=AzurePublicCloud
    AZURE_RESOURCE_GROUP=<infraresourcegroupoftheakscluster>
- Encode the velerosecrets.txt file:
    export b64velerodata=$(cat velerosecrets.txt | base64)
- Delete the existing velero-azure secret from the velero namespace, then recreate it with the following content:
    apiVersion: v1
    kind: Secret
    metadata:
      name: velero-azure
      namespace: velero
    data:
      cloud: <put the $b64velerodata value here>
- Reinstall Velero:
    helm install velero -n velero <path to velero - 3.1.6 helm chart tgz> -f customvals.yaml
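The first step above asks for Helm 3.14 on the machine running the installation. The following is a minimal sketch of a version check. The helper name is_helm_3_14 is hypothetical, and treating any 3.14.x patch release as acceptable is an assumption, since the steps do not name a patch version.

```shell
#!/bin/sh
# Sketch only: is_helm_3_14 is a hypothetical helper name.
set -eu

# Succeeds when a version string such as "v3.14.2+gc309b6f" is a 3.14.x
# release. Accepting every 3.14.x patch level is an assumption.
is_helm_3_14() {
  case ${1#v} in
    3.14|3.14.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the local client only if helm is on the PATH.
if command -v helm >/dev/null 2>&1; then
  if is_helm_3_14 "$(helm version --short)"; then
    echo "helm 3.14 detected"
  else
    echo "helm is installed but is not 3.14" >&2
  fi
fi
```

Running the check before helm uninstall avoids removing Velero on a machine that cannot reinstall it with the expected client.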