Automation Suite
2023.4
Automation Suite on Linux Installation Guide
Last updated Apr 24, 2024

Pod fails to come up after node reboot due to filesystem corruption

Description

Occasionally, when the host is rebooted, the insights-insightslooker pod fails to come up because of a volume attachment issue. When this happens, the insights app gets stuck in the Progressing state.
If you check the insights-insightslooker pod in the ArgoCD UI, you should see an error message that names the affected volume.
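If you prefer the command line to the ArgoCD UI, the volume name can also be pulled out of the pod's event text. The following is a minimal sketch, not part of the official procedure: `extract_pv` is a hypothetical helper, and `<POD_NAME>` and `<NAMESPACE>` are placeholders you need to fill in for your cluster.

```shell
# Hypothetical helper: print every PV name (pvc-<UUID>) found on stdin,
# one per line, without duplicates.
extract_pv() {
  grep -oE 'pvc-[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' | sort -u
}

# Example usage (placeholders are assumptions, replace them before running):
# kubectl describe pod <POD_NAME> -n <NAMESPACE> | extract_pv
```

The helper only matches names of the form `pvc-` followed by a UUID, which is the format of the volume name used in this article.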

Solution

To fix the issue, take the following steps:

  1. Identify the faulty volume. Its name appears in the error message shown in the ArgoCD UI. In this example, it is pvc-5abe3c8f-7422-44da-9132-92be5641150a.
  2. Scale down the workload that uses the affected volume. Ensure that the volume is detached from the node. To check if the volume is detached, run the following command:

    kubectl get volumes.longhorn.io -n longhorn-system | grep <PV>
  3. Manually attach the faulty volume to any node from the Longhorn UI.

  4. Log in to the node and fix the device corresponding to that volume by running the following command:

    fsck.ext4 /dev/longhorn/<ERRORED_VOLUME>

  5. After repairing the faulty volume, detach it from the node. You can do this from the Longhorn UI.

  6. Scale up the workload.

  7. Wait for the pod to come up. It should start automatically and, after some time, become healthy.
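The command-line parts of the steps above can be sketched as a small script. This is an illustration under assumptions, not the documented procedure: `volume_state` and `wait_detached` are hypothetical helpers, the Longhorn Volume resource is assumed to expose its state at `.status.state`, and the workload is assumed to be a Deployment normally running one replica. Attaching, repairing, and detaching the volume still happen in the Longhorn UI and on the node, as described in steps 3 to 5.

```shell
# Print the Longhorn state of a volume.
# Assumption: the Volume resource reports its state in .status.state.
volume_state() {
  kubectl get volumes.longhorn.io "$1" -n longhorn-system \
    -o jsonpath='{.status.state}'
}

# Poll until the volume reports "detached".
# $1 = volume name, $2 = maximum number of attempts (default 30).
# Returns 0 once detached, 1 if the attempts run out.
wait_detached() {
  pv="$1"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if [ "$(volume_state "$pv")" = "detached" ]; then
      return 0
    fi
    sleep "${SLEEP_SECONDS:-5}"
    i=$((i + 1))
  done
  return 1
}

# Example flow (placeholders and replica count are assumptions):
# kubectl scale deployment <DEPLOYMENT> -n <NAMESPACE> --replicas=0   # step 2
# wait_detached <PV>                                                  # step 2
# ...attach in the Longhorn UI, run fsck.ext4, detach (steps 3 to 5)...
# kubectl scale deployment <DEPLOYMENT> -n <NAMESPACE> --replicas=1   # step 6
```

Polling instead of checking once avoids attaching the volume manually while it is still being released by the node.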
