Manual: Migrating Ceph Data Pool From Replicated to Erasure-coded Type
Step 1: Selecting a path for Ceph objects

Select a path where the exported Ceph data pool can be staged temporarily, and make sure it has enough free disk space to hold all objects in the cluster. The examples in this section use the /ceph-data path on the server0 Kubernetes node; adjust both values to your environment (subsequent commands assume the node is named server0).

export ROOK_CEPH_EXPORT_PATH="/ceph-data"
To view the storage space used by objects in the Ceph cluster, run the following commands:

ceph_objects_bytes=$(kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status --format json | jq -r '.pgmap.data_bytes')
numfmt --to=iec-i $ceph_objects_bytes
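Before exporting, it can help to compare that figure with the free space available at the export path. The check below is an illustrative addition, not part of the original procedure:

# Illustrative sanity check: the export path should have at least as much
# free space as the Ceph objects currently occupy.
echo "Ceph objects use: $(numfmt --to=iec-i ${ceph_objects_bytes})"
df -h "${ROOK_CEPH_EXPORT_PATH}"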
Step 2: Preparing the Ceph tools to use the path

To prepare the Ceph tools to use the path selected in Step 1, take the following steps (a short verification sketch follows the list):

- Disable ArgoCD self-heal for the rook-ceph-operator, rook-ceph-object-store, and fabric-installer applications:

kubectl -n argocd patch application rook-ceph-operator --type=json -p '[{"op":"replace","path":"/spec/syncPolicy/automated/selfHeal","value":false}]'
kubectl -n argocd patch application rook-ceph-object-store --type=json -p '[{"op":"replace","path":"/spec/syncPolicy/automated/selfHeal","value":false}]'
kubectl -n argocd patch application fabric-installer --type=json -p '[{"op":"replace","path":"/spec/syncPolicy/automated/selfHeal","value":false}]'

- Edit the Ceph tools deployment to mount ${ROOK_CEPH_EXPORT_PATH} from the server0 Kubernetes node:

kubectl -n rook-ceph patch deploy rook-ceph-tools --type='json' -p='[{"op": "add", "path":"/spec/template/spec/nodeName", "value": "server0"},{"op": "add", "path":"/spec/template/spec/volumes/2", "value": {"name":"ceph-export", "hostPath": {"path": "'${ROOK_CEPH_EXPORT_PATH}'", "type":"Directory"} }}, {"op":"add", "path": "/spec/template/spec/containers/0/volumeMounts/2", "value": {"name": "ceph-export", "mountPath": "'${ROOK_CEPH_EXPORT_PATH}'"}},{"op": "remove", "path": "/spec/template/spec/containers/0/resources/limits"}]'
kubectl -n rook-ceph rollout status deploy rook-ceph-tools

- Allow the tools pod to write inside ${ROOK_CEPH_EXPORT_PATH}:

chmod 777 ${ROOK_CEPH_EXPORT_PATH}
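The following verification sketch uses assumed commands (a jsonpath query and an in-pod df call that are not part of the original guide) to confirm the preparation took effect:

# Self-heal should now be disabled on the patched applications ...
kubectl -n argocd get application rook-ceph-object-store -o jsonpath='{.spec.syncPolicy.automated.selfHeal}'   # expect: false
# ... and the export path should be mounted inside the tools pod.
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- df -h "${ROOK_CEPH_EXPORT_PATH}"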
Step 3: Blocking access to the rook-ceph namespace from other namespaces

To block traffic from other namespaces to the rook-ceph namespace, take the following steps (an optional check follows):

- Block traffic coming towards the rook-ceph namespace from any namespace other than the rook-ceph namespace itself:

kubectl apply -f - <<EOF
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  namespace: rook-ceph
  name: block-rook-ceph-from-other-ns
spec:
  podSelector:
    matchLabels:
  ingress:
  - from:
    - podSelector: {}
EOF

- Restart the RGW deployments to close connections already established from other namespaces:

for rgw_deploy in $(kubectl -n rook-ceph get deploy -l app=rook-ceph-rgw -o name);do
  kubectl -n rook-ceph rollout restart "${rgw_deploy}"
  kubectl -n rook-ceph rollout status "${rgw_deploy}"
done
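Optionally, confirm that the policy is in place and that the RGW pods came back up after the restart. These checks are an illustrative addition:

# Illustrative checks, not part of the original steps.
kubectl -n rook-ceph get networkpolicy block-rook-ceph-from-other-ns
kubectl -n rook-ceph get pods -l app=rook-ceph-rgw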
Step 4: Viewing the cluster object count

To view the cluster object count, run the following commands:

BEFORE_MIGRATION_DATA_POOL_OBJECT_COUNT=$(kubectl -n rook-ceph exec deploy/rook-ceph-tools -- rados df --format json | jq -r --arg poolName "rook-ceph.rgw.buckets.data" '.pools[] | select(.name==$poolName).num_objects')
echo "BEFORE_MIGRATION_DATA_POOL_OBJECT_COUNT=${BEFORE_MIGRATION_DATA_POOL_OBJECT_COUNT}"

Re-check the object count after the migration (Step 9) to confirm that no data has been lost.
Step 5: Exporting the Ceph data pool

To export the Ceph data pool, run the following commands:

nohup kubectl -n rook-ceph exec deploy/rook-ceph-tools -- rados -p 'rook-ceph.rgw.buckets.data' export --workers 5 ${ROOK_CEPH_EXPORT_PATH}/ceph-data-pool >> /tmp/ceph-data-pool-export.log 2>&1 &
wait $!
if [[ $? -eq 0 && -f ${ROOK_CEPH_EXPORT_PATH}/ceph-data-pool ]]; then
  echo "Export ran successfully"
else
  echo "Error while running export"
fi
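As an additional sanity check (not part of the original steps), you can inspect the export artifact and its log; the file size should be of the same order of magnitude as the usage reported in Step 1:

# Illustrative post-export checks.
ls -lh "${ROOK_CEPH_EXPORT_PATH}/ceph-data-pool"
tail -n 20 /tmp/ceph-data-pool-export.log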
Step 6: Scaling down the rook-ceph operator

To scale down the rook-ceph operator, run the following command:

kubectl -n rook-ceph scale --replicas=0 deployment/rook-ceph-operator
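Before deleting the pool in the next step, it is worth confirming that no operator pod is still running. The checks below are assumed additions:

# Assumed checks: the operator deployment should report 0/0 ready replicas,
# and no operator pod should remain.
kubectl -n rook-ceph get deploy rook-ceph-operator
kubectl -n rook-ceph get pods -l app=rook-ceph-operator   # expect: No resources found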
Step 7: Recreating the erasure-coded pool

To remove the replicated data pool and recreate it as an erasure-coded pool, run the following commands:

kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd pool rm rook-ceph.rgw.buckets.data rook-ceph.rgw.buckets.data --yes-i-really-really-mean-it --yes-i-really-really-mean-it-not-faking
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd crush rule rm rook-ceph.rgw.buckets.data
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd erasure-code-profile set rook-ceph_ecprofile k=2 m=1 crush-failure-domain=host
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd pool create rook-ceph.rgw.buckets.data erasure rook-ceph_ecprofile
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd pool set rook-ceph.rgw.buckets.data compression_mode none
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd pool application enable rook-ceph.rgw.buckets.data rook-ceph-rgw --yes-i-really-mean-it
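To confirm that the recreated pool is actually erasure-coded, you can query its erasure-code profile. This check is an illustrative addition:

# Illustrative check: the pool should reference the profile created above.
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd pool get rook-ceph.rgw.buckets.data erasure_code_profile   # expect: rook-ceph_ecprofile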
Step 8: Importing data into the data pool

To import the data into the new data pool, run the following commands:

nohup kubectl -n rook-ceph exec deploy/rook-ceph-tools -- rados -p 'rook-ceph.rgw.buckets.data' import --workers 5 ${ROOK_CEPH_EXPORT_PATH}/ceph-data-pool >> /tmp/ceph-data-pool-import.log 2>&1 &
wait $!
if [[ $? -eq 0 ]]; then
  echo "Import ran successfully"
else
  echo "Error while running import"
fi
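Before the formal object-count verification in the next step, a quick look at the import log (an illustrative addition) can surface errors early:

tail -n 20 /tmp/ceph-data-pool-import.log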
Step 9: Verifying the loaded data

To verify the loaded data, run the following commands:

try=120
return_code=1
for index in $(seq 0 "${try}"); do
  AFTER_MIGRATION_DATA_POOL_OBJECT_COUNT=$(kubectl -n rook-ceph exec deploy/rook-ceph-tools -- rados df --format json | jq -r --arg poolName "rook-ceph.rgw.buckets.data" '.pools[] | select(.name==$poolName).num_objects')
  if [[ $AFTER_MIGRATION_DATA_POOL_OBJECT_COUNT -eq $BEFORE_MIGRATION_DATA_POOL_OBJECT_COUNT ]]; then
    return_code=0
    break
  fi
  [[ $index -eq $try ]] || sleep 5
done
if [[ $return_code -eq 0 ]]; then
  echo "Found equal object count(${BEFORE_MIGRATION_DATA_POOL_OBJECT_COUNT})"
else
  echo "Found difference in object count for pool before(${BEFORE_MIGRATION_DATA_POOL_OBJECT_COUNT}) and after(${AFTER_MIGRATION_DATA_POOL_OBJECT_COUNT})"
  echo "Please raise a support ticket with UiPath to complete the migration"
fi
Step 10: Reverting temporary changes

To revert the temporary changes, run the following commands:

kubectl -n argocd patch application rook-ceph-operator --type=json -p '[{"op":"replace","path":"/spec/syncPolicy/automated/selfHeal","value":true}]'
kubectl -n argocd patch application rook-ceph-object-store --type=json -p '[{"op":"replace","path":"/spec/syncPolicy/automated/selfHeal","value":true}]'
kubectl -n argocd patch application fabric-installer --type=json -p '[{"op":"replace","path":"/spec/syncPolicy/automated/selfHeal","value":true}]'
kubectl -n rook-ceph scale --replicas=1 deployment/rook-ceph-operator
kubectl -n rook-ceph patch deploy rook-ceph-tools --type='json' -p='[{"op": "remove", "path":"/spec/template/spec/nodeName"},{"op": "remove", "path":"/spec/template/spec/volumes/2"}, {"op":"remove", "path": "/spec/template/spec/containers/0/volumeMounts/2"},{"op": "add", "path": "/spec/template/spec/containers/0/resources/limits", "value": {"memory": "256Mi"}}]'
kubectl -n rook-ceph delete NetworkPolicy block-rook-ceph-from-other-ns
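A short verification sketch, using assumed commands that are not part of the original guide, to confirm the revert took effect:

# The tools deployment should roll out again with its original settings ...
kubectl -n rook-ceph rollout status deploy rook-ceph-tools
# ... and the temporary NetworkPolicy should be gone.
kubectl -n rook-ceph get networkpolicy block-rook-ceph-from-other-ns 2>/dev/null || echo "NetworkPolicy removed"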
Step 11: Updating the ArgoCD configuration

You must now keep the ArgoCD configuration and the actual cluster state in sync. To do that, set the global.rook.dataPoolType parameter of the fabric-installer application to erasure-coded:

kubectl -n argocd get application fabric-installer -o json | jq 'if ([.spec.source.helm.parameters[].name] | index ("global.rook.dataPoolType")) == null then .spec.source.helm.parameters += [{"name": "global.rook.dataPoolType" , "value": "erasure-coded"}] else (.spec.source.helm.parameters[] | select(.name == "global.rook.dataPoolType").value) |= "erasure-coded" end' | kubectl apply -f -
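To confirm that the parameter was applied, you can read it back; the jq filter below is an illustrative addition:

kubectl -n argocd get application fabric-installer -o json | jq -r '.spec.source.helm.parameters[] | select(.name == "global.rook.dataPoolType").value'   # expect: erasure-coded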
- Step 1: Selecting a path for Ceph objects
- Step 2: Preparing the Ceph tools to use the path
- Step 3: Blocking access to the rook-ceph namespace from other namespaces
- Step 4: Viewing the cluster object count
- Step 5: Exporting the Ceph data pool
- Step 6: Scaling down the rook-ceph operator
- Step 7: Recreating the erasure-coded pool
- Step 8: Importing data into the data pool
- Step 9: Verifying the loaded data
- Step 10: Reverting temporary changes
- Step 11: Updating the ArgoCD configuration