Troubleshooting

First Response Commands

kubectl get pods -n dreambau
kubectl get ingress -n dreambau
kubectl get certificate -n dreambau
kubectl get pvc -n dreambau

Common Failure Modes

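  • ErrImagePull / ErrImageNeverPull: the image cannot be pulled, or is missing from the node while imagePullPolicy is Never
  • CrashLoopBackOff: the container starts, exits, and is restarted repeatedly
  • Certificate not ready
  • PVC stuck in Pending
  • Database dependency unreachable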
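
As a rough sketch, each failure mode above can be mapped to a first diagnostic command. The helper name next_step, its short symptom keys, and the placeholder resource names are hypothetical. The certificate case assumes a Certificate resource type is installed in the cluster (as `kubectl get certificate` implies), and the database case assumes the database is reached through a Service in the dreambau namespace.

```shell
# Hypothetical helper: print a suggested first command for a given symptom.
# It only prints text; it does not contact the cluster.
next_step() {
  case "$1" in
    ErrImagePull|ErrImageNeverPull)
      # Image name, pull policy, and pull-secret problems appear in the pod events.
      echo 'kubectl describe pod <pod> -n dreambau' ;;
    CrashLoopBackOff)
      # Logs of the previously terminated container usually hold the crash reason.
      echo 'kubectl logs <pod> -n dreambau --previous' ;;
    certificate)
      echo 'kubectl describe certificate <certificate> -n dreambau' ;;
    pvc)
      echo 'kubectl describe pvc <pvc> -n dreambau' ;;
    database)
      # Assumption: the database is exposed as a Service in this namespace.
      echo 'kubectl get endpoints -n dreambau' ;;
    *)
      # Fallback: recent namespace events, oldest first.
      echo 'kubectl get events -n dreambau --sort-by=.lastTimestamp' ;;
  esac
}
```

For example, `next_step CrashLoopBackOff` prints the logs command to run next, with <pod> left for the operator to fill in.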

Pod-Level Investigation

kubectl describe pod <pod> -n dreambau              # status, conditions, and recent events
kubectl logs <pod> -n dreambau --previous           # logs from the previously terminated container
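
Events for a single pod can also be listed on their own, which helps when the describe output is long. The sketch below is one way to do that; the function name pod_events and the optional namespace argument are assumptions made for illustration.

```shell
# Sketch: list the events recorded for one pod, oldest first.
# $1 = pod name, $2 = namespace (optional, defaults to dreambau).
pod_events() {
  kubectl get events -n "${2:-dreambau}" \
    --field-selector "involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}
```

Usage: `pod_events <pod>` with the name taken from `kubectl get pods -n dreambau`.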

Service Recovery Flow

  1. Identify failing resource and root dependency
  2. Capture logs and events before restart
  3. Apply minimal corrective change
  4. Restart only affected workload
  5. Re-validate user-facing flow
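
Steps 2 and 4 of the flow above can be sketched as two small shell functions. This is a minimal illustration, not a prescribed procedure: the function names and the output directory are hypothetical, and the restart assumes the affected workload is a Deployment (a StatefulSet would use statefulset/<name> instead).

```shell
# Step 2: save describe output, previous-container logs, and namespace events.
# $1 = pod name, $2 = directory to write into (created if missing).
capture_evidence() {
  mkdir -p "$2"
  kubectl describe pod "$1" -n dreambau > "$2/describe.txt"
  # --previous fails if the container has never restarted; keep going anyway.
  kubectl logs "$1" -n dreambau --previous > "$2/logs-previous.txt" || true
  kubectl get events -n dreambau --sort-by=.lastTimestamp > "$2/events.txt"
}

# Step 4: restart one workload and wait for the rollout to finish.
# $1 = Deployment name (assumption: the workload is a Deployment).
restart_workload() {
  kubectl rollout restart "deployment/$1" -n dreambau
  kubectl rollout status "deployment/$1" -n dreambau --timeout=120s
}
```

Running capture_evidence before restart_workload keeps the evidence that a restart would otherwise discard, and the saved files can feed the incident timeline.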

Recovery Guidelines

  • Fix the root cause instead of restarting repeatedly
  • Avoid forced changes without a backup
  • Record the incident timeline for the postmortem