Troubleshooting
First Response Commands
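A minimal sketch of a first pass, assuming `kubectl` is already pointed at the affected cluster. The namespace `my-app`, the `KUBECTL` and `NS` variables, and the `first_response` function name are placeholders for illustration, not names taken from this runbook.

```shell
# Placeholders (assumptions): set NS to the affected namespace.
# Set KUBECTL="echo kubectl" to print the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"
NS="${NS:-my-app}"

first_response() {
  # Pod phase, restart counts, and node placement at a glance
  $KUBECTL get pods -n "$NS" -o wide
  # Events sorted by timestamp, so the most recent appear last
  $KUBECTL get events -n "$NS" --sort-by=.lastTimestamp
  # Persistent volume claims, to spot any stuck in Pending
  $KUBECTL get pvc -n "$NS"
}
```

Run `NS=<namespace> first_response` after sourcing the snippet. All three commands are read-only, so they are safe to run before any diagnosis has been made.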
Common Failure Modes
- ErrImagePull / ErrImageNeverPull
- CrashLoopBackOff
- certificate not ready
- pending PVC
- database dependency unreachable
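Each failure mode above has a natural first check. The sketch below maps them to read-only `kubectl` commands. It rests on several assumptions: the `check_failure` function and its mode keywords (`certificate`, `pvc`, `database`) are hypothetical, the `certificate` resource exists only if a certificate controller such as cert-manager is installed, and the database is assumed to be reached through a Kubernetes Service.

```shell
# Placeholders (assumptions): NS is the affected namespace.
KUBECTL="${KUBECTL:-kubectl}"
NS="${NS:-my-app}"

# Usage: check_failure <mode> <resource-name>
check_failure() {
  case "$1" in
    ErrImagePull|ErrImageNeverPull)
      # The pod's events carry the registry or pull error
      $KUBECTL describe pod "$2" -n "$NS"
      # ErrImageNeverPull means imagePullPolicy is Never and the image
      # is not already present on the node
      $KUBECTL get pod "$2" -n "$NS" -o jsonpath='{.spec.containers[*].imagePullPolicy}'
      ;;
    CrashLoopBackOff)
      # Logs from the previous (crashed) container instance;
      # add -c <container> for multi-container pods
      $KUBECTL logs "$2" -n "$NS" --previous
      ;;
    certificate)
      # Assumes a Certificate custom resource (e.g. cert-manager)
      $KUBECTL describe certificate "$2" -n "$NS"
      ;;
    pvc)
      # Events explain why the claim is unbound
      $KUBECTL describe pvc "$2" -n "$NS"
      $KUBECTL get storageclass
      ;;
    database)
      # An empty endpoints list means no ready pod backs the Service
      $KUBECTL get endpoints "$2" -n "$NS"
      ;;
    *)
      echo "unknown failure mode: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `check_failure CrashLoopBackOff <pod-name>` prints the logs of the container instance that crashed rather than the one currently starting.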
Pod-Level Investigation
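A sketch of a typical pod-level pass, assuming a single suspect pod has already been identified. The `inspect_pod` function, the `my-app` namespace, and the `--tail` values are illustrative choices, not part of this runbook.

```shell
# Placeholders (assumptions): NS is the affected namespace.
KUBECTL="${KUBECTL:-kubectl}"
NS="${NS:-my-app}"

# Usage: inspect_pod <pod-name>
inspect_pod() {
  # Conditions, container states, and recent events for the pod
  $KUBECTL describe pod "$1" -n "$NS"
  # Last 100 log lines from every container in the pod
  $KUBECTL logs "$1" -n "$NS" --all-containers --tail=100
  # Logs from the previous container instance, if it restarted;
  # add -c <container> for multi-container pods
  $KUBECTL logs "$1" -n "$NS" --previous --tail=100
  # Restart count per container
  $KUBECTL get pod "$1" -n "$NS" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\n"}{end}'
}
```

The `--previous` call returns an error when the container has never restarted; that is expected and can be ignored.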
Service Recovery Flow
- Identify failing resource and root dependency
- Capture logs and events before restart
- Apply minimal corrective change
- Restart only affected workload
- Re-validate user-facing flow
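The capture and restart steps above can be sketched as a single function. This assumes the affected workload is a Deployment; the `capture_then_restart` name, the output directory layout, and the 120-second timeout are hypothetical choices. Applying the corrective change and re-validating the user-facing flow remain manual steps.

```shell
# Placeholders (assumptions): NS is the affected namespace.
KUBECTL="${KUBECTL:-kubectl}"
NS="${NS:-my-app}"

# Usage: capture_then_restart <deployment-name> [output-dir]
capture_then_restart() {
  deploy="$1"
  outdir="${2:-incident-$(date -u +%Y%m%dT%H%M%SZ)}"
  mkdir -p "$outdir" || return 1

  # Capture evidence first: the restart replaces the old pods,
  # and their logs go with them
  $KUBECTL get events -n "$NS" --sort-by=.lastTimestamp > "$outdir/events.txt"
  # Note: logs for deployment/<name> come from one pod of the Deployment
  $KUBECTL logs "deployment/$deploy" -n "$NS" --all-containers --tail=500 > "$outdir/logs.txt"

  # Restart only the affected workload, then wait for the rollout
  $KUBECTL rollout restart "deployment/$deploy" -n "$NS"
  $KUBECTL rollout status "deployment/$deploy" -n "$NS" --timeout=120s
}
```

Keeping the captured files in a timestamped directory also gives the postmortem a fixed record of cluster state at the moment of intervention.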
Recovery Guidelines
- Fix the root cause instead of restarting repeatedly
- Avoid forced changes, such as force-deleting resources, without a backup
- Record incident timeline for postmortem