
Delete evicted pods from all namespaces (also ImagePullBackOff and ErrImagePull)

source link: https://gist.github.com/psxvoid/71492191b7cb06260036c90ab30cc9a0

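The gist's own commands are not reproduced on this page. Based on the title and on the variants quoted in the comments below, a rough sketch of the kind of clean-up being discussed looks like this:

# delete all Evicted pods in every namespace (one kubectl call per pod)
kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 " --namespace=" $1}' | xargs -I '{}' bash -c 'kubectl delete pod {}'

# same idea for pods stuck in ImagePullBackOff or ErrImagePull
kubectl get pods --all-namespaces | grep -E 'ImagePullBackOff|ErrImagePull' | awk '{print $2 " --namespace=" $1}' | xargs -I '{}' bash -c 'kubectl delete pod {}'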

Thanks for these commands. I have an AKS cluster, and after it had been running for a long time I saw more than 50 pods in a Failed state, together with the error "The node was low on resource: [Disk Pressure]". Using these commands I deleted all the failed pods and the error is gone.
I would like to know how we can automate this in Kubernetes, so that the script runs once a month, or when the number of Failed pods gets very high, or when I receive the above error. Is there any way to do this?

Looking forward to your reply.

Thanks,
Binoy

Author

psxvoid commented on Aug 22, 2019

It may depend on the specifics of your cluster, but here are some options that may give you ideas:

  1. Run a script once per month with a CronJob (see Running Automated Tasks with a CronJob in the Kubernetes docs); a quick sketch follows this list
  2. Deploy your own custom service/monitoring pod
  3. Use an external solution. For example, for running jobs on particular events you can try Brigade
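For option 1, a rough one-liner (a sketch, not from the original reply) could create the whole CronJob in one go, assuming the wernight/kubectl image used later in this thread and a monthly schedule:

kubectl create cronjob delete-failed-pods --image=wernight/kubectl --schedule="0 0 1 * *" -- sh -c "kubectl get pods --all-namespaces --field-selector status.phase=Failed -o json | kubectl delete -f -"

The Failed-phase filter is the same one suggested further down in this thread and also matches Evicted pods. Note that kubectl create cronjob cannot set a serviceAccountName, so the default service account in that namespace must be allowed to list and delete pods (the RBAC manifest near the end of the thread shows one way to grant that).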

binoysankar commented on Oct 21, 2019

edited

Thanks for your reply. I have created a CronJob to delete all the Failed/Evicted pods, running every 59 minutes, as below.
However, in the "default" namespace more and more Jobs and pods keep getting created and are not deleted. How can I solve this issue?

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: delete-failed-pods
spec:
  schedule: "*/59 * * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: kubectl-runner
            image: wernight/kubectl
            command: ["sh", "-c", "kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 \" --namespace=\" $1}' | xargs kubectl delete pod --all"]
          restartPolicy: OnFailure
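If finished Jobs and their pods keep accumulating, one option beyond the history limits already set above is to add ttlSecondsAfterFinished under jobTemplate.spec. This is a sketch, not from the thread, and it assumes the cluster has the TTL-after-finished controller available (stable in newer Kubernetes, behind a feature gate on older clusters):

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: delete-failed-pods
spec:
  schedule: "*/59 * * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      # assumption: TTL-after-finished is enabled on this cluster; each finished
      # Job and its pod are garbage-collected five minutes after completion
      ttlSecondsAfterFinished: 300
      template:
        spec:
          containers:
          - name: kubectl-runner
            image: wernight/kubectl
            command: ["sh", "-c", "kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 \" --namespace=\" $1}' | xargs kubectl delete pod --all"]
          restartPolicy: OnFailure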

The "delete all evicted pods from all namespaces" command was not working for me on macOS.

I use this instead:

kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 " --namespace=" $1}' | xargs -I '{}' bash -c 'kubectl delete pods {}'
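A plain shell loop does the same thing without relying on xargs substitution behaviour, which differs between BSD and GNU xargs; just a sketch of the same clean-up:

kubectl get pods --all-namespaces | grep Evicted |
while read -r namespace pod _; do
  # first column is the namespace, second is the pod name
  kubectl delete pod "$pod" --namespace="$namespace"
done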

What should be the command to delete evicted pods from a given namespace?

@ntsh999 you can use the above command from @yeouchien. Just change --all-namespaces to -n your_namespace; note that without --all-namespaces there is no NAMESPACE column, so the pod name is in $1:

kubectl get pods -n default | grep Evicted | awk '{print $1}' | xargs -I '{}' bash -c 'kubectl delete pods -n default {}'

@ntsh999 you can use the command below. It will delete both Evicted and Failed pods, since evicted pods are reported with status.phase=Failed:

kubectl get pods --namespace <your_namespace> --field-selector 'status.phase==Failed' -o json | kubectl delete -f -
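The same approach works cluster-wide, because the JSON output of kubectl get keeps each pod's namespace (a sketch, not part of the original comment):

kubectl get pods --all-namespaces --field-selector 'status.phase==Failed' -o json | kubectl delete -f -

The manifest below goes one step further and bundles the clean-up into a CronJob with a dedicated ServiceAccount, ClusterRole, and ClusterRoleBinding, so that kubectl inside the pod is allowed to list and delete pods in every namespace: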
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-cronjob-runner
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cronjob-runner
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - watch
  - list
  - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cronjob-runner
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cronjob-runner
subjects:
- kind: ServiceAccount
  name: sa-cronjob-runner
  namespace: default
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: delete-failed-pods
  namespace: default
spec:
  concurrencyPolicy: Allow
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      template:
        metadata:
          creationTimestamp: null
        spec:
          serviceAccountName: sa-cronjob-runner
          containers:
          - command:
            - sh
            - -c
            - kubectl get pods --all-namespaces | grep Evicted | awk '{print $2 " --namespace=" $1}' | xargs kubectl delete pod --all
            image: wernight/kubectl
            imagePullPolicy: Always
            name: kubectl-runner
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: OnFailure
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
  schedule: '*/59 * * * *'
  successfulJobsHistoryLimit: 1
  suspend: false
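Assuming the manifest above is saved to a file, say cleanup-cronjob.yaml (the filename is arbitrary), it can be applied and given a trial run without waiting for the schedule:

kubectl apply -f cleanup-cronjob.yaml
kubectl get cronjob delete-failed-pods -n default
# trigger a one-off run from the CronJob template
kubectl create job manual-cleanup --from=cronjob/delete-failed-pods -n default
kubectl logs -n default job/manual-cleanup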
