Hands-on Lab
Manual Scheduling
Create a Pod
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
  nodeName: worker-1
EOF
Check the Pod status
kubectl get pod nginx
Running the command above produces output similar to the following.
NAME    READY   STATUS    RESTARTS   AGE
nginx   0/1     Pending   0          4s
Check the Pod status again after about a minute - because no node named worker-1 exists, the Pod bound to it is removed by the Pod garbage collector (see the links below)
kubectl get pod nginx
Review the official Kubernetes documentation - https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodename
Review the Pod garbage collector code - https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/podgc/gc_controller.go#L212
List the nodes
kubectl get node
Create a Pod with spec.nodeName set to the name of the first node returned by the command above
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
  nodeName: $(kubectl get node -o=jsonpath='{.items[0].metadata.name}')
EOF
Check the Pod status
kubectl get pod nginx
Create a Pod without specifying spec.nodeName
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: httpd
spec:
  containers:
  - image: httpd
    name: httpd
EOF
Check the Events of the Pod created without spec.nodeName
kubectl describe pod httpd
Check the Events of the Pod created with spec.nodeName specified
kubectl describe pod nginx
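For a direct comparison (an added check, not part of the original lab), filter for Scheduled events: the httpd Pod should have one emitted by default-scheduler, while the nginx Pod, bound through spec.nodeName, bypassed the scheduler and has none.
kubectl get events --field-selector reason=Scheduled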
Delete the Pods
kubectl delete pod nginx httpd
Node Selector
Create a Pod
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
  nodeSelector:
    env: dev
EOF
Check the Pod status
kubectl get pod nginx
Running the command above produces output similar to the following.
NAME    READY   STATUS    RESTARTS   AGE
nginx   0/1     Pending   0          16s
Use the command below to find out why the Pod is not being scheduled.
kubectl describe pod nginx
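The describe output should contain a FailedScheduling event, because no node carries the env=dev label yet. As a narrower check (an addition, not in the original lab), you can filter the event stream directly:
kubectl get events --field-selector involvedObject.name=nginx,reason=FailedScheduling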
Check the labels assigned to the nodes
kubectl get node --show-labels
Assign a label with key env and value dev to the second node
kubectl label node \
  $(kubectl get node -o=jsonpath='{.items[1].metadata.name}') env=dev
Verify that the label was assigned to the node
kubectl get node --show-labels | grep -E 'env=dev|$'
Check whether the previously Pending Pod has now been scheduled
kubectl get pod nginx
Check the Events that occurred on the Pod
kubectl describe pod nginx
Remove the label from the node
kubectl label node \
  $(kubectl get node -o=jsonpath='{.items[1].metadata.name}') env-
Verify that the label was removed from the node
kubectl get node --show-labels | grep -E 'env=dev|$'
Check the Pod status
kubectl get pod nginx
Delete the Pod
kubectl delete pod nginx
Node Affinity
Check the labels assigned to the nodes
kubectl get node --show-labels
Check each node's availability zone
kubectl get nodes --label-columns topology.kubernetes.io/zone
Create Pods
for i in {1..4}
do
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-${i}
  labels:
    app: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    resources:
      requests:
        cpu: 0.5
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - ap-northeast-2a
            - ap-northeast-2b
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - ap-northeast-2a
      - weight: 1
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - ap-northeast-2b
EOF
done
Check which nodes the Pods were scheduled on
kubectl get pod -l app=nginx -o wide
Check the nodes' availability zones
kubectl get nodes --label-columns topology.kubernetes.io/zone
Examine the procedure by which each Pod was scheduled onto its node
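One way to trace the decision (an added helper, not part of the original lab) is to read the Scheduled events, which record the node chosen for each Pod:
kubectl get events --field-selector reason=Scheduled \
  -o custom-columns=POD:.involvedObject.name,MESSAGE:.message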
Delete the Pods
kubectl delete pod -l app=nginx
Explain what the spec.affinity specified in the YAML below means, then create the Pods
for i in {1..4}
do
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-${i}
  labels:
    app: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    resources:
      requests:
        cpu: 0.5
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: NotIn
            values:
            - ap-northeast-2a
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: NotIn
            values:
            - ap-northeast-2b
EOF
done
Check which nodes the Pods were scheduled on
kubectl get pod -l app=nginx -o wide
Examine the procedure by which each Pod was scheduled onto its node
Delete the Pods
kubectl delete pod -l app=nginx
Pod Affinity
Create a Pod
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: mod-security
  labels:
    security: enhanced
    app: nginx
spec:
  containers:
  - name: modsecurity
    image: owasp/modsecurity:nginx-alpine
EOF
Explain what the spec.template.spec.affinity specified in the YAML below means, then create the Deployment
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: security
                operator: In
                values:
                - enhanced
            topologyKey: kubernetes.io/hostname
EOF
Check which nodes the Pods were scheduled on
kubectl get pod -l app=nginx -o wide
Delete the Pod
kubectl delete pod mod-security
The mod-security Pod has been deleted, so the affinity rule specified in the Deployment no longer holds. Explain what happens in this case.
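Because the rule is requiredDuringScheduling-IgnoredDuringExecution, the Pods that are already running stay where they are; the rule is only evaluated for new Pods. To observe this (an added experiment, not in the original lab), delete one nginx Pod and watch its replacement stay Pending:
kubectl delete pod $(kubectl get pod -l app=nginx -o=jsonpath='{.items[0].metadata.name}')
kubectl get pod -l app=nginx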
Delete the Deployment
kubectl delete deploy nginx
Explain what the spec.template.spec.affinity specified in the YAML below means, then create the Deployment
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  labels:
    app: cache
spec:
  selector:
    matchLabels:
      app: cache
  replicas: 4
  template:
    metadata:
      labels:
        app: cache
    spec:
      containers:
      - name: redis-server
        image: redis:3.2-alpine
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - cache
            topologyKey: kubernetes.io/hostname
EOF
Check which nodes the Pods were scheduled on
kubectl get pod -l app=cache -o wide
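The hard podAntiAffinity allows at most one cache Pod per node, so if the cluster has fewer than four nodes, one replica cannot be placed. A quick way to spot it (an added check, assuming such a node shortage):
kubectl get pod -l app=cache --field-selector=status.phase=Pending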
Explain what the spec.template.spec.affinity specified in the YAML below means, then create the Deployment
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - cache
            topologyKey: kubernetes.io/hostname
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx
            topologyKey: kubernetes.io/hostname
EOF
Check which nodes the Pods were scheduled on
kubectl get pod -l 'app in (nginx,cache)' -o wide
Delete the created resources
kubectl delete deploy -l 'app in (nginx,cache)'
Spread Pod Across Cluster
Create Pods with nodeSelector specified
for s in web api queue cache database
do
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: ${s}
  labels:
    app: myapp
spec:
  nodeSelector:
    topology.kubernetes.io/zone: ap-northeast-2a
  containers:
  - image: busybox
    name: busybox
    command: ["/bin/sleep"]
    args: ["infinity"]
EOF
done
Check which node each Pod was scheduled on
kubectl get pod -l app=myapp -o wide -A --sort-by=.spec.nodeName
Check the number of Pods running on each node
kubectl describe node | grep -E "(^Name:|^Non-terminated)"
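An alternative way to count Pods per node (an added one-liner, not in the original lab) is to print every Pod's nodeName and aggregate with sort and uniq:
kubectl get pod -A -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' | sort | uniq -c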
Create Pods without nodeSelector
for i in {1..5}
do
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-${i}
  labels:
    app: nginx
spec:
  containers:
  - image: nginx
    name: nginx
EOF
done
Check which node each Pod was scheduled on
kubectl get pod -l app=nginx -o wide --sort-by=.spec.nodeName
Check the number of Pods running on each node
kubectl describe node | grep -E "(^Name:|^Non-terminated)"
Delete the Pods created above without nodeSelector
kubectl delete pod -l app=nginx
Check which node each Pod was scheduled on
kubectl get pod -l app=myapp -o wide --sort-by=.spec.nodeName
Check the number of Pods running on each node
kubectl describe node | grep -E "(^Name:|^Non-terminated)"
Create a Deployment
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
EOF
Check which nodes the Pods created by the Deployment were scheduled on
kubectl get pod -l app=nginx -o wide --sort-by=.spec.nodeName
If you enable scheduler logging in the EKS console, you can confirm via CloudWatch that kube-scheduler runs with a configuration like the one below
apiVersion: kubescheduler.config.k8s.io/v1beta1
clientConnection:
  acceptContentTypes: ""
  burst: 100
  contentType: application/vnd.kubernetes.protobuf
  kubeconfig: /etc/kubernetes/scheduler.conf
  qps: 50
enableContentionProfiling: true
enableProfiling: true
healthzBindAddress: 0.0.0.0:10251
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
  leaseDuration: 15s
  renewDeadline: 10s
  resourceLock: leases
  resourceName: kube-scheduler
  resourceNamespace: kube-system
  retryPeriod: 2s
metricsBindAddress: 0.0.0.0:10251
parallelism: 16
percentageOfNodesToScore: 0
podInitialBackoffSeconds: 1
podMaxBackoffSeconds: 10
profiles:
- pluginConfig:
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: DefaultPreemptionArgs
      minCandidateNodesAbsolute: 100
      minCandidateNodesPercentage: 10
    name: DefaultPreemption
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      hardPodAffinityWeight: 1
      kind: InterPodAffinityArgs
    name: InterPodAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: NodeAffinityArgs
    name: NodeAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: NodeResourcesFitArgs
    name: NodeResourcesFit
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: NodeResourcesLeastAllocatedArgs
      resources:
      - name: cpu
        weight: 1
      - name: memory
        weight: 1
    name: NodeResourcesLeastAllocated
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      defaultingType: System
      kind: PodTopologySpreadArgs
    name: PodTopologySpread
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      bindTimeoutSeconds: 600
      kind: VolumeBindingArgs
    name: VolumeBinding
  plugins:
    bind:
      enabled:
      - name: DefaultBinder
        weight: 0
    filter:
      enabled:
      - name: NodeUnschedulable
        weight: 0
      - name: NodeName
        weight: 0
      - name: TaintToleration
        weight: 0
      - name: NodeAffinity
        weight: 0
      - name: NodePorts
        weight: 0
      - name: NodeResourcesFit
        weight: 0
      - name: VolumeRestrictions
        weight: 0
      - name: EBSLimits
        weight: 0
      - name: GCEPDLimits
        weight: 0
      - name: NodeVolumeLimits
        weight: 0
      - name: AzureDiskLimits
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: VolumeZone
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: InterPodAffinity
        weight: 0
    permit: {}
    postBind: {}
    postFilter:
      enabled:
      - name: DefaultPreemption
        weight: 0
    preBind:
      enabled:
      - name: VolumeBinding
        weight: 0
    preFilter:
      enabled:
      - name: NodeResourcesFit
        weight: 0
      - name: NodePorts
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: InterPodAffinity
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: NodeAffinity
        weight: 0
    preScore:
      enabled:
      - name: InterPodAffinity
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: TaintToleration
        weight: 0
      - name: NodeAffinity
        weight: 0
    queueSort:
      enabled:
      - name: PrioritySort
        weight: 0
    reserve:
      enabled:
      - name: VolumeBinding
        weight: 0
    score:
      enabled:
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: InterPodAffinity
        weight: 1
      - name: NodeResourcesLeastAllocated
        weight: 1
      - name: NodeAffinity
        weight: 1
      - name: NodePreferAvoidPods
        weight: 10000
      - name: PodTopologySpread
        weight: 2
      - name: TaintToleration
        weight: 1
  schedulerName: default-scheduler
Delete the Deployment
kubectl delete deploy nginx
Check the number of Pods running on each node
kubectl describe node | grep -E "(^Name:|^Non-terminated)"
Create Pods with podAntiAffinity specified
for i in {1..5}
do
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-${i}
  labels:
    app: nginx
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - nginx
          topologyKey: kubernetes.io/hostname
  containers:
  - image: nginx
    name: nginx
EOF
done
Check which nodes the Pods created above were scheduled on
kubectl get pod -l app=nginx -o wide --sort-by=.spec.nodeName
Delete the Pods created above
kubectl delete pod -l app=nginx
Create Pods with topologySpreadConstraints specified
for i in {1..5}
do
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-${i}
  labels:
    app: nginx
spec:
  topologySpreadConstraints:
  - maxSkew: 3
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: nginx
  containers:
  - image: nginx
    name: nginx
EOF
done
Check which nodes the Pods created above were scheduled on
kubectl get pod -l app=nginx -o wide --sort-by=.spec.nodeName
Delete the resources
kubectl delete pod -l 'app in (nginx,myapp)'
Requests & Limits
Create a Pod
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    resources:
      requests:
        cpu: 2
EOF
Check the Pod status
kubectl get pod nginx
Running the command above produces output similar to the following.
NAME    READY   STATUS    RESTARTS   AGE
nginx   0/1     Pending   0          8s
Use the command below to find out why the Pod is not being scheduled.
kubectl describe pod nginx
Check the nodes' detailed status values
kubectl get node -o=jsonpath='{.items[*].status}' | jq
Check what allocatable.cpu among the values above means - https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable
kubectl explain node.status.allocatable
Check the allocatable CPU per node
kubectl get node -o custom-columns=NAME:.metadata.name,ALLOCATABLE_CPU:.status.allocatable.cpu
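For context (an added comparison, not in the original lab), you can print each node's total capacity next to its allocatable value; the difference is what the kubelet reserves for system components:
kubectl get node -o custom-columns=NAME:.metadata.name,\
CAPACITY_CPU:.status.capacity.cpu,ALLOCATABLE_CPU:.status.allocatable.cpu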
Delete the Pending Pod
kubectl delete pod nginx
Create a Pod with the node's maximum allocatable CPU value set in spec.containers[].resources.limits.cpu - with no request specified, the request defaults to the limit
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    resources:
      limits:
        cpu: $(kubectl get node -o=jsonpath='{.items[0].status.allocatable.cpu}')
EOF
Check whether the Pod gets scheduled; if it does not, find out why
kubectl get pod nginx
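If the Pod stays Pending, it is typically because DaemonSet Pods (for example kube-proxy) already request a share of the node's allocatable CPU. One way to inspect the node's current requests (an added check, not in the original lab):
kubectl describe node $(kubectl get node -o=jsonpath='{.items[0].metadata.name}') \
  | grep -A 8 "Allocated resources"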
Delete the Pod
kubectl delete pod nginx
Assign a label with key env and value dev to one node
kubectl label node \
  $(kubectl get node -o=jsonpath='{.items[0].metadata.name}') env=dev
Verify that the label was assigned to the node
kubectl get node --show-labels | grep -E 'env=dev|$'
Create Pods
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: stress-1
  labels:
    app: stress
spec:
  containers:
  - image: lorel/docker-stress-ng
    name: stress-ng
    args:
    - --cpu
    - "2"
  nodeSelector:
    env: dev
---
apiVersion: v1
kind: Pod
metadata:
  name: stress-2
  labels:
    app: stress
spec:
  containers:
  - image: lorel/docker-stress-ng
    name: stress-ng
    args:
    - --cpu
    - "2"
    resources:
      requests:
        cpu: 0.5
      limits:
        cpu: 1
  nodeSelector:
    env: dev
---
apiVersion: v1
kind: Pod
metadata:
  name: stress-3
  labels:
    app: stress
spec:
  containers:
  - image: lorel/docker-stress-ng
    name: stress-ng
    args:
    - --cpu
    - "2"
    resources:
      limits:
        cpu: 1
  nodeSelector:
    env: dev
EOF
Verify that the Pods were created
kubectl get pod -l app=stress
Install the Metrics Server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Check the Pods' CPU usage
kubectl top pod -l app=stress --use-protocol-buffers
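To put the usage numbers next to what each Pod asked for (an added comparison, not in the original lab), print the requests and limits side by side; stress-1 has neither and can consume whole cores, while stress-2 and stress-3 are throttled at their 1-CPU limit:
kubectl get pod -l app=stress \
  -o custom-columns=NAME:.metadata.name,\
CPU_REQUEST:.spec.containers[0].resources.requests.cpu,\
CPU_LIMIT:.spec.containers[0].resources.limits.cpu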
Check the allocatable memory per node
kubectl get node \
  -o custom-columns=NAME:.metadata.name,ALLOCATABLE_MEMORY:.status.allocatable.memory
Recreate the Pods
cat <<EOF | kubectl replace --force --grace-period=0 -f -
apiVersion: v1
kind: Pod
metadata:
  name: stress-1
  labels:
    app: stress
spec:
  containers:
  - image: lorel/docker-stress-ng
    name: stress-ng
    args:
    - --vm
    - "1"
    - --vm-bytes
    - "1G"
    - --vm-method
    - all
  nodeSelector:
    env: dev
---
apiVersion: v1
kind: Pod
metadata:
  name: stress-2
  labels:
    app: stress
spec:
  containers:
  - image: lorel/docker-stress-ng
    name: stress-ng
    args:
    - --vm
    - "1"
    - --vm-bytes
    - "1G"
    - --vm-method
    - all
    resources:
      requests:
        memory: 200Mi
      limits:
        memory: 1Gi
  nodeSelector:
    env: dev
---
apiVersion: v1
kind: Pod
metadata:
  name: stress-3
  labels:
    app: stress
spec:
  containers:
  - image: lorel/docker-stress-ng
    name: stress-ng
    args:
    - --vm
    - "1"
    - --vm-bytes
    - "1G"
    - --vm-method
    - all
    resources:
      limits:
        cpu: 0.5
        memory: 1Gi
  nodeSelector:
    env: dev
EOF
Verify that the Pods were created
kubectl get pod -l app=stress
Wait 2-3 minutes, since the Metrics Server scrapes metrics every 60 seconds - https://github.com/kubernetes-sigs/metrics-server/blob/master/FAQ.md#how-often-metrics-are-scraped
sleep 180
Check the Pods' memory usage
kubectl top pod -l app=stress --use-protocol-buffers
Check the Events that occurred on the node where the Pods are scheduled
kubectl describe node $(kubectl get pod -l app=stress \
  -o=jsonpath='{.items[0].spec.nodeName}')
Wait 2-3 minutes until a Pod's status becomes OOMKilled
sleep 180
Check the Pod status
kubectl get pod -l app=stress
If any Pod is not in the Running state, find out the cause
kubectl get pod -l app=stress \
  -o custom-columns=NAME:.metadata.name,\
STATUS:.status.phase,REASON:.status.reason,MESSAGE:.status.message
Check each Pod's QoS class
kubectl get pod -l app=stress \
  -o custom-columns=NAME:.metadata.name,\
STATUS:.status.phase,QoS:.status.qosClass
Check the Events that occurred on the node where the Pods are scheduled
kubectl describe node $(kubectl get pod -l app=stress \
  -o=jsonpath='{.items[0].spec.nodeName}')
Open a new terminal, look up the instance ID of the node where the Pods are scheduled, and set it as an environment variable
{
  export INSTANCE_ID=$(kubectl get node $(kubectl get pod -l app=stress \
    -o=jsonpath='{.items[0].spec.nodeName}') \
    -o jsonpath='{.spec.providerID}{"\n"}' | grep -oE "i-[a-z0-9]+")
  echo $INSTANCE_ID
}
Connect to the node via Session Manager
aws ssm start-session --target $INSTANCE_ID
Check the kernel log for OOM-related messages - https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/#node-out-of-memory-behavior
sudo tail -n 1000 /var/log/messages | grep -i oom
Exit Session Manager
exit
Delete the resources
{
  kubectl delete pod -l app=stress
  kubectl label node env- --all
  kubectl delete -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
}
Taints & Tolerations
Create Pods
for i in {1..10}
do
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-${i}
  labels:
    app: nginx
spec:
  containers:
  - image: nginx
    name: nginx
EOF
done
Check which nodes the Pods were scheduled on
kubectl get pod -l app=nginx -o wide --sort-by=.spec.nodeName
List the nodes
kubectl get node
Add a taint to the first node
kubectl taint nodes $(kubectl get node -o=jsonpath='{.items[0].metadata.name}') \
  status=unstable:PreferNoSchedule
Check the taints on the nodes
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,\
TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,\
TaintEffect:.spec.taints[*].effect
Recreate the Pods
for i in {1..10}
do
  kubectl get pod nginx-${i} -o yaml | kubectl replace --force -f -
done
Check which nodes the Pods were scheduled on
kubectl get pod -l app=nginx -o wide --sort-by=.spec.nodeName
Remove the taint from the first node
kubectl taint nodes $(kubectl get node -o=jsonpath='{.items[0].metadata.name}') \
  status-
Check the taints on the nodes
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,\
TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,\
TaintEffect:.spec.taints[*].effect
Add a taint to the second node
kubectl taint nodes $(kubectl get node -o=jsonpath='{.items[1].metadata.name}') \
  status=maintenance:NoSchedule
Check the taints on the nodes
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,\
TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,\
TaintEffect:.spec.taints[*].effect
Delete the Pods
kubectl delete pod -l app=nginx
Create a Deployment
kubectl create deploy nginx --image=nginx --replicas=10
Check which nodes the Pods created by the Deployment above were scheduled on
kubectl get pod -l app=nginx -o wide --sort-by=.spec.nodeName
Recreate the Deployment
cat <<EOF | kubectl replace --force -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 10
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      tolerations:
      - key: status
        operator: Equal
        value: maintenance
        effect: NoSchedule
      containers:
      - name: nginx
        image: nginx
EOF
Check which nodes the Pods created by the recreated Deployment were scheduled on
kubectl get pod -o wide --sort-by=.spec.nodeName
Examine the procedure by which Pods get scheduled onto a node when the node has a taint and the PodSpec has a matching toleration
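Note that a toleration does not attract Pods to the tainted node; it only stops the taint from filtering that node out, after which the usual scoring picks a node. To compare each Pod's placement with its toleration keys (an added helper; the output also includes the default not-ready/unreachable tolerations):
kubectl get pod -l app=nginx \
  -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,\
TOLERATION_KEYS:.spec.tolerations[*].key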
Remove the taint from the second node
kubectl taint nodes $(kubectl get node -o=jsonpath='{.items[1].metadata.name}') \
  status-
Check the taints on the nodes
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,\
TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,\
TaintEffect:.spec.taints[*].effect
Recreate the Deployment
kubectl get deploy nginx -o yaml | kubectl replace --force -f -
Check which nodes the Pods created by the recreated Deployment were scheduled on
kubectl get pod -o wide --sort-by=.spec.nodeName
Add a taint to the first node
kubectl taint nodes $(kubectl get node -o=jsonpath='{.items[0].metadata.name}') \
  status=upgrading:NoExecute
Add a label to the first node
kubectl label nodes $(kubectl get node -o=jsonpath='{.items[0].metadata.name}') \
  tainted=yes
Check the taints on the nodes
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,\
TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,\
TaintEffect:.spec.taints[*].effect
Verify that the Pods that were running on the tainted node were evicted and new Pods were created on other nodes - if resources are insufficient, they may remain Pending
kubectl get pod -o wide --sort-by=.spec.nodeName
Create a Pod with nodeSelector specified
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: emergency
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    tainted: "yes"
EOF
Check whether the Pod comes up
kubectl get pod emergency
If the Pod does not come up, find out why
kubectl describe pod emergency
Add a toleration to the Pod - tolerate the taint for only 60 seconds
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: emergency
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    tainted: "yes"
  tolerations:
  - key: status
    operator: Equal
    value: upgrading
    effect: NoExecute
    tolerationSeconds: 60
EOF
Check whether the Pod comes up
kubectl get pod emergency
Check whether the Pod still exists after 60 seconds have passed
kubectl get pod emergency
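Once tolerationSeconds expires, the NoExecute taint evicts the Pod, so it should be gone. To confirm the reason (an added check, not in the original lab; events are retained for about an hour):
kubectl get events --field-selector involvedObject.name=emergency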
Remove the taint from the node
kubectl taint nodes $(kubectl get node -o=jsonpath='{.items[0].metadata.name}') \
  status-
Check whether existing Pods move to the node whose taint was removed, or whether Pending Pods get created on it
kubectl get pod -l app=nginx
Delete the Deployment
kubectl delete deployment nginx
Descheduler
If the Cluster Autoscaler is running, this exercise may not work as expected, so stop the Cluster Autoscaler with the command below
kubectl -n kube-system scale deployment cluster-autoscaler --replicas=0
Create a Deployment
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 6
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
EOF
Check the created Pods
kubectl get pods -l app=nginx -o wide --sort-by=spec.nodeName
Forcibly move the Pods on the first node to other nodes
kubectl drain $(kubectl get node -o=jsonpath='{.items[0].metadata.name}') \
  --pod-selector=app=nginx
Check the rescheduled Pods
kubectl get pods -o wide --sort-by=spec.nodeName
Check the taints on the nodes - the node drained above now carries the node.kubernetes.io/unschedulable:NoSchedule taint
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,\
TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,\
TaintEffect:.spec.taints[*].effect
Remove the taint
kubectl uncordon $(kubectl get node -o=jsonpath='{.items[0].metadata.name}')
Check the taints on the nodes
kubectl get nodes -o=custom-columns=NodeName:.metadata.name,\
TaintKey:.spec.taints[*].key,TaintValue:.spec.taints[*].value,\
TaintEffect:.spec.taints[*].effect
Run the Descheduler
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: descheduler-policy-configmap
  namespace: kube-system
data:
  policy.yaml: |
    apiVersion: descheduler/v1alpha1
    kind: DeschedulerPolicy
    strategies:
      LowNodeUtilization:
        enabled: true
        params:
          nodeResourceUtilizationThresholds:
            thresholds:
              cpu: 30
              memory: 30
              pods: 50
            targetThresholds:
              cpu: 70
              memory: 70
              pods: 80
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: descheduler-cluster-role
rules:
- apiGroups: [""]
  resources: ["events"]
  verbs: ["create", "update"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "watch", "list"]
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list", "delete"]
- apiGroups: [""]
  resources: ["pods/eviction"]
  verbs: ["create"]
- apiGroups: ["scheduling.k8s.io"]
  resources: ["priorityclasses"]
  verbs: ["get", "watch", "list"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: descheduler-sa
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: descheduler-cluster-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: descheduler-cluster-role
subjects:
- name: descheduler-sa
  kind: ServiceAccount
  namespace: kube-system
---
apiVersion: batch/v1
kind: Job
metadata:
  name: descheduler-job
  namespace: kube-system
spec:
  parallelism: 1
  completions: 1
  template:
    metadata:
      name: descheduler-pod
    spec:
      priorityClassName: system-cluster-critical
      containers:
      - name: descheduler
        image: k8s.gcr.io/descheduler/descheduler:v0.21.0
        volumeMounts:
        - mountPath: /policy-dir
          name: policy-volume
        command:
        - "/bin/descheduler"
        args:
        - "--policy-config-file"
        - "/policy-dir/policy.yaml"
        - "--v"
        - "3"
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          privileged: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
      restartPolicy: "Never"
      serviceAccountName: descheduler-sa
      volumes:
      - name: policy-volume
        configMap:
          name: descheduler-policy-configmap
EOF
Check the Descheduler Job status
kubectl get job descheduler-job -n kube-system
Check the Descheduler Job logs
kubectl -n kube-system logs job/descheduler-job
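To pull out just the eviction decisions from the log (an added filter; the exact wording depends on the descheduler version):
kubectl -n kube-system logs job/descheduler-job | grep -i evict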
Check the rescheduled Pods
kubectl get pods -o wide --sort-by="{.spec.nodeName}"
Delete the Descheduler Job
kubectl -n kube-system delete job descheduler-job
Delete the Deployment
kubectl delete deploy nginx
Check the labels assigned to the nodes
kubectl get node --show-labels
Assign a label with key env and value test to one node
kubectl label node $(kubectl get node -o=jsonpath='{.items[0].metadata.name}') \
  env=test
Verify that the label was assigned
kubectl get node --show-labels | grep env=test
Create a Deployment
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: env
                operator: In
                values:
                - test
      containers:
      - image: nginx
        name: nginx
EOF
Check whether the created Pods were scheduled onto the node with the env=test label
kubectl get pods -o wide
Remove the env=test label from the node it was assigned to above
kubectl label node $(kubectl get node -o=jsonpath='{.items[0].metadata.name}') \
  env-
Verify that the label was removed
kubectl get node --show-labels | grep env=test
Check whether anything changed for the existing Pods
kubectl get pods -o wide
Assign a label with key env and value test to another node
kubectl label node $(kubectl get node -o=jsonpath='{.items[1].metadata.name}') \
  env=test
Verify that the label was assigned
kubectl get node --show-labels | grep env=test
Check whether anything changed for the existing Pods
kubectl get pods -o wide
Run the Descheduler
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: descheduler-policy-configmap
  namespace: kube-system
data:
  policy.yaml: |
    apiVersion: descheduler/v1alpha1
    kind: DeschedulerPolicy
    strategies:
      RemovePodsViolatingNodeAffinity:
        enabled: true
        params:
          nodeAffinityType:
          - requiredDuringSchedulingIgnoredDuringExecution
---
apiVersion: batch/v1
kind: Job
metadata:
  name: descheduler-job
  namespace: kube-system
spec:
  parallelism: 1
  completions: 1
  template:
    metadata:
      name: descheduler-pod
    spec:
      priorityClassName: system-cluster-critical
      containers:
      - name: descheduler
        image: k8s.gcr.io/descheduler/descheduler:v0.21.0
        volumeMounts:
        - mountPath: /policy-dir
          name: policy-volume
        command:
        - "/bin/descheduler"
        args:
        - "--policy-config-file"
        - "/policy-dir/policy.yaml"
        - "--v"
        - "3"
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          privileged: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
      restartPolicy: "Never"
      serviceAccountName: descheduler-sa
      volumes:
      - name: policy-volume
        configMap:
          name: descheduler-policy-configmap
EOF
Check the Descheduler Job status
kubectl get job descheduler-job -n kube-system
Check the Descheduler Job logs
kubectl -n kube-system logs job/descheduler-job
Check the Pod status
kubectl get pods -o wide --sort-by="{.spec.nodeName}"
Delete the resources
{
  kubectl delete job descheduler-job -n kube-system
  kubectl delete sa descheduler-sa -n kube-system
  kubectl delete clusterrole descheduler-cluster-role
  kubectl delete clusterrolebinding descheduler-cluster-role-binding
  kubectl delete deploy nginx
}