Lab
DaemonSet
List the DaemonSets that currently exist
kubectl get ds -A
Check which nodes the Pods created by the DaemonSets are running on
kubectl get pod -l 'k8s-app in (kube-proxy, aws-node)' -A -o wide
Add one node
eksctl scale nodegroup --cluster=mycluster --nodes=3 nodegroup
Check that the node was added (it can take a while for the node to be created)
kubectl get node
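Because the new node can take a few minutes to register, it may help to count Ready nodes instead of reading the table by eye. The helper below is a minimal sketch: `count_ready_nodes` is a hypothetical name, and it assumes the default `kubectl get node --no-headers` column layout, where the second column is STATUS.

```shell
# Count the nodes whose STATUS column is exactly "Ready".
# Reads `kubectl get node --no-headers` output on stdin.
# Nodes reported as "NotReady" or "Ready,SchedulingDisabled" are not counted.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Possible polling loop (needs cluster access, so it is left commented out):
# until [ "$(kubectl get node --no-headers | count_ready_nodes)" -ge 3 ]; do
#   sleep 10
# done
```

The threshold of 3 in the commented loop matches the `--nodes=3` value used when scaling up.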
Check that the DaemonSet Pods were also scheduled onto the newly created node
kubectl get pod -l 'k8s-app in (kube-proxy, aws-node)' -A -o wide
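A DaemonSet is expected to run one Pod per node, so with the two DaemonSets selected above each node should show two Pods. The sketch below tallies Pods per node. `pods_per_node` is a hypothetical helper, and the node names in the sample call are made up.

```shell
# Tally how many times each node name appears on stdin (one name per line).
# Prints "<node> <count>" pairs sorted by node name.
pods_per_node() {
  awk 'NF { count[$1]++ } END { for (node in count) print node, count[node] }' | sort
}

# Sample call with made-up node names:
printf 'node-a\nnode-b\nnode-a\nnode-b\n' | pods_per_node

# Against a real cluster, the input could come from:
# kubectl get pod -A -l 'k8s-app in (kube-proxy, aws-node)' \
#   -o custom-columns=NODE:.spec.nodeName --no-headers | pods_per_node
```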
Restore the original node count
eksctl scale nodegroup --cluster=mycluster --nodes=2 nodegroup
Check that the node was removed (it can take a while for the node to be deleted)
kubectl get node
FluentBit
Create the namespace
kubectl create ns amazon-cloudwatch
Create a ConfigMap that holds the EKS cluster information
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-cluster-info
  namespace: amazon-cloudwatch
data:
  cluster.name: mycluster
  http.port: "2020"
  http.server: "On"
  logs.region: ap-northeast-2
  read.head: "Off"
  read.tail: "On"
EOF
Grant the Kubernetes permissions needed for log collection
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: amazon-cloudwatch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-role
rules:
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
  - apiGroups: [""]
    resources:
      - namespaces
      - pods
      - pods/logs
      - nodes
      - nodes/proxy
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-role
subjects:
  - kind: ServiceAccount
    name: fluent-bit
    namespace: amazon-cloudwatch
EOF
Create the FluentBit configuration files
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: amazon-cloudwatch
  labels:
    k8s-app: fluent-bit
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush                     5
        Log_Level                 info
        Daemon                    off
        Parsers_File              parsers.conf
        HTTP_Server               ${HTTP_SERVER}
        HTTP_Listen               0.0.0.0
        HTTP_Port                 ${HTTP_PORT}
        storage.path              /var/fluent-bit/state/flb-storage/
        storage.sync              normal
        storage.checksum          off
        storage.backlog.mem_limit 5M

    @INCLUDE application-log.conf
    @INCLUDE dataplane-log.conf
    @INCLUDE host-log.conf

  application-log.conf: |
    [INPUT]
        Name                tail
        Tag                 application.*
        Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
        Path                /var/log/containers/*.log
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  container_firstline
        Parser              docker
        DB                  /var/fluent-bit/state/flb_container.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

    [INPUT]
        Name                tail
        Tag                 application.*
        Path                /var/log/containers/fluent-bit*
        Parser              docker
        DB                  /var/fluent-bit/state/flb_log.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [INPUT]
        Name                tail
        Tag                 application.*
        Path                /var/log/containers/cloudwatch-agent*
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  cwagent_firstline
        Parser              docker
        DB                  /var/fluent-bit/state/flb_cwagent.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [FILTER]
        Name                kubernetes
        Match               application.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_Tag_Prefix     application.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Labels              Off
        Annotations         Off
        Use_Kubelet         On
        Kubelet_Port        10250
        Buffer_Size         0

    [OUTPUT]
        Name                cloudwatch_logs
        Match               application.*
        region              ${AWS_REGION}
        log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application
        log_stream_prefix   ${HOST_NAME}-
        auto_create_group   true
        extra_user_agent    container-insights

  dataplane-log.conf: |
    [INPUT]
        Name                systemd
        Tag                 dataplane.systemd.*
        Systemd_Filter      _SYSTEMD_UNIT=docker.service
        Systemd_Filter      _SYSTEMD_UNIT=kubelet.service
        DB                  /var/fluent-bit/state/systemd.db
        Path                /var/log/journal
        Read_From_Tail      ${READ_FROM_TAIL}

    [INPUT]
        Name                tail
        Tag                 dataplane.tail.*
        Path                /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  container_firstline
        Parser              docker
        DB                  /var/fluent-bit/state/flb_dataplane_tail.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

    [FILTER]
        Name                modify
        Match               dataplane.systemd.*
        Rename              _HOSTNAME                   hostname
        Rename              _SYSTEMD_UNIT               systemd_unit
        Rename              MESSAGE                     message
        Remove_regex        ^((?!hostname|systemd_unit|message).)*$

    [FILTER]
        Name                aws
        Match               dataplane.*
        imds_version        v1

    [OUTPUT]
        Name                cloudwatch_logs
        Match               dataplane.*
        region              ${AWS_REGION}
        log_group_name      /aws/containerinsights/${CLUSTER_NAME}/dataplane
        log_stream_prefix   ${HOST_NAME}-
        auto_create_group   true
        extra_user_agent    container-insights

  host-log.conf: |
    [INPUT]
        Name                tail
        Tag                 host.dmesg
        Path                /var/log/dmesg
        Parser              syslog
        DB                  /var/fluent-bit/state/flb_dmesg.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [INPUT]
        Name                tail
        Tag                 host.messages
        Path                /var/log/messages
        Parser              syslog
        DB                  /var/fluent-bit/state/flb_messages.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [INPUT]
        Name                tail
        Tag                 host.secure
        Path                /var/log/secure
        Parser              syslog
        DB                  /var/fluent-bit/state/flb_secure.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [FILTER]
        Name                aws
        Match               host.*
        imds_version        v1

    [OUTPUT]
        Name                cloudwatch_logs
        Match               host.*
        region              ${AWS_REGION}
        log_group_name      /aws/containerinsights/${CLUSTER_NAME}/host
        log_stream_prefix   ${HOST_NAME}.
        auto_create_group   true
        extra_user_agent    container-insights

  parsers.conf: |
    [PARSER]
        Name                docker
        Format              json
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

    [PARSER]
        Name                syslog
        Format              regex
        Regex               ^(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key            time
        Time_Format         %b %d %H:%M:%S

    [PARSER]
        Name                container_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

    [PARSER]
        Name                cwagent_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\d{4}[\/-]\d{1,2}[\/-]\d{1,2}[ T]\d{2}:\d{2}:\d{2}(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
EOF
Deploy FluentBit
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: amazon-cloudwatch
  labels:
    k8s-app: fluent-bit
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    matchLabels:
      k8s-app: fluent-bit
  template:
    metadata:
      labels:
        k8s-app: fluent-bit
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - name: fluent-bit
        image: public.ecr.aws/aws-observability/aws-for-fluent-bit:stable
        imagePullPolicy: Always
        env:
        - name: AWS_REGION
          valueFrom:
            configMapKeyRef:
              name: fluent-bit-cluster-info
              key: logs.region
        - name: CLUSTER_NAME
          valueFrom:
            configMapKeyRef:
              name: fluent-bit-cluster-info
              key: cluster.name
        - name: HTTP_SERVER
          valueFrom:
            configMapKeyRef:
              name: fluent-bit-cluster-info
              key: http.server
        - name: HTTP_PORT
          valueFrom:
            configMapKeyRef:
              name: fluent-bit-cluster-info
              key: http.port
        - name: READ_FROM_HEAD
          valueFrom:
            configMapKeyRef:
              name: fluent-bit-cluster-info
              key: read.head
        - name: READ_FROM_TAIL
          valueFrom:
            configMapKeyRef:
              name: fluent-bit-cluster-info
              key: read.tail
        - name: HOST_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: CI_VERSION
          value: "k8s/1.3.11"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 500m
            memory: 100Mi
        volumeMounts:
        # Please don't change below read-only permissions
        - name: fluentbitstate
          mountPath: /var/fluent-bit/state
        - name: varlog
          mountPath: /var/log
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
        - name: runlogjournal
          mountPath: /run/log/journal
          readOnly: true
        - name: dmesg
          mountPath: /var/log/dmesg
          readOnly: true
      terminationGracePeriodSeconds: 10
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      volumes:
      - name: fluentbitstate
        hostPath:
          path: /var/fluent-bit/state
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config
      - name: runlogjournal
        hostPath:
          path: /run/log/journal
      - name: dmesg
        hostPath:
          path: /var/log/dmesg
      serviceAccountName: fluent-bit
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      - operator: "Exists"
        effect: "NoExecute"
      - operator: "Exists"
        effect: "NoSchedule"
EOF
Check the Pods that were created
kubectl get pods -n amazon-cloudwatch
Check whether the following log groups, which are named in the FluentBit configuration, have been created
/aws/containerinsights/mycluster/application
/aws/containerinsights/mycluster/host
/aws/containerinsights/mycluster/dataplane
aws logs describe-log-groups \
--log-group-name-prefix /aws/containerinsights/mycluster
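The three names come from the `log_group_name /aws/containerinsights/${CLUSTER_NAME}/<kind>` lines in the FluentBit outputs, so they can be derived from the cluster name alone. The sketch below prints the expected names. `expected_log_groups` is a hypothetical helper introduced only for illustration.

```shell
# Print the log group names that the FluentBit outputs in this lab write to,
# given the cluster name (the value of cluster.name in fluent-bit-cluster-info).
expected_log_groups() {
  cluster_name="$1"
  for kind in application host dataplane; do
    printf '/aws/containerinsights/%s/%s\n' "$cluster_name" "$kind"
  done
}

expected_log_groups mycluster
```

Comparing this list with the `describe-log-groups` output shows at a glance which groups are still missing.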
Check the FluentBit logs
kubectl -n amazon-cloudwatch logs ds/fluent-bit
Look up the instance ID of one node and store it in an environment variable
{
    export INSTANCE_ID=$(kubectl get node -o jsonpath='{.items[0].spec.providerID}' \
    | grep -oE "i-[a-z0-9]+")
    echo $INSTANCE_ID
}
Look up the IAM instance profile attached to the node and store it in an environment variable
{
    export INSTANCE_PROFILE=$(aws ec2 describe-instances --instance-ids $INSTANCE_ID \
    --query 'Reservations[0].Instances[0].IamInstanceProfile.Arn' \
    --output text | grep -oE "[^/]+$")
    echo $INSTANCE_PROFILE
}
Look up the IAM role associated with that instance profile and store it in an environment variable
{
    export ROLE_NAME=$(aws iam get-instance-profile --instance-profile-name $INSTANCE_PROFILE \
    --query 'InstanceProfile.Roles[0].RoleName' --output text)
    echo $ROLE_NAME
}
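Both the node's providerID and the instance profile ARN carry the value of interest after their final slash, so shell parameter expansion can extract it without a regular expression. The sketch below shows the idea. `last_segment` is a hypothetical helper, and both sample strings are made up to resemble typical EKS values.

```shell
# Return everything after the last "/" in the argument.
last_segment() {
  printf '%s\n' "${1##*/}"
}

# Made-up sample values (assumptions, not output from a real cluster):
provider_id='aws:///ap-northeast-2a/i-0123456789abcdef0'
profile_arn='arn:aws:iam::123456789012:instance-profile/eks-Sample_Profile'

last_segment "$provider_id"   # the EC2 instance ID
last_segment "$profile_arn"   # the instance profile name
```

This form does not depend on which characters the name contains, which matters for profile names that include uppercase letters or underscores.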
List the policies attached to that IAM role
aws iam list-attached-role-policies --role-name $ROLE_NAME
Grant the IAM role CloudWatch Logs permissions
aws iam attach-role-policy --role-name $ROLE_NAME \
--policy-arn arn:aws:iam::aws:policy/CloudWatchLogsFullAccess
List the policies attached to the IAM role again
aws iam list-attached-role-policies --role-name $ROLE_NAME
Check the FluentBit logs
kubectl -n amazon-cloudwatch logs ds/fluent-bit
Check that the log groups named in the FluentBit configuration have been created
aws logs describe-log-groups \
--log-group-name-prefix /aws/containerinsights/mycluster
Create a Pod
kubectl run nginx --image=nginx
Check the logs of the Pod created above
kubectl logs nginx
Check that the logs of the Pod created above are being added to the /aws/containerinsights/mycluster/application log group
aws logs describe-log-streams \
--log-group-name /aws/containerinsights/mycluster/application
Find the log stream that received the Pod's logs and store it in an environment variable
{
    export LOG_STREAM=$(aws logs describe-log-streams \
    --log-group-name /aws/containerinsights/mycluster/application \
    --query 'logStreams[*].logStreamName' --output text \
    | grep -oE "[a-z0-9.-]+nginx[a-z0-9._-]+")
    echo $LOG_STREAM
}
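With `log_stream_prefix ${HOST_NAME}-` and the `application.*` tag, stream names are expected to look like `<node>-application.var.log.containers.<pod>_<namespace>_<container>-<id>.log`. The sketch below selects the stream for a given Pod name from a whitespace-separated list. `find_pod_log_stream` is a hypothetical helper, and the stream names in the sample are made up.

```shell
# Pick the first log stream whose name contains ".containers.<pod>_".
# Reads whitespace-separated stream names on stdin (as `--output text` prints them).
find_pod_log_stream() {
  pod_name="$1"
  tr -s '[:space:]' '\n' | grep -F ".containers.${pod_name}_" | head -n 1
}

# Made-up sample input: two streams on one tab-separated line.
printf '%s\t%s\n' \
  'ip-10-0-1-23.ap-northeast-2.compute.internal-application.var.log.containers.coredns-abc_kube-system_coredns-0123.log' \
  'ip-10-0-1-23.ap-northeast-2.compute.internal-application.var.log.containers.nginx_default_nginx-4567.log' \
  | find_pod_log_stream nginx
```

Anchoring the match on `.containers.<pod>_` avoids picking up streams from other Pods whose names merely contain the same word.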
Check the log events delivered to that log stream
aws logs get-log-events \
--log-group-name /aws/containerinsights/mycluster/application \
--log-stream-name $LOG_STREAM --query 'events[*].message'
Delete the resources created in this lab
{
    kubectl delete pod nginx
    kubectl delete ns amazon-cloudwatch
    kubectl delete clusterrole fluent-bit-role
    kubectl delete clusterrolebinding fluent-bit-role-binding
    aws logs delete-log-group --log-group-name /aws/containerinsights/mycluster/application
    aws logs delete-log-group --log-group-name /aws/containerinsights/mycluster/host
    aws logs delete-log-group --log-group-name /aws/containerinsights/mycluster/dataplane
    aws iam detach-role-policy --role-name $ROLE_NAME --policy-arn arn:aws:iam::aws:policy/CloudWatchLogsFullAccess
}
Node Exporter
Node Exporter installation guide - https://prometheus.io/docs/guides/node-exporter
Node Exporter GitHub - https://github.com/prometheus/node_exporter
Create the namespace
kubectl create ns monitoring
Install Node Exporter
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  labels:
    app: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "9100"
    spec:
      hostNetwork: true
      hostPID: true
      containers:
      - name: node-exporter
        image: quay.io/prometheus/node-exporter
        args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host/root
        - --web.listen-address=0.0.0.0:9100
        ports:
        - name: metrics
          containerPort: 9100
          protocol: TCP
        volumeMounts:
        - name: proc
          mountPath: /host/proc
          readOnly: true
        - name: sys
          mountPath: /host/sys
          readOnly: true
        - name: root
          mountPath: /host/root
          mountPropagation: HostToContainer
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys
      - name: root
        hostPath:
          path: /
EOF
Check that Node Exporter is running
kubectl -n monitoring get pod -l app=node-exporter
Check the metrics that Node Exporter exposes
kubectl run nginx --image=nginx -it --rm --restart=Never \
-- curl $(kubectl -n monitoring get pod -l app=node-exporter -o=jsonpath="{.items[0].status.podIP}"):9100/metrics
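The `/metrics` response is long, so it can be handy to filter it down to one metric. The sketch below assumes the Prometheus text exposition format, in which comment lines start with `#` and each sample line starts with the metric name, optionally followed by `{labels}`. `metric_samples` is a hypothetical helper, and the sample values are made up.

```shell
# Print only the sample lines for one metric name.
# Comment lines (# HELP / # TYPE) are dropped, and the name must match exactly,
# so "node_load1" does not also match "node_load15".
metric_samples() {
  metric_name="$1"
  grep -v '^#' | awk -v name="$metric_name" '{
    split($1, parts, "{")          # strip an optional {label="..."} suffix
    if (parts[1] == name) print
  }'
}

# Made-up sample input in the exposition format:
printf '%s\n' \
  '# HELP node_load1 1m load average.' \
  '# TYPE node_load1 gauge' \
  'node_load1 0.25' \
  'node_load15 0.1' \
  'node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5' \
  | metric_samples node_load1
```

Piping the `curl` output from the step above into `metric_samples <name>` would apply the same filter to live data.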
Check the Node Exporter command-line options
kubectl -n monitoring exec ds/node-exporter -- node_exporter -h
Delete Node Exporter
kubectl delete ds node-exporter -n monitoring