Lab

DaemonSet

  1. List the DaemonSets that currently exist

    kubectl get ds -A
  2. Check which Nodes the Pods created by the DaemonSets are running on

    kubectl get pod -l 'k8s-app in (kube-proxy, aws-node)' -A -o wide
  3. Add one Node

    eksctl scale nodegroup --cluster=mycluster --nodes=3 nodegroup
  4. Check that the Node was added (it may take some time to be created)

    kubectl get node
  5. Check that the DaemonSet Pods were deployed to the newly created Node

    kubectl get pod -l 'k8s-app in (kube-proxy, aws-node)' -A -o wide
  6. Restore the original Node count

    eksctl scale nodegroup --cluster=mycluster --nodes=2 nodegroup
  7. Check that the Node was removed (removal may take some time)

    kubectl get node
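
Steps 4 and 7 note that the Node count can take a while to settle. Instead of re-running `kubectl get node` by hand, the wait can be scripted. The following is a minimal sketch and not part of the original lab: `count_ready_nodes` and `wait_for_nodes` are hypothetical helper names, and the attempt count and polling interval are arbitrary defaults. It assumes the default `kubectl get node` column layout, in which the second column is STATUS.

```shell
# Count Nodes whose STATUS column is exactly "Ready".
count_ready_nodes() {
  kubectl get node --no-headers 2>/dev/null |
    awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Poll until exactly $1 Nodes are Ready.
# $2 = maximum attempts (default 30), $3 = seconds between attempts (default 10).
wait_for_nodes() {
  local expected=$1 attempts=${2:-30} interval=${3:-10} i=1 ready=0
  while [ "$i" -le "$attempts" ]; do
    ready=$(count_ready_nodes)
    if [ "$ready" -eq "$expected" ]; then
      echo "ready nodes: $ready"
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  echo "timed out: $ready of $expected nodes Ready" >&2
  return 1
}
```

For example, `wait_for_nodes 3` could follow step 3 and `wait_for_nodes 2` could follow step 6. A Node that is being drained reports `Ready,SchedulingDisabled` and is deliberately not counted.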

FluentBit

  1. Create the Namespace

    kubectl create ns amazon-cloudwatch
  2. Create a ConfigMap containing the EKS cluster information

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: fluent-bit-cluster-info
      namespace: amazon-cloudwatch
    data:
      cluster.name: mycluster
      http.port: "2020"
      http.server: "On"
      logs.region: ap-northeast-2
      read.head: "Off"
      read.tail: "On"
    EOF
  3. Grant the permissions needed for log collection

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: fluent-bit
      namespace: amazon-cloudwatch
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: fluent-bit-role
    rules:
      - nonResourceURLs:
          - /metrics
        verbs:
          - get
      - apiGroups: [""]
        resources:
          - namespaces
          - pods
          - pods/logs
          - nodes
          - nodes/proxy
        verbs: ["get", "list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: fluent-bit-role-binding
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: fluent-bit-role
    subjects:
      - kind: ServiceAccount
        name: fluent-bit
        namespace: amazon-cloudwatch
    EOF
  4. Create the FluentBit configuration files

    cat <<'EOF' | kubectl apply -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: fluent-bit-config
      namespace: amazon-cloudwatch
      labels:
        k8s-app: fluent-bit
    data:
      fluent-bit.conf: |
        [SERVICE]
            Flush                     5
            Log_Level                 info
            Daemon                    off
            Parsers_File              parsers.conf
            HTTP_Server               ${HTTP_SERVER}
            HTTP_Listen               0.0.0.0
            HTTP_Port                 ${HTTP_PORT}
            storage.path              /var/fluent-bit/state/flb-storage/
            storage.sync              normal
            storage.checksum          off
            storage.backlog.mem_limit 5M
            
        @INCLUDE application-log.conf
        @INCLUDE dataplane-log.conf
        @INCLUDE host-log.conf
      
      application-log.conf: |
        [INPUT]
            Name                tail
            Tag                 application.*
            Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
            Path                /var/log/containers/*.log
            Docker_Mode         On
            Docker_Mode_Flush   5
            Docker_Mode_Parser  container_firstline
            Parser              docker
            DB                  /var/fluent-bit/state/flb_container.db
            Mem_Buf_Limit       50MB
            Skip_Long_Lines     On
            Refresh_Interval    10
            Rotate_Wait         30
            storage.type        filesystem
            Read_from_Head      ${READ_FROM_HEAD}
    
        [INPUT]
            Name                tail
            Tag                 application.*
            Path                /var/log/containers/fluent-bit*
            Parser              docker
            DB                  /var/fluent-bit/state/flb_log.db
            Mem_Buf_Limit       5MB
            Skip_Long_Lines     On
            Refresh_Interval    10
            Read_from_Head      ${READ_FROM_HEAD}
    
        [INPUT]
            Name                tail
            Tag                 application.*
            Path                /var/log/containers/cloudwatch-agent*
            Docker_Mode         On
            Docker_Mode_Flush   5
            Docker_Mode_Parser  cwagent_firstline
            Parser              docker
            DB                  /var/fluent-bit/state/flb_cwagent.db
            Mem_Buf_Limit       5MB
            Skip_Long_Lines     On
            Refresh_Interval    10
            Read_from_Head      ${READ_FROM_HEAD}
    
        [FILTER]
            Name                kubernetes
            Match               application.*
            Kube_URL            https://kubernetes.default.svc:443
            Kube_Tag_Prefix     application.var.log.containers.
            Merge_Log           On
            Merge_Log_Key       log_processed
            K8S-Logging.Parser  On
            K8S-Logging.Exclude Off
            Labels              Off
            Annotations         Off
            Use_Kubelet         On
            Kubelet_Port        10250
            Buffer_Size         0
    
        [OUTPUT]
            Name                cloudwatch_logs
            Match               application.*
            region              ${AWS_REGION}
            log_group_name      /aws/containerinsights/${CLUSTER_NAME}/application
            log_stream_prefix   ${HOST_NAME}-
            auto_create_group   true
            extra_user_agent    container-insights
    
      dataplane-log.conf: |
        [INPUT]
            Name                systemd
            Tag                 dataplane.systemd.*
            Systemd_Filter      _SYSTEMD_UNIT=docker.service
            Systemd_Filter      _SYSTEMD_UNIT=kubelet.service
            DB                  /var/fluent-bit/state/systemd.db
            Path                /var/log/journal
            Read_From_Tail      ${READ_FROM_TAIL}
    
        [INPUT]
            Name                tail
            Tag                 dataplane.tail.*
            Path                /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
            Docker_Mode         On
            Docker_Mode_Flush   5
            Docker_Mode_Parser  container_firstline
            Parser              docker
            DB                  /var/fluent-bit/state/flb_dataplane_tail.db
            Mem_Buf_Limit       50MB
            Skip_Long_Lines     On
            Refresh_Interval    10
            Rotate_Wait         30
            storage.type        filesystem
            Read_from_Head      ${READ_FROM_HEAD}
    
        [FILTER]
            Name                modify
            Match               dataplane.systemd.*
            Rename              _HOSTNAME                   hostname
            Rename              _SYSTEMD_UNIT               systemd_unit
            Rename              MESSAGE                     message
            Remove_regex        ^((?!hostname|systemd_unit|message).)*$
    
        [FILTER]
            Name                aws
            Match               dataplane.*
            imds_version        v1
    
        [OUTPUT]
            Name                cloudwatch_logs
            Match               dataplane.*
            region              ${AWS_REGION}
            log_group_name      /aws/containerinsights/${CLUSTER_NAME}/dataplane
            log_stream_prefix   ${HOST_NAME}-
            auto_create_group   true
            extra_user_agent    container-insights
        
      host-log.conf: |
        [INPUT]
            Name                tail
            Tag                 host.dmesg
            Path                /var/log/dmesg
            Parser              syslog
            DB                  /var/fluent-bit/state/flb_dmesg.db
            Mem_Buf_Limit       5MB
            Skip_Long_Lines     On
            Refresh_Interval    10
            Read_from_Head      ${READ_FROM_HEAD}
    
        [INPUT]
            Name                tail
            Tag                 host.messages
            Path                /var/log/messages
            Parser              syslog
            DB                  /var/fluent-bit/state/flb_messages.db
            Mem_Buf_Limit       5MB
            Skip_Long_Lines     On
            Refresh_Interval    10
            Read_from_Head      ${READ_FROM_HEAD}
    
        [INPUT]
            Name                tail
            Tag                 host.secure
            Path                /var/log/secure
            Parser              syslog
            DB                  /var/fluent-bit/state/flb_secure.db
            Mem_Buf_Limit       5MB
            Skip_Long_Lines     On
            Refresh_Interval    10
            Read_from_Head      ${READ_FROM_HEAD}
    
        [FILTER]
            Name                aws
            Match               host.*
            imds_version        v1
    
        [OUTPUT]
            Name                cloudwatch_logs
            Match               host.*
            region              ${AWS_REGION}
            log_group_name      /aws/containerinsights/${CLUSTER_NAME}/host
            log_stream_prefix   ${HOST_NAME}.
            auto_create_group   true
            extra_user_agent    container-insights
    
      parsers.conf: |
        [PARSER]
            Name                docker
            Format              json
            Time_Key            time
            Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
    
        [PARSER]
            Name                syslog
            Format              regex
            Regex               ^(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
            Time_Key            time
            Time_Format         %b %d %H:%M:%S
    
        [PARSER]
            Name                container_firstline
            Format              regex
            Regex               (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
            Time_Key            time
            Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
    
        [PARSER]
            Name                cwagent_firstline
            Format              regex
            Regex               (?<log>(?<="log":")\d{4}[\/-]\d{1,2}[\/-]\d{1,2}[ T]\d{2}:\d{2}:\d{2}(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
            Time_Key            time
            Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
    EOF
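
Step 4 quotes the heredoc delimiter (`<<'EOF'`), while the other steps use a plain `<<EOF`. The difference matters because the configuration contains placeholders such as `${HTTP_PORT}`. The short demonstration below shows what the shell does in each case; `DEMO_PORT` is a made-up variable used only for illustration.

```shell
# DEMO_PORT is a hypothetical variable that exists only in the local shell.
DEMO_PORT=9999

# Unquoted delimiter: the local shell substitutes ${DEMO_PORT} before the text is passed on.
expanded=$(cat <<EOF
HTTP_Port ${DEMO_PORT}
EOF
)

# Quoted delimiter: the text is passed through verbatim.
literal=$(cat <<'EOF'
HTTP_Port ${DEMO_PORT}
EOF
)

printf '[%s]\n' "$expanded"   # prints [HTTP_Port 9999]
printf '[%s]\n' "$literal"    # prints [HTTP_Port ${DEMO_PORT}]
```

With an unquoted delimiter, a variable that is unset in the local shell would be replaced by an empty string. Quoting the delimiter lets the placeholders reach Fluent Bit unchanged, so they are resolved inside the container from the environment variables that the DaemonSet in step 5 defines.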
  5. Deploy FluentBit

    cat <<EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: fluent-bit
      namespace: amazon-cloudwatch
      labels:
        k8s-app: fluent-bit
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      selector:
        matchLabels:
          k8s-app: fluent-bit
      template:
        metadata:
          labels:
            k8s-app: fluent-bit
            version: v1
            kubernetes.io/cluster-service: "true"
        spec:
          containers:
          - name: fluent-bit
            image: public.ecr.aws/aws-observability/aws-for-fluent-bit:stable
            imagePullPolicy: Always
            env:
            - name: AWS_REGION
              valueFrom:
                configMapKeyRef:
                  name: fluent-bit-cluster-info
                  key: logs.region
            - name: CLUSTER_NAME
              valueFrom:
                configMapKeyRef:
                  name: fluent-bit-cluster-info
                  key: cluster.name
            - name: HTTP_SERVER
              valueFrom:
                configMapKeyRef:
                  name: fluent-bit-cluster-info
                  key: http.server
            - name: HTTP_PORT
              valueFrom:
                configMapKeyRef:
                  name: fluent-bit-cluster-info
                  key: http.port
            - name: READ_FROM_HEAD
              valueFrom:
                configMapKeyRef:
                  name: fluent-bit-cluster-info
                  key: read.head
            - name: READ_FROM_TAIL
              valueFrom:
                configMapKeyRef:
                  name: fluent-bit-cluster-info
                  key: read.tail
            - name: HOST_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: HOSTNAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: CI_VERSION
              value: "k8s/1.3.11"
            resources:
                limits:
                  memory: 200Mi
                requests:
                  cpu: 500m
                  memory: 100Mi
            volumeMounts:
            # Please don't change below read-only permissions
            - name: fluentbitstate
              mountPath: /var/fluent-bit/state
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: fluent-bit-config
              mountPath: /fluent-bit/etc/
            - name: runlogjournal
              mountPath: /run/log/journal
              readOnly: true
            - name: dmesg
              mountPath: /var/log/dmesg
              readOnly: true
          terminationGracePeriodSeconds: 10
          hostNetwork: true
          dnsPolicy: ClusterFirstWithHostNet
          volumes:
          - name: fluentbitstate
            hostPath:
              path: /var/fluent-bit/state
          - name: varlog
            hostPath:
              path: /var/log
          - name: varlibdockercontainers
            hostPath:
              path: /var/lib/docker/containers
          - name: fluent-bit-config
            configMap:
              name: fluent-bit-config
          - name: runlogjournal
            hostPath:
              path: /run/log/journal
          - name: dmesg
            hostPath:
              path: /var/log/dmesg
          serviceAccountName: fluent-bit
          tolerations:
          - key: node-role.kubernetes.io/master
            operator: Exists
            effect: NoSchedule
          - operator: "Exists"
            effect: "NoExecute"
          - operator: "Exists"
            effect: "NoSchedule"
    EOF
  6. Check the created Pods

    kubectl get pods -n amazon-cloudwatch
  7. Check whether the log groups named in the FluentBit configuration have been created. The expected names are /aws/containerinsights/mycluster/application, /aws/containerinsights/mycluster/host, and /aws/containerinsights/mycluster/dataplane

    aws logs describe-log-groups \
    --log-group-name-prefix /aws/containerinsights/mycluster
  8. Check the FluentBit logs (if the log groups are missing, look for CloudWatch Logs permission errors here)

    kubectl -n amazon-cloudwatch logs ds/fluent-bit
  9. Look up the instance ID of one Node and store it in an environment variable

    {
        export INSTANCE_ID=$(kubectl get node -o jsonpath='{.items[0].spec.providerID}' \
        | grep -oE "i-[a-z0-9]+")
        
        echo $INSTANCE_ID
    }
  10. Look up the IAM instance profile attached to the Node and store it in an environment variable

    {
        export INSTANCE_PROFILE=$(aws ec2 describe-instances --instance-ids $INSTANCE_ID \
        --query 'Reservations[0].Instances[0].IamInstanceProfile.Arn' \
        --output text | grep -oE "[^/]+$")
    
        echo $INSTANCE_PROFILE
    }
  11. Look up the IAM role associated with that instance profile and store it in an environment variable

    {
        export ROLE_NAME=$(aws iam get-instance-profile --instance-profile-name $INSTANCE_PROFILE \
        --query 'InstanceProfile.Roles[0].RoleName' --output text)
        
        echo $ROLE_NAME
    }
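
Steps 9 and 10 both pull the last path segment out of a longer identifier. As an alternative to a grep pattern, POSIX parameter expansion can do this without depending on which characters the name contains. This is a sketch only: `last_segment` is a hypothetical helper, and the sample values are invented.

```shell
# Print the text after the last "/" of the argument.
# For an EC2-backed Node, spec.providerID looks like aws:///<az>/<instance-id>,
# and an instance profile ARN ends in /<profile-name>.
last_segment() {
  printf '%s\n' "${1##*/}"
}

# Invented sample values, for illustration only.
last_segment "aws:///ap-northeast-2a/i-0123456789abcdef0"
last_segment "arn:aws:iam::111122223333:instance-profile/eks-Example_Profile-1A"
```

In the lab itself this could be used as `export INSTANCE_ID=$(last_segment "$(kubectl get node -o jsonpath='{.items[0].spec.providerID}')")`. Instance profile names may contain uppercase letters and underscores, which is the case the second sample value exercises.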
  12. List the policies attached to that IAM role

    aws iam list-attached-role-policies --role-name $ROLE_NAME
  13. Grant CloudWatch Logs permissions to the IAM role

    aws iam attach-role-policy --role-name $ROLE_NAME \
    --policy-arn arn:aws:iam::aws:policy/CloudWatchLogsFullAccess
  14. List the policies attached to the IAM role again

    aws iam list-attached-role-policies --role-name $ROLE_NAME
  15. Check the FluentBit logs

    kubectl -n amazon-cloudwatch logs ds/fluent-bit
  16. Check whether the log groups specified in the FluentBit configuration have been created

    aws logs describe-log-groups \
    --log-group-name-prefix /aws/containerinsights/mycluster
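
The check in step 16 can also be scripted so that any missing log group is reported by name. The sketch below is an illustration rather than part of the lab: `check_log_groups` is a hypothetical helper, and it assumes the three group suffixes (application, dataplane, host) used in the FluentBit configuration from step 4.

```shell
# Print each expected log group that does not exist yet.
# Returns 0 when all three exist, 1 otherwise. $1 = cluster name.
check_log_groups() {
  local cluster=$1 prefix existing suffix missing=0
  prefix="/aws/containerinsights/$cluster"
  existing=$(aws logs describe-log-groups \
    --log-group-name-prefix "$prefix" \
    --query 'logGroups[*].logGroupName' --output text)
  for suffix in application dataplane host; do
    # --output text separates names with tabs; split into lines and match whole lines.
    if ! printf '%s\n' "$existing" | tr '\t' '\n' | grep -qxF "$prefix/$suffix"; then
      echo "missing: $prefix/$suffix"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_log_groups mycluster` prints nothing and returns 0 once all three groups exist. It may take a short while after the policy is attached before FluentBit retries and creates them.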
  17. Create a Pod

    kubectl run nginx --image=nginx
  18. Check the logs of the Pod created above

    kubectl logs nginx
  19. Check that the logs of the Pod created above are being added to the /aws/containerinsights/mycluster/application log group

    aws logs describe-log-streams \
    --log-group-name /aws/containerinsights/mycluster/application
  20. Find the log stream that received the Pod's logs and store it in an environment variable

    {
        export LOG_STREAM=$(aws logs describe-log-streams \
        --log-group-name /aws/containerinsights/mycluster/application \
        --query 'logStreams[*].logStreamName' --output text \
        | grep -oE "[a-z0-9.-]+nginx[a-z0-9._-]+")
    
        echo $LOG_STREAM
    }
  21. Check the logs delivered to that log stream

    aws logs get-log-events \
    --log-group-name /aws/containerinsights/mycluster/application \
    --log-stream-name $LOG_STREAM --query 'events[*].message'
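
With the configuration from step 4, each log stream name is the `log_stream_prefix` (the Node name plus `-`) followed by the tag, which ends in the container log file name. Assuming the usual `<pod>_<namespace>_<container>-<id>.log` file naming under /var/log/containers, the streams for one Pod can be selected by Pod name and Namespace rather than by a character-class pattern. `streams_for_pod` is a hypothetical helper sketched for illustration.

```shell
# Read log stream names (whitespace-separated) on stdin and print the ones
# that belong to Pod $1 in Namespace $2 (default: "default").
streams_for_pod() {
  local pod=$1 ns=${2:-default}
  tr -s '[:space:]' '\n' | grep -F ".containers.${pod}_${ns}_"
}
```

For example, the output of the `aws logs describe-log-streams ... --output text` command from step 20 could be piped into `streams_for_pod nginx`. If the Pod has been recreated, more than one stream may match, so the result is printed one name per line.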
  22. Delete the created resources

    {
        kubectl delete pod nginx
        kubectl delete ns amazon-cloudwatch
        kubectl delete clusterrole fluent-bit-role
        kubectl delete clusterrolebinding fluent-bit-role-binding
        aws logs delete-log-group --log-group-name /aws/containerinsights/mycluster/application
        aws logs delete-log-group --log-group-name /aws/containerinsights/mycluster/host
        aws logs delete-log-group --log-group-name /aws/containerinsights/mycluster/dataplane
        aws iam detach-role-policy --role-name $ROLE_NAME --policy-arn arn:aws:iam::aws:policy/CloudWatchLogsFullAccess
    }

Node Exporter

  1. Node Exporter installation guide: https://prometheus.io/docs/guides/node-exporter

  2. Create the Namespace

    kubectl create ns monitoring
  3. Install Node Exporter

    cat <<EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: node-exporter
      labels:     
        app: node-exporter
      namespace: monitoring
    spec:
      selector:
        matchLabels:
          app: node-exporter
      template:
        metadata:
          labels:         
            app: node-exporter
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/path: "/metrics"
            prometheus.io/port: "9100"
        spec:
          hostNetwork: true
          hostPID: true
          containers:
            - name: node-exporter
              image: quay.io/prometheus/node-exporter
              args:
                - --path.procfs=/host/proc
                - --path.sysfs=/host/sys
                - --path.rootfs=/host/root
                - --web.listen-address=0.0.0.0:9100
              ports:
                - name: metrics
                  containerPort: 9100
                  protocol: TCP
              volumeMounts:
                - name: proc
                  mountPath: /host/proc
                  readOnly: true
                - name: sys
                  mountPath: /host/sys
                  readOnly: true
                - name: root
                  mountPath: /host/root
                  mountPropagation: HostToContainer
                  readOnly: true
          volumes:
            - name: proc
              hostPath:
                path: /proc
            - name: sys
              hostPath:
                path: /sys
            - name: root
              hostPath:
                path: /
    EOF
  4. Check that Node Exporter is running

    kubectl -n monitoring get pod -l app=node-exporter
  5. Check the metrics exposed by Node Exporter

    kubectl run nginx --image=nginx -it --rm --restart=Never \
    -- curl $(kubectl -n monitoring get pod -l app=node-exporter -o=jsonpath="{.items[0].status.podIP}"):9100/metrics
  6. Check the Node Exporter command-line options

    kubectl -n monitoring exec ds/node-exporter -- node_exporter -h
  7. Delete Node Exporter

    kubectl delete ds node-exporter -n monitoring
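
The `/metrics` output fetched in step 5 is long, and it is often easier to look at one metric at a time. The following sketch filters Prometheus text-format output down to the samples of a single metric. `metric_samples` is a hypothetical helper name; `node_load1` in the usage note is one of the metrics Node Exporter commonly exposes.

```shell
# Print the sample lines for metric $1 from Prometheus text-format input on stdin,
# skipping "# HELP" and "# TYPE" comment lines.
metric_samples() {
  awk -v name="$1" '
    /^#/ { next }
    {
      metric = $1
      sub(/\{.*/, "", metric)   # drop the {label="..."} part, if any
      if (metric == name) print
    }'
}
```

While the DaemonSet is still running, the command from step 5 could be piped into it, for example by appending `| metric_samples node_load1`. Matching on the exact metric name avoids also printing `node_load15`, which a plain substring search would include.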
