How to install Elasticsearch, Kibana and Fluentd in k8s

A k8s cluster provides a convenient environment for an Elasticsearch cluster fronted by a redundant Kibana deployment and fed k8s logs from a Fluentd DaemonSet. Below are working manifests: a StatefulSet for an Elasticsearch cluster securely running on three worker nodes, a Deployment with 2 replicas for the Kibana dashboard, and a DaemonSet for Fluentd collecting selected k8s logs from the worker nodes.

Step-by-step guide

1. First review the Elasticsearch StatefulSet manifest and update as needed

 apiVersion: v1
 kind: Service
 metadata:
   name: elastic
 spec:
   type: "ClusterIP"
   clusterIP: None
   selector:
     app: elastic
   ports:
   - port: 9200
   sessionAffinity: "ClientIP"
   sessionAffinityConfig:
     clientIP:
       timeoutSeconds: 3600
 ---
 apiVersion: apps/v1
 kind: StatefulSet
 metadata:
   name: elastic
   labels:
     app: elastic
 spec:
   podManagementPolicy: OrderedReady
   replicas: 3
   revisionHistoryLimit: 10
   selector:
     matchLabels:
       app: elastic
   serviceName: elastic
   template:
     metadata:
       labels:
         app: elastic
       name: elastic
     spec:
       affinity:
         podAntiAffinity:
           preferredDuringSchedulingIgnoredDuringExecution:
           - podAffinityTerm:
               labelSelector:
                 matchExpressions:
                 - key: app
                   operator: In
                   values:
                   - elastic
               topologyKey: kubernetes.io/hostname
             weight: 50
       containers:
       - image: docker.elastic.co/elasticsearch/elasticsearch:8.4.1
         imagePullPolicy: IfNotPresent
         name: elastic
         ports:
         - containerPort: 9300
           name: elastic-cluster
           protocol: TCP
         - containerPort: 9200
           name: elastic-http
           protocol: TCP
         resources: {}
         terminationMessagePath: /dev/termination-log
         terminationMessagePolicy: File
         volumeMounts:
         - mountPath: /usr/share
           name: elastic
         - mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
           name: elasticsearch-yml
           subPath: elasticsearch.yml
         - mountPath: /usr/share/elasticsearch/config/log4j2.properties
           name: log4j2-properties
           subPath: log4j2.properties
         - mountPath: /usr/share/elasticsearch/config/certs
           name: elastic-cert
         - mountPath: /usr/share/elasticsearch/config/jvm.options.d
           name: elastic-jvm-options
         - mountPath: /usr/share/elasticsearch/config/ca.pem
           name: trusted-ca
           subPath: internal_ca_chain.pem
         - mountPath: /usr/share/elasticsearch/jdk/conf/security/java.security
           name: java-security
           subPath: java.security
       dnsPolicy: ClusterFirst
       initContainers:
       - args:
         - sh
         - -c
         - if [ ! -d /mnt/elasticsearch ]; then cp -p -d -r /usr/share/elasticsearch /mnt/; touch /mnt/elasticsearch/config/ca.pem; mkdir /mnt/elasticsearch/config/certs; fi
         image: docker.elastic.co/elasticsearch/elasticsearch:8.4.1
         imagePullPolicy: IfNotPresent
         name: elastic-cp
         resources: {}
         terminationMessagePath: /dev/termination-log
         terminationMessagePolicy: File
         volumeMounts:
         - mountPath: /mnt
           name: elastic
       restartPolicy: Always
       schedulerName: default-scheduler
       securityContext: {}
       setHostnameAsFQDN: true
       subdomain: elastic
       terminationGracePeriodSeconds: 30
       volumes:
       - configMap:
           defaultMode: 420
           name: elasticsearch-yml
         name: elasticsearch-yml
       - name: elastic-cert
         secret:
           defaultMode: 420
           secretName: elastic-cert
       - configMap:
           defaultMode: 420
           name: internal-ca-chain
         name: trusted-ca
       - configMap:
           defaultMode: 420
           name: elastic-log4j2-properties
         name: log4j2-properties
       - configMap:
           defaultMode: 420
           name: elastic-java-security
         name: java-security
       - configMap:
           defaultMode: 420
           name: elastic-jvm-options
         name: elastic-jvm-options
   updateStrategy:
     rollingUpdate:
       partition: 0
     type: RollingUpdate
   volumeClaimTemplates:
   - apiVersion: v1
     kind: PersistentVolumeClaim
     metadata:
       creationTimestamp: null
       name: elastic
     spec:
       accessModes:
       - ReadWriteOnce
     resources:
       requests:
         storage: 10Gi
       storageClassName: managed-nfs-storage
       volumeMode: Filesystem

Note the various volume mounts of configuration files and the initContainer, which copies the Elasticsearch files to an NFS volume on first start for persistence. Also note the headless service (clusterIP: None) required by the StatefulSet.
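Once the first pod is up, you can spot-check that the initContainer populated the persistent volume (pod name and paths as in the manifest above):

```shell
# list the config files copied to the NFS-backed volume by the elastic-cp initContainer
kubectl exec elastic-0 -- ls /usr/share/elasticsearch/config
```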

2. Create the necessary ConfigMaps from the edited configuration files: elasticsearch.yml, java.security, jvm.options, log4j2.properties and the trusted CA certs. Create a TLS secret for the Elasticsearch SSL certs. The problem with hostname verification is that Elasticsearch nodes communicate using IPs and do not rely on reverse DNS lookups, so the certs would have to carry matching IP addresses, which is impractical in a dynamic k8s environment. Therefore set verification_mode: certificate in elasticsearch.yml, which only validates that the cert is issued by a trusted CA.
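For reference, the ConfigMaps and the TLS secret referenced by the StatefulSet volumes can be created along these lines (the local file names are assumptions; adjust to your layout):

```shell
# ConfigMaps referenced by the StatefulSet volumes
kubectl create configmap elasticsearch-yml --from-file=elasticsearch.yml
kubectl create configmap elastic-log4j2-properties --from-file=log4j2.properties
kubectl create configmap elastic-java-security --from-file=java.security
kubectl create configmap elastic-jvm-options --from-file=jvm.options
kubectl create configmap internal-ca-chain --from-file=internal_ca_chain.pem

# TLS secret holding the Elasticsearch cert and key (keys tls.crt/tls.key match the mounts)
kubectl create secret tls elastic-cert --cert=tls.crt --key=tls.key
```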

3. elasticsearch.yml

 cluster.name: "es-cluster"
 network.host: "_eth0:ipv4_"
  
 node.store.allow_mmap: false
  
 discovery.type: multi-node
 discovery.seed_hosts: elastic-0.elastic.default.svc.cluster.local,elastic-1.elastic.default.svc.cluster.local,elastic-2.elastic.default.svc.cluster.local
  
 #uncomment for the first start, then comment out
 #cluster.initial_master_nodes: elastic-0.elastic.default.svc.cluster.local,elastic-1.elastic.default.svc.cluster.local,elastic-2.elastic.default.svc.cluster.local
  
 xpack.license.self_generated.type: basic
 xpack.security.autoconfiguration.enabled: false
 xpack.security.enabled: true
 xpack.security.enrollment.enabled: true
  
 xpack.security.http.ssl:
   enabled: true
   key: certs/tls.key
   certificate: certs/tls.crt
   certificate_authorities: ca.pem
   verification_mode: certificate
  
 xpack.security.transport.ssl:
   enabled: true
   key: certs/tls.key
   certificate: certs/tls.crt
   certificate_authorities: ca.pem
   verification_mode: certificate

4. jvm.options

 -Xms4g
 -Xmx4g
 -XX:MaxDirectMemorySize=4g

5. Deploy Elasticsearch

 kubectl apply -f elastic.yaml

6. Reset the passwords for the “elastic” and “kibana_system” users in all pods (-b auto-generates a password on the first pod; -i lets you set the same password interactively on the others)

 kubectl exec -ti elastic-0 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -b -u elastic
 kubectl exec -ti elastic-1 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u elastic
 kubectl exec -ti elastic-2 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u elastic
 kubectl exec -ti elastic-0 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -b -u kibana_system
 kubectl exec -ti elastic-1 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u kibana_system
 kubectl exec -ti elastic-2 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u kibana_system
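With the passwords set, the cluster state can be checked from inside a pod; -k skips hostname verification, mirroring verification_mode: certificate. Replace **** with the elastic password:

```shell
# cluster health; with all three nodes up you should see status green and number_of_nodes 3
kubectl exec -ti elastic-0 -- curl -sk -u 'elastic:****' 'https://localhost:9200/_cluster/health?pretty'
```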

7. Create a secret elastic-user in the kube-system namespace for use by Fluentd

 kubectl -n kube-system create secret generic elastic-user --from-literal username=elastic --from-literal password=****

8. Review the Deployment manifest and the kibana.yml config file for the Kibana dashboard

 apiVersion: networking.k8s.io/v1
 kind: Ingress
 metadata:
   annotations:
     ingress.kubernetes.io/ssl-passthrough: "true"
   name: kibana
 spec:
   rules:
   - host: kibana.example.com
     http:
       paths:
       - backend:
           service:
             name: kibana
             port:
               number: 443
         pathType: ImplementationSpecific
 ---
 apiVersion: v1
 kind: Service
 metadata:
   name: kibana
 spec:
   internalTrafficPolicy: Cluster
   ipFamilies:
   - IPv4
   ipFamilyPolicy: SingleStack
   ports:
   - port: 443
     protocol: TCP
     targetPort: 5601
   selector:
     app: kibana
   sessionAffinity: ClientIP
   sessionAffinityConfig:
     clientIP:
       timeoutSeconds: 3600
   type: ClusterIP
 ---
 apiVersion: apps/v1
 kind: Deployment
 metadata:
   labels:
     app: kibana
   name: kibana
   namespace: default
 spec:
   progressDeadlineSeconds: 600
   replicas: 2
   revisionHistoryLimit: 10
   selector:
     matchLabels:
       app: kibana
   strategy:
     rollingUpdate:
       maxSurge: 25%
       maxUnavailable: 25%
     type: RollingUpdate
   template:
     metadata:
       labels:
         app: kibana
     spec:
       affinity:
         podAntiAffinity:
           preferredDuringSchedulingIgnoredDuringExecution:
           - podAffinityTerm:
               labelSelector:
                 matchExpressions:
                 - key: app
                   operator: In
                   values:
                   - kibana
               topologyKey: "kubernetes.io/hostname"
             weight: 50
       containers:
       - image: docker.elastic.co/kibana/kibana:8.4.1
         imagePullPolicy: IfNotPresent
         name: kibana
         ports:
         - containerPort: 5601
           protocol: TCP
         resources: {}
         terminationMessagePath: /dev/termination-log
         terminationMessagePolicy: File
         volumeMounts:
         - mountPath: /usr/share/kibana/config/kibana.yml
           name: kibana-yml
           subPath: kibana.yml
         - mountPath: /etc/ssl/certs
           name: kibana-cert
         - mountPath: /usr/share/kibana/config/ca.pem
           name: trusted-ca
           subPath: internal_ca_chain.pem
       dnsPolicy: ClusterFirst
       restartPolicy: Always
       schedulerName: default-scheduler
       securityContext: {}
       terminationGracePeriodSeconds: 30
       volumes:
       - name: kibana-yml
         secret:
           defaultMode: 420
           secretName: kibana-yml
       - name: kibana-cert
         secret:
           defaultMode: 420
           secretName: kibana-cert
       - configMap:
           defaultMode: 420
           name: internal-ca-chain
         name: trusted-ca

9. Create a TLS secret kibana-cert for the Kibana SSL cert
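A sketch of creating kibana-cert; the self-signed cert below is only a stand-in for testing, assuming your real cert is issued by the internal CA:

```shell
# throwaway self-signed cert for kibana.example.com (testing only --
# in production use a cert issued by your internal CA)
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout tls.key -out tls.crt -subj "/CN=kibana.example.com"

# then create the secret referenced by the kibana-cert volume:
# kubectl create secret tls kibana-cert --cert=tls.crt --key=tls.key
```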

10. Create the kibana-yml secret (a Secret rather than a ConfigMap, since kibana.yml contains the kibana_system user password)

 server.host: "0.0.0.0"
 server.shutdownTimeout: "5s"
 server.uuid: "bfdfcf1e-3ced-4e96-ab52-4dc218979aa3"
 server.name: "kibana.default.svc.cluster.local"
 server.publicBaseUrl: "https://kibana.example.com"
 server.ssl.enabled: true
 server.ssl.certificate: "/etc/ssl/certs/tls.crt"
 server.ssl.key: "/etc/ssl/certs/tls.key"
 server.ssl.certificateAuthorities: "/usr/share/kibana/config/ca.pem"
 elasticsearch.ssl.certificateAuthorities: "/usr/share/kibana/config/ca.pem"
 elasticsearch.hosts: [ "https://elastic-0.elastic.default.svc.cluster.local:9200", "https://elastic-1.elastic.default.svc.cluster.local:9200", "https://elastic-2.elastic.default.svc.cluster.local:9200" ]
 xpack.reporting.kibanaServer.hostname: "kibana.default.svc.cluster.local"
  
 elasticsearch.username: "kibana_system"
 elasticsearch.password: "*****"
  
 logging.root.level: "warn"
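With kibana.yml edited, the secret can be created directly from the file (the name matches the kibana-yml volume in the Deployment):

```shell
# the file name becomes the key, matching the subPath: kibana.yml mount
kubectl create secret generic kibana-yml --from-file=kibana.yml
```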

11. Deploy the Kibana dashboard

 kubectl apply -f kibana.yaml

12. Create a DNS record for kibana.example.com and log in to Kibana at https://kibana.example.com using the elastic username and password (see above)

13. Create an index lifecycle policy “kubernetes-logging” and an index template “kubernetes-logging” using the index pattern “kubernetes-logs*”. Enable Data Stream and assign the lifecycle policy to the index template under Settings:

 "index": {
     "lifecycle": {
       "name": "kubernetes-logging"
     }
   }
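As a sketch, the lifecycle policy itself might look like the following (the rollover and retention values are assumptions; paste into Kibana Dev Tools or adapt as needed):

```
PUT _ilm/policy/kubernetes-logging
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "7d", "max_primary_shard_size": "10gb" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```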

14. Assign an alias (likely optional with a Data Stream) to the index template “kubernetes-logging”

 "my-logs": {}

15. If needed, create an Ingest Pipeline and assign it to the index template as well under Settings (index.default_pipeline)
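A minimal ingest pipeline example (hypothetical; it merely stamps each document with the pipeline name to show the mechanism):

```
PUT _ingest/pipeline/kubernetes-logging
{
  "description": "Example pipeline for kubernetes logs",
  "processors": [
    { "set": { "field": "pipeline", "value": "kubernetes-logging" } }
  ]
}
```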

16. Create a viewer user in Kibana.

17. Review the fluentd.yaml manifest and its config file fluent.conf

 apiVersion: v1
 kind: ServiceAccount
 metadata:
   name: fluentd
   namespace: kube-system
 ---
 apiVersion: rbac.authorization.k8s.io/v1
 kind: ClusterRole
 metadata:
   name: fluentd
 rules:
 - apiGroups:
   - ""
   resources:
   - pods
   - namespaces
   verbs:
   - get
   - list
   - watch
 ---
 kind: ClusterRoleBinding
 apiVersion: rbac.authorization.k8s.io/v1
 metadata:
   name: fluentd
 roleRef:
   kind: ClusterRole
   name: fluentd
   apiGroup: rbac.authorization.k8s.io
 subjects:
 - kind: ServiceAccount
   name: fluentd
   namespace: kube-system
 ---
 apiVersion: apps/v1
 kind: DaemonSet
 metadata:
   name: fluentd
   namespace: kube-system
   labels:
     k8s-app: fluentd-logging
     version: v1
 spec:
   selector:
     matchLabels:
       k8s-app: fluentd-logging
       version: v1
   template:
     metadata:
       labels:
         k8s-app: fluentd-logging
         version: v1
     spec:
       serviceAccount: fluentd
       serviceAccountName: fluentd
 #      tolerations:
 #      - key: node-role.kubernetes.io/control-plane
 #        effect: NoSchedule
 #      - key: node-role.kubernetes.io/master
 #        effect: NoSchedule
       containers:
       - name: fluentd
         image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
         imagePullPolicy: IfNotPresent
         env:
           - name:  FLUENT_ELASTICSEARCH_HOST
             value: "elastic.default.svc.cluster.local"
           - name:  FLUENT_ELASTICSEARCH_PORT
             value: "9200"
           - name: FLUENT_ELASTICSEARCH_SCHEME
             value: "https"
           - name: FLUENT_ELASTICSEARCH_SSL_VERIFY
             value: "false"
           - name: FLUENT_ELASTICSEARCH_SSL_VERSION
             value: "TLSv1_2"
           - name: FLUENT_KUBERNETES_METADATA_SKIP_LABELS
             value: "true"
           - name: FLUENT_KUBERNETES_METADATA_SKIP_CONTAINER_METADATA
             value: "true"
           - name: FLUENT_KUBERNETES_METADATA_SKIP_MASTER_URL
             value: "true"
           - name: FLUENT_KUBERNETES_METADATA_SKIP_NAMESPACE_METADATA
             value: "true"
           - name: FLUENT_CONTAINER_TAIL_PATH
             value: "/var/log/containers/some-*.log,/var/log/containers/some-other-*.log"
           - name: FLUENT_ELASTICSEARCH_USER
             valueFrom:
               secretKeyRef:
                 name: "elastic-user"
                 key: "username"
           - name: FLUENT_ELASTICSEARCH_PASSWORD
             valueFrom:
               secretKeyRef:
                 name: "elastic-user"
                 key: "password"
           - name: K8S_NODE_NAME
             valueFrom:
               fieldRef:
                 apiVersion: v1
                 fieldPath: spec.nodeName
         resources:
           limits:
             memory: 200Mi
           requests:
             cpu: 100m
             memory: 200Mi
         volumeMounts:
         - name: varlog
           mountPath: /var/log
         - name: dockercontainerlogdirectory
           mountPath: /var/log/pods
           readOnly: true
         - name: fluent-conf
           mountPath: /fluentd/etc/fluent.conf
           subPath: fluent.conf
       terminationGracePeriodSeconds: 30
       volumes:
       - name: varlog
         hostPath:
           path: /var/log
       - name: dockercontainerlogdirectory
         hostPath:
           path: /var/log/pods
       - name: fluent-conf
         configMap:
           name: fluent-conf

18. fluent.conf

 @include "#{ENV['FLUENTD_SYSTEMD_CONF'] || 'systemd'}.conf"
 @include "#{ENV['FLUENTD_PROMETHEUS_CONF'] || 'prometheus'}.conf"
 @include kubernetes.conf
 @include conf.d/*.conf
 <label @FLUENT_LOG>
   <match fluent.**>
     @type null
     @id ignore_fluent_logs
   </match>
 </label>
  
 <source>
   @type tail
   @id in_tail_container_logs
   path "#{ENV['FLUENT_CONTAINER_TAIL_PATH'] || '/var/log/containers/*.log'}"
   pos_file /var/log/fluentd-containers.log.pos
   tag "#{ENV['FLUENT_CONTAINER_TAIL_TAG'] || 'kubernetes.*'}"
   exclude_path "#{ENV['FLUENT_CONTAINER_TAIL_EXCLUDE_PATH'] || use_default}"
   read_from_head true
   follow_inodes true
   enable_watch_timer false
   <parse>
     @type cri
     merge_cri_fields true
   </parse>
 </source>
  
 <filter kubernetes.**>
   @type kubernetes_metadata
   @id filter_kube_metadata
   kubernetes_url "#{ENV['FLUENT_FILTER_KUBERNETES_URL'] || 'https://' + ENV.fetch('KUBERNETES_SERVICE_HOST') + ':' + ENV.fetch('KUBERNETES_SERVICE_PORT') + '/api'}"
   verify_ssl "#{ENV['KUBERNETES_VERIFY_SSL'] || 'true'}"
   ca_file "#{ENV['KUBERNETES_CA_FILE']}"
   skip_labels "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_LABELS'] || 'false'}"
   skip_container_metadata "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_CONTAINER_METADATA'] || 'false'}"
   skip_master_url "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_MASTER_URL'] || 'false'}"
   skip_namespace_metadata "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_NAMESPACE_METADATA'] || 'false'}"
   watch "#{ENV['FLUENT_KUBERNETES_WATCH'] || 'true'}"
   stats_interval "#{'0'}"
 </filter>
  
 <match kubernetes.**>
    @type elasticsearch_data_stream
    data_stream_name "kubernetes-logs"
    data_stream_template_name "kubernetes-logging"
    @id out_es
    @log_level warn
    include_tag_key true
    host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
    port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
    path "#{ENV['FLUENT_ELASTICSEARCH_PATH']}"
    scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
    ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
    ssl_version "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERSION'] || 'TLSv1_2'}"
    user "#{ENV['FLUENT_ELASTICSEARCH_USER'] || use_default}"
    password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD'] || use_default}"
    reload_connections "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS'] || 'false'}"
    reconnect_on_error "#{ENV['FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR'] || 'true'}"
    reload_on_failure "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE'] || 'true'}"
    log_es_400_reason "#{ENV['FLUENT_ELASTICSEARCH_LOG_ES_400_REASON'] || 'true'}"
    type_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_TYPE_NAME'] || 'fluentd'}"
    include_timestamp "#{ENV['FLUENT_ELASTICSEARCH_INCLUDE_TIMESTAMP'] || 'true'}"
    sniffer_class_name "#{ENV['FLUENT_SNIFFER_CLASS_NAME'] || 'Fluent::Plugin::ElasticsearchSimpleSniffer'}"
    request_timeout "#{ENV['FLUENT_ELASTICSEARCH_REQUEST_TIMEOUT'] || '5s'}"
    application_name "#{ENV['FLUENT_ELASTICSEARCH_APPLICATION_NAME'] || use_default}"
    suppress_type_name "#{ENV['FLUENT_ELASTICSEARCH_SUPPRESS_TYPE_NAME'] || 'true'}"
    <buffer>
      flush_thread_count "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_THREAD_COUNT'] || '8'}"
      flush_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_INTERVAL'] || '5s'}"
      chunk_limit_size "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_CHUNK_LIMIT_SIZE'] || '2M'}"
      queue_limit_length "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_QUEUE_LIMIT_LENGTH'] || '32'}"
      retry_max_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_RETRY_MAX_INTERVAL'] || '30'}"
      retry_forever true
    </buffer>
 </match>

19. Deploy the Fluentd DaemonSet on the worker nodes (uncomment the tolerations to also run it on the control plane)

 kubectl apply -f fluentd.yaml
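After the rollout, it is worth checking that the collector pods are up and shipping logs (sketch):

```shell
# wait for the DaemonSet rollout, then tail the collector logs for errors
kubectl -n kube-system rollout status daemonset/fluentd
kubectl -n kube-system logs daemonset/fluentd --tail=20
```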

20. In Kibana, create a Data View matching the “my-logs” alias or the “kubernetes-logs” data stream.

21. Log out and log back in as the viewer user. Verify that the k8s logs appear in the Discover view.