A Kubernetes cluster provides a convenient environment for an Elasticsearch cluster fronted by a redundant Kibana deployment and fed Kubernetes logs from a Fluentd DaemonSet. Below are working manifests: a StatefulSet for an Elasticsearch cluster running securely on three worker nodes, a Deployment with 2 replicas for the Kibana dashboard, and a DaemonSet for Fluentd collecting selected Kubernetes logs from the worker nodes.
1. First, review the Elasticsearch StatefulSet manifest below and update it as needed
apiVersion: v1
kind: Service
metadata:
  name: elastic
spec:
  type: "ClusterIP"
  clusterIP: None
  selector:
    app: elastic
  ports:
  - port: 9200
  sessionAffinity: "ClientIP"
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elastic
  labels:
    app: elastic
spec:
  podManagementPolicy: OrderedReady
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: elastic
  serviceName: elastic
  template:
    metadata:
      labels:
        app: elastic
      name: elastic
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - elastic
              topologyKey: kubernetes.io/hostname
            weight: 50
      containers:
      - image: docker.elastic.co/elasticsearch/elasticsearch:8.4.1
        imagePullPolicy: IfNotPresent
        name: elastic
        ports:
        - containerPort: 9300
          name: elastic-cluster
          protocol: TCP
        - containerPort: 9200
          name: elastic-http
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/share
          name: elastic
        - mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          name: elasticsearch-yml
          subPath: elasticsearch.yml
        - mountPath: /usr/share/elasticsearch/config/log4j2.properties
          name: log4j2-properties
          subPath: log4j2.properties
        - mountPath: /usr/share/elasticsearch/config/certs
          name: elastic-cert
        - mountPath: /usr/share/elasticsearch/config/jvm.options.d
          name: elastic-jvm-options
        - mountPath: /usr/share/elasticsearch/config/ca.pem
          name: trusted-ca
          subPath: internal_ca_chain.pem
        - mountPath: /usr/share/elasticsearch/jdk/conf/security/java.security
          name: java-security
          subPath: java.security
      dnsPolicy: ClusterFirst
      initContainers:
      - args:
        - sh
        - -c
        - if [ ! -d /mnt/elasticsearch ]; then cp -p -d -r /usr/share/elasticsearch /mnt/; touch /mnt/elasticsearch/config/ca.pem; mkdir /mnt/elasticsearch/config/certs; fi
        image: docker.elastic.co/elasticsearch/elasticsearch:8.4.1
        imagePullPolicy: IfNotPresent
        name: elastic-cp
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /mnt
          name: elastic
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      setHostnameAsFQDN: true
      subdomain: elastic
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: elasticsearch-yml
        name: elasticsearch-yml
      - name: elastic-cert
        secret:
          defaultMode: 420
          secretName: elastic-cert
      - configMap:
          defaultMode: 420
          name: internal-ca-chain
        name: trusted-ca
      - configMap:
          defaultMode: 420
          name: elastic-log4j2-properties
        name: log4j2-properties
      - configMap:
          defaultMode: 420
          name: elastic-java-security
        name: java-security
      - configMap:
          defaultMode: 420
          name: elastic-jvm-options
        name: elastic-jvm-options
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: elastic
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        limits:
          storage: 20Gi
        requests:
          storage: 10Gi
      storageClassName: managed-nfs-storage
      volumeMode: Filesystem
Note the various volume mounts of configuration files and the initContainer, which copies the Elasticsearch files to an NFS volume for persistence. Also note the headless Service (clusterIP: None) required by the StatefulSet.
2. Create the necessary ConfigMaps populated from the edited configuration files: elasticsearch.yml, java.security, jvm.options, log4j2.properties, and the trusted CA certs. Also create a TLS secret for the Elasticsearch SSL certs (example commands below). The problem with cert validation is that Elasticsearch communicates using IPs and does not rely on reverse DNS lookups, so it requires the certs to carry matching IP addresses, which is problematic in a dynamic Kubernetes environment. Therefore it is necessary to set verification_mode: certificate in the elasticsearch.yml file, which only validates that the cert is issued by a trusted CA.
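For example (a sketch; the local file names are assumptions, but the ConfigMap and Secret names must match those referenced in the StatefulSet volumes; kubectl stores the TLS pair under the keys tls.crt and tls.key, which is what elasticsearch.yml expects under certs/):

# ConfigMaps backing the volume mounts in the StatefulSet
kubectl create configmap elasticsearch-yml --from-file=elasticsearch.yml
kubectl create configmap elastic-log4j2-properties --from-file=log4j2.properties
kubectl create configmap elastic-java-security --from-file=java.security
kubectl create configmap elastic-jvm-options --from-file=jvm.options
kubectl create configmap internal-ca-chain --from-file=internal_ca_chain.pem
# TLS secret for the node certificate (local file names are assumptions)
kubectl create secret tls elastic-cert --cert=elastic-tls.crt --key=elastic-tls.key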
3. elasticsearch.yml
cluster.name: "es-cluster"
network.host: "_eth0:ipv4_"
node.store.allow_mmap: false
discovery.type: multi-node
discovery.seed_hosts: elastic-0.elastic.default.svc.cluster.local,elastic-1.elastic.default.svc.cluster.local,elastic-2.elastic.default.svc.cluster.local
# uncomment for the first start, then comment out again
#cluster.initial_master_nodes: elastic-0.elastic.default.svc.cluster.local,elastic-1.elastic.default.svc.cluster.local,elastic-2.elastic.default.svc.cluster.local
xpack.license.self_generated.type: basic
xpack.security.autoconfiguration.enabled: false
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.http.ssl:
  enabled: true
  key: certs/tls.key
  certificate: certs/tls.crt
  certificate_authorities: ca.pem
  verification_mode: certificate
xpack.security.transport.ssl:
  enabled: true
  key: certs/tls.key
  certificate: certs/tls.crt
  certificate_authorities: ca.pem
  verification_mode: certificate
4. jvm.options
-Xms4g
-Xmx4g
-XX:MaxDirectMemorySize=4g
5. Deploy Elasticsearch
kubectl apply -f elastic.yaml
6. Reset the passwords for the “elastic” and “kibana_system” users in all pods
kubectl exec -ti elastic-0 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -b -u elastic
kubectl exec -ti elastic-1 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u elastic
kubectl exec -ti elastic-2 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u elastic
kubectl exec -ti elastic-0 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -b -u kibana_system
kubectl exec -ti elastic-1 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u kibana_system
kubectl exec -ti elastic-2 -- /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u kibana_system
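With the passwords set, cluster health can be checked from inside any of the pods; a quick sketch (curl prompts for the elastic password; use -k instead of --cacert if your certificate does not carry the pod FQDNs as SANs):

# Expect "status" : "green" and "number_of_nodes" : 3
kubectl exec -ti elastic-0 -- curl -s --cacert /usr/share/elasticsearch/config/ca.pem -u elastic "https://elastic-0.elastic.default.svc.cluster.local:9200/_cluster/health?pretty"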
7. Create a secret named elastic-user in the kube-system namespace for use by Fluentd
kubectl -n kube-system create secret generic elastic-user --from-literal=username=elastic --from-literal=password=****
8. Review the Deployment manifest and the kibana.yml config file for the Kibana dashboard
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    ingress.kubernetes.io/ssl-passthrough: "true"
  name: kibana
spec:
  rules:
  - host: kibana.example.com
    http:
      paths:
      - backend:
          service:
            name: kibana
            port:
              number: 443
        pathType: ImplementationSpecific
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
spec:
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - port: 443
    protocol: TCP
    targetPort: 5601
  selector:
    app: kibana
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: kibana
  name: kibana
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: kibana
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: kibana
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - kibana
              topologyKey: "kubernetes.io/hostname"
            weight: 50
      containers:
      - image: docker.elastic.co/kibana/kibana:8.4.1
        imagePullPolicy: IfNotPresent
        name: kibana
        ports:
        - containerPort: 5601
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/share/kibana/config/kibana.yml
          name: kibana-yml
          subPath: kibana.yml
        - mountPath: /etc/ssl/certs
          name: kibana-cert
        - mountPath: /usr/share/kibana/config/ca.pem
          name: trusted-ca
          subPath: internal_ca_chain.pem
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: kibana-yml
        secret:
          defaultMode: 420
          secretName: kibana-yml
      - name: kibana-cert
        secret:
          defaultMode: 420
          secretName: kibana-cert
      - configMap:
          defaultMode: 420
          name: internal-ca-chain
        name: trusted-ca
9. Create a TLS secret, kibana-cert, for the Kibana SSL cert (see the commands after the file below)
10. Create kibana-yml as a Secret rather than a ConfigMap, since the kibana.yml file contains the kibana_system user's password:
server.host: "0.0.0.0"
server.shutdownTimeout: "5s"
server.uuid: "bfdfcf1e-3ced-4e96-ab52-4dc218979aa3"
server.name: "kibana.default.svc.cluster.local"
server.publicBaseUrl: "https://kibana.example.com"
server.ssl.enabled: true
server.ssl.certificate: "/etc/ssl/certs/tls.crt"
server.ssl.key: "/etc/ssl/certs/tls.key"
server.ssl.certificateAuthorities: "/usr/share/kibana/config/ca.pem"
elasticsearch.ssl.certificateAuthorities: "/usr/share/kibana/config/ca.pem"
elasticsearch.hosts: [ "https://elastic-0.elastic.default.svc.cluster.local:9200", "https://elastic-1.elastic.default.svc.cluster.local:9200", "https://elastic-2.elastic.default.svc.cluster.local:9200" ]
xpack.reporting.kibanaServer.hostname: "kibana.default.svc.cluster.local"
elasticsearch.username: "kibana_system"
elasticsearch.password: "*****"
logging.root.level: "warn"
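Both secrets can then be created along these lines (the local cert and key file names are assumptions; the TLS pair is stored as tls.crt and tls.key, matching the paths under /etc/ssl/certs referenced in kibana.yml):

kubectl create secret tls kibana-cert --cert=kibana-tls.crt --key=kibana-tls.key
kubectl create secret generic kibana-yml --from-file=kibana.yml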
11. Deploy Kibana dashboard
kubectl apply -f kibana.yaml
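A quick sanity check that both replicas came up and connected to Elasticsearch:

kubectl get pods -l app=kibana
kubectl logs deploy/kibana | tail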
12. Create a DNS record for kibana.example.com and log in to Kibana at https://kibana.example.com using the elastic username and password (see step 6)
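If you are unsure what address to point the record at, the Ingress status usually shows it (this depends on your ingress controller):

kubectl get ingress kibana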
13. Create an index lifecycle policy “kubernetes-logging” and an index template “kubernetes-logging” using the index pattern “kubernetes-logs*”. Enable Data Stream. Assign the lifecycle policy to this index template under Settings:
"index": { "lifecycle": { "name": "kubernetes-logging" } }
14. Assign an alias to the index template “kubernetes-logging” (this appears to be optional when using a Data Stream):
"my-logs": {}
15. If needed, create an Ingest Pipeline and assign it to the index template as well, under the index.default_pipeline setting.
16. Create a viewer user in Kibana.
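This can be done in Stack Management or from Dev Tools; a sketch using the built-in viewer role (the username and password here are placeholders):

POST _security/user/logviewer
{
  "password": "********",
  "roles": [ "viewer" ]
}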
17. Review the fluentd.yaml manifest and its config file fluent.conf
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
    version: v1
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-logging
      version: v1
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      # tolerations:
      # - key: node-role.kubernetes.io/control-plane
      #   effect: NoSchedule
      # - key: node-role.kubernetes.io/master
      #   effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
        imagePullPolicy: IfNotPresent
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elastic.default.svc.cluster.local"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        - name: FLUENT_ELASTICSEARCH_SCHEME
          value: "https"
        - name: FLUENT_ELASTICSEARCH_SSL_VERIFY
          value: "false"
        - name: FLUENT_ELASTICSEARCH_SSL_VERSION
          value: "TLSv1_2"
        - name: FLUENT_KUBERNETES_METADATA_SKIP_LABELS
          value: "true"
        - name: FLUENT_KUBERNETES_METADATA_SKIP_CONTAINER_METADATA
          value: "true"
        - name: FLUENT_KUBERNETES_METADATA_SKIP_MASTER_URL
          value: "true"
        - name: FLUENT_KUBERNETES_METADATA_SKIP_NAMESPACE_METADATA
          value: "true"
        - name: FLUENT_CONTAINER_TAIL_PATH
          value: "/var/log/containers/some-*.log,/var/log/containers/some-other-*.log"
        - name: FLUENT_ELASTICSEARCH_USER
          valueFrom:
            secretKeyRef:
              name: "elastic-user"
              key: "username"
        - name: FLUENT_ELASTICSEARCH_PASSWORD
          valueFrom:
            secretKeyRef:
              name: "elastic-user"
              key: "password"
        - name: K8S_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: dockercontainerlogdirectory
          mountPath: /var/log/pods
          readOnly: true
        - name: fluent-conf
          mountPath: /fluentd/etc/fluent.conf
          subPath: fluent.conf
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: dockercontainerlogdirectory
        hostPath:
          path: /var/log/pods
      - name: fluent-conf
        configMap:
          name: fluent-conf
18. fluent.conf
@include "#{ENV['FLUENTD_SYSTEMD_CONF'] || 'systemd'}.conf"
@include "#{ENV['FLUENTD_PROMETHEUS_CONF'] || 'prometheus'}.conf"
<label @FLUENT_LOG>
  <match fluent.**>
    @type null
    @id ignore_fluent_logs
  </match>
</label>

<source>
  @type tail
  @id in_tail_container_logs
  path "#{ENV['FLUENT_CONTAINER_TAIL_PATH'] || '/var/log/containers/*.log'}"
  pos_file /var/log/fluentd-containers.log.pos
  tag "#{ENV['FLUENT_CONTAINER_TAIL_TAG'] || 'kubernetes.*'}"
  exclude_path "#{ENV['FLUENT_CONTAINER_TAIL_EXCLUDE_PATH'] || use_default}"
  read_from_head true
  follow_inodes true
  enable_watch_timer false
  <parse>
    @type cri
    merge_cri_fields true
  </parse>
</source>

<filter kubernetes.**>
  @type kubernetes_metadata
  @id filter_kube_metadata
  kubernetes_url "#{ENV['FLUENT_FILTER_KUBERNETES_URL'] || 'https://' + ENV.fetch('KUBERNETES_SERVICE_HOST') + ':' + ENV.fetch('KUBERNETES_SERVICE_PORT') + '/api'}"
  verify_ssl "#{ENV['KUBERNETES_VERIFY_SSL'] || 'true'}"
  ca_file "#{ENV['KUBERNETES_CA_FILE']}"
  skip_labels "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_LABELS'] || 'false'}"
  skip_container_metadata "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_CONTAINER_METADATA'] || 'false'}"
  skip_master_url "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_MASTER_URL'] || 'false'}"
  skip_namespace_metadata "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_NAMESPACE_METADATA'] || 'false'}"
  watch "#{ENV['FLUENT_KUBERNETES_WATCH'] || 'true'}"
  stats_interval "#{'0'}"
</filter>

<match kubernetes.**>
  @type elasticsearch_data_stream
  data_stream_name "kubernetes-logs"
  data_stream_template_name "kubernetes-logging"
  @id out_es
  @log_level warn
  include_tag_key true
  host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
  port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
  path "#{ENV['FLUENT_ELASTICSEARCH_PATH']}"
  scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
  ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
  ssl_version "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERSION'] || 'TLSv1_2'}"
  user "#{ENV['FLUENT_ELASTICSEARCH_USER'] || use_default}"
  password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD'] || use_default}"
  reload_connections "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS'] || 'false'}"
  reconnect_on_error "#{ENV['FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR'] || 'true'}"
  reload_on_failure "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE'] || 'true'}"
  log_es_400_reason "#{ENV['FLUENT_ELASTICSEARCH_LOG_ES_400_REASON'] || 'true'}"
  type_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_TYPE_NAME'] || 'fluentd'}"
  include_timestamp "#{ENV['FLUENT_ELASTICSEARCH_INCLUDE_TIMESTAMP'] || 'true'}"
  sniffer_class_name "#{ENV['FLUENT_SNIFFER_CLASS_NAME'] || 'Fluent::Plugin::ElasticsearchSimpleSniffer'}"
  request_timeout "#{ENV['FLUENT_ELASTICSEARCH_REQUEST_TIMEOUT'] || '5s'}"
  application_name "#{ENV['FLUENT_ELASTICSEARCH_APPLICATION_NAME'] || use_default}"
  suppress_type_name "#{ENV['FLUENT_ELASTICSEARCH_SUPPRESS_TYPE_NAME'] || 'true'}"
  <buffer>
    flush_thread_count "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_THREAD_COUNT'] || '8'}"
    flush_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_INTERVAL'] || '5s'}"
    chunk_limit_size "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_CHUNK_LIMIT_SIZE'] || '2M'}"
    queue_limit_length "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_QUEUE_LIMIT_LENGTH'] || '32'}"
    retry_max_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_RETRY_MAX_INTERVAL'] || '30'}"
    retry_forever true
  </buffer>
</match>
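Create the fluent-conf ConfigMap that the DaemonSet mounts from this file:

kubectl -n kube-system create configmap fluent-conf --from-file=fluent.conf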
19. Deploy the Fluentd DaemonSet on the worker nodes (uncomment the tolerations if you also want it to run on the control plane)
kubectl apply -f fluentd.yaml
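Verify that a Fluentd pod is running on every worker node and that the output plugin is not logging errors:

kubectl -n kube-system get pods -l k8s-app=fluentd-logging -o wide
kubectl -n kube-system logs daemonset/fluentd | tail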
20. In Kibana, create a Data View matching the “my-logs” alias or the “kubernetes-logs” data stream.
21. Log out and log back in as the viewer user. Verify that the Kubernetes logs are shown under Discover.