Spring Boot Fullstack Blockchain Application With Hyperledger Fabric running on Kubernetes (Part 5) — Kafka

Şuayb Şimşek
8 min read · Dec 27, 2021


Hello everyone! In this article series, we cover Hyperledger Fabric integration with Spring Boot. In this article we will look into Kafka, and I will also explain how to deploy Kafka on Kubernetes.

Other articles on Hyperledger Fabric integration with Spring Boot can be accessed from the links below.

Part 1 — Introduction

Part 2 — Kubernetes Cluster Setup

Part 3 — Fabric CA Server

Part 4 — Generating Certificates and Artifacts

Part 5 — Kafka

Part 6 — Orderer

First of all, I will briefly explain Kafka.

Kafka

Apache Kafka is an open-source framework for storing and analyzing big data in real time. It was originally developed at LinkedIn and is now an Apache Software Foundation project.

It uses a publish-subscribe messaging system (queue) to store and analyze big data quickly.

Hyperledger Fabric ordering service nodes (OSNs) use your Kafka cluster to provide an ordering service to your blockchain network.

Let’s examine the Topic, Producer and Consumer concepts in Kafka.

Topic

A topic is a user-nameable area where data (messages) are stored. Topics are divided into partitions, and the number of partitions can be configured by the user.
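To make the partition idea concrete, here is a rough Python sketch of how a keyed message can be routed to one of a topic's partitions. The helper name is made up, and Kafka's real default partitioner hashes key bytes with murmur2 rather than CRC32; the point is only that the mapping is deterministic, so all messages with the same key land in the same partition:

```python
# Illustrative sketch, not Kafka's actual partitioner.
import zlib

def assign_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition using a stable hash."""
    # zlib.crc32 is stable across runs, unlike Python's built-in hash().
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Same key -> same partition, which preserves per-key ordering.
assert assign_partition("asset-42", 4) == assign_partition("asset-42", 4)
```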

Producer

Looking at Apache Kafka's publish-subscribe structure, producers play the publisher role: they send data, i.e. messages, into topics and can be connected to more than one topic at the same time.

Consumer

Consumers, as the name suggests, are the subscribers that consume the messages producers send to topics. More than one producer can send messages to a topic, and more than one consumer can subscribe to a topic and read the data sent to it. After a consumer reads the messages sent by the producers, the data is not deleted from the topic.
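This read model can be illustrated with a toy in-memory topic. The class below is a hypothetical simplification, not Kafka's API; it only shows that consuming a message advances a per-consumer offset into an append-only log instead of deleting anything:

```python
# Toy in-memory model (names are made up, not Kafka's API).
class Topic:
    def __init__(self) -> None:
        self.log: list[str] = []           # append-only message log
        self.offsets: dict[str, int] = {}  # per-consumer read position

    def produce(self, message: str) -> None:
        self.log.append(message)

    def consume(self, consumer_id: str) -> list[str]:
        start = self.offsets.get(consumer_id, 0)
        messages = self.log[start:]
        self.offsets[consumer_id] = len(self.log)
        return messages

topic = Topic()
topic.produce("tx-1")  # producer A
topic.produce("tx-2")  # producer B

assert topic.consume("consumer-1") == ["tx-1", "tx-2"]
# consumer-2 still sees everything: nothing was deleted on read.
assert topic.consume("consumer-2") == ["tx-1", "tx-2"]
# consumer-1 only sees messages newer than its own offset.
topic.produce("tx-3")
assert topic.consume("consumer-1") == ["tx-3"]
```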

Zookeeper

Apache ZooKeeper is an open-source Apache project that provides distributed coordination services, such as configuration management, naming, and group services, for large clusters.

ZooKeeper stores data in a hierarchical key-value store and is commonly used in high-availability environments. It is written in Java and licensed under the Apache License 2.0, and is used by some big companies like Rackspace, Yahoo, eBay, and Reddit.

ZooKeeper keeps track of the status of the Kafka cluster nodes, as well as Kafka topics, partitions, and so on.
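ZooKeeper's hierarchical store can be pictured as a filesystem-like tree of "znodes". A minimal sketch, assuming a made-up ZNodeStore class (Kafka, for example, registers brokers under paths like /brokers/ids):

```python
# Minimal sketch of ZooKeeper's data model (hypothetical code, not the
# ZooKeeper API): a hierarchy of paths mapped to values, like a
# filesystem of "znodes". A child can only exist under an existing parent.
class ZNodeStore:
    def __init__(self) -> None:
        self.nodes: dict[str, bytes] = {"/": b""}

    def create(self, path: str, data: bytes) -> None:
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent znode {parent} does not exist")
        self.nodes[path] = data

    def get(self, path: str) -> bytes:
        return self.nodes[path]

store = ZNodeStore()
store.create("/brokers", b"")
store.create("/brokers/ids", b"")
store.create("/brokers/ids/0", b"kafka-0:9092")
assert store.get("/brokers/ids/0") == b"kafka-0:9092"
```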

Installation of Kafka on Kubernetes

For the asset transfer project, we will set up Kafka and ZooKeeper to run with 2 pods each on Kubernetes.

Let’s open the project we downloaded from this link and go to the directory where the k8s files are located.

$ cd deploy/k8s

Kafka and Zookeeper Persistent Volume

First of all, let’s create persistent volumes for Kafka and ZooKeeper. The YAML files that create them are located in the project at:

deploy/k8s/pv/kafka-pv.yaml

deploy/k8s/pv/zookeeper-pv.yaml

For Kafka:

nfs:
  path: /srv/kubedata/fabricfiles/broker/kafka1
  server: 192.168.12.9

It specifies the path where Kafka's persistent data is kept on the NFS server, and the IP of the NFS server.

For ZooKeeper:

nfs:
  path: /srv/kubedata/fabricfiles/broker/zookeeper0
  server: 192.168.12.9

It specifies the path where ZooKeeper's persistent data is kept on the NFS server, and the IP of the NFS server.

spec:
  storageClassName: default
  volumeMode: Filesystem
  capacity:
    storage: 1Gi

Storage capacity is limited and may vary depending on the node a pod runs on: network-attached storage might not be accessible by all nodes, or the storage may be local to a node to begin with. Here the capacity is set to 1 GiB.

metadata:
  name: kafka1-pv
  labels:
    app: kafka
    podindex: "0"

A podindex label is assigned to bind each persistent volume to the corresponding persistent volume claim. podindex: “0” represents the persistent data of the first Kafka broker.

metadata:
  name: kafka2-pv
  labels:
    app: kafka
    podindex: "1"

podindex: “1” represents the persistent data of the second Kafka broker.

metadata:
  name: zookeeper1-pv
  labels:
    app: zookeeper
    podindex: "0"

podindex: “0” represents the persistent data of the first ZooKeeper node; podindex: “1” represents the persistent data of the second.

Kafka and Zookeeper Persistent Volume Claim

Let’s create persistent volume claims for Kafka and ZooKeeper. The YAML files that create them are located in the project at:

deploy/k8s/pvc/kafka-pvc.yaml

deploy/k8s/pvc/zookeeper-pvc.yaml

selector:
  matchLabels:
    app: kafka
    podindex: "0"

Claims can specify a label selector to further filter the set of volumes; only volumes whose labels match the selector can be bound to the claim. The Kafka persistent volume must carry the podindex: “0” label.
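The selector semantics boil down to: every label in matchLabels must be present on the volume with the same value. A small illustrative sketch (the function and the PV list are hypothetical, a simplification of the Kubernetes control plane's binding logic):

```python
# Hedged sketch of PVC-to-PV label matching, not real Kubernetes code.
def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """A PV is a binding candidate only if every selector label matches."""
    return all(labels.get(k) == v for k, v in selector.items())

pvs = [
    {"name": "kafka1-pv", "labels": {"app": "kafka", "podindex": "0"}},
    {"name": "kafka2-pv", "labels": {"app": "kafka", "podindex": "1"}},
]
claim_selector = {"app": "kafka", "podindex": "0"}
candidates = [pv["name"] for pv in pvs if matches(claim_selector, pv["labels"])]
assert candidates == ["kafka1-pv"]  # only the podindex "0" volume qualifies
```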

selector:
  matchLabels:
    app: zookeeper
    podindex: "0"

In the same way, the ZooKeeper persistent volume must carry the podindex: “0” label.

resources:
  requests:
    storage: 1Gi

Claims, like Pods, can request specific quantities of a resource. In this case the request is for storage, set to 1 GiB.

Kafka and Zookeeper StatefulSet

Let’s create a StatefulSet for Kafka and ZooKeeper. StatefulSet is the workload API object used to manage stateful applications. It manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of those Pods.

The YAML files that create them are located in the project at:

deploy/k8s/kafka/kafka.yaml

deploy/k8s/kafka/zookeeper.yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: broker
  labels:
    app: kafka
  • The StatefulSet is named broker.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zoo
  labels:
    app: zookeeper
  • The StatefulSet is named zoo.
spec:
  selector:
    matchLabels:
      app: kafka
  serviceName: broker
  replicas: 2
  • The StatefulSet’s spec indicates that 2 replicas of the Kafka container will be launched in unique Pods.
selector:
  matchLabels:
    app: zookeeper
serviceName: zoo
replicas: 2

In the same way, the StatefulSet’s spec indicates that 2 replicas of the ZooKeeper container will be launched in unique Pods.
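A useful property of a StatefulSet behind a headless service is that each replica gets a stable, ordinal pod name and DNS entry. A sketch of the naming scheme (the helper itself is hypothetical; the scheme is pod-name.service.namespace.svc.cluster.local):

```python
# Hypothetical helper illustrating StatefulSet pod DNS naming.
def pod_dns_names(statefulset: str, service: str, replicas: int,
                  namespace: str = "default") -> list[str]:
    """Stable per-replica DNS names under a headless service."""
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.cluster.local"
        for i in range(replicas)
    ]

# Matches the ZOO_SERVERS addresses used later in this deployment.
assert pod_dns_names("zoo", "zoo", 2) == [
    "zoo-0.zoo.default.svc.cluster.local",
    "zoo-1.zoo.default.svc.cluster.local",
]
```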

volumeClaimTemplates:
- metadata:
    name: kafka
  • The volumeClaimTemplates will provide stable storage using PersistentVolumes provisioned by a PersistentVolume provisioner. The name should match the PVC metadata name.
volumeClaimTemplates:
- metadata:
    name: zookeeper

In the same way, it should match the PVC metadata name.

The following environment variables are assigned for Kafka.

env:
- name: KAFKA_MESSAGE_MAX_BYTES
  value: "102760448"
- name: KAFKA_REPLICA_FETCH_MAX_BYTES
  value: "102760448"
- name: KAFKA_UNCLEAN_LEADER_ELECTION_ENABLE
  value: "false"
- name: KAFKA_DEFAULT_REPLICATION_FACTOR
  value: "2"
- name: KAFKA_MIN_INSYNC_REPLICAS
  value: "2"
- name: KAFKA_ZOOKEEPER_CONNECT
  value: zoo-0.zoo:2181,zoo-1.zoo:2181
- name: KAFKA_PORT
  value: "9092"
- name: GODEBUG
  value: netdns=go
- name: KAFKA_ZOOKEEPER_CONNECTION_TIMEOUT_MS
  value: "30000"
- name: KAFKA_LOG_DIRS
  value: /opt/kafka/data

KAFKA_MESSAGE_MAX_BYTES: Maximum transmit message size.

KAFKA_REPLICA_FETCH_MAX_BYTES: The maximum number of bytes per partition that replicas request when fetching messages from the leader.

KAFKA_MIN_INSYNC_REPLICAS: Comes into play when there is a problem in the topic, for example when one of a partition's replicas is out of sync or offline. The cluster only acknowledges a write once KAFKA_MIN_INSYNC_REPLICAS is satisfied, so with 2 replicas and KAFKA_MIN_INSYNC_REPLICAS=2, writes succeed only while both replicas are in sync.
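This acknowledgement rule reduces to a simple count check. A toy sketch (hypothetical helper, not broker code), assuming a producer configured with acks=all:

```python
# Toy model of how acks=all interacts with min.insync.replicas.
def can_ack_write(in_sync_replicas: int, min_insync_replicas: int) -> bool:
    """A write is acknowledged only while enough replicas are in sync."""
    return in_sync_replicas >= min_insync_replicas

# Replication factor 2 with min.insync.replicas=2:
assert can_ack_write(2, 2) is True    # both replicas in sync -> write accepted
assert can_ack_write(1, 2) is False   # one replica offline -> write rejected
```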

KAFKA_ZOOKEEPER_CONNECT: Instructs Kafka how to get in touch with ZooKeeper.

zoo: ZooKeeper service name

2181: ZooKeeper service port

KAFKA_ZOOKEEPER_CONNECTION_TIMEOUT_MS: The maximum time the client waits while establishing a connection to ZooKeeper.
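The connect string is just a comma-separated list of host:port pairs. A small parsing sketch with a made-up helper, using the value from this deployment:

```python
# Hypothetical helper: split a zookeeper.connect string into (host, port) pairs.
def parse_zk_connect(value: str) -> list[tuple[str, int]]:
    pairs = []
    for entry in value.split(","):
        host, port = entry.rsplit(":", 1)
        pairs.append((host, int(port)))
    return pairs

assert parse_zk_connect("zoo-0.zoo:2181,zoo-1.zoo:2181") == [
    ("zoo-0.zoo", 2181),
    ("zoo-1.zoo", 2181),
]
```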

KAFKA_DEFAULT_REPLICATION_FACTOR: The default replication factor for automatically created topics (set higher to ensure availability).

The following environment variables are assigned for Zookeeper.

env:
- name: ZOO_SERVERS
  value: server.0=zoo-0.zoo.default.svc.cluster.local:2888:3888 server.1=zoo-1.zoo.default.svc.cluster.local:2888:3888
- name: ZOO_4LW_COMMANDS_WHITELIST
  value: srvr, mntr, ruok
- name: ZOO_MAX_SESSION_TIMEOUT
  value: "40000"
- name: ZOO_TICK_TIME
  value: "2000"

ZOO_SERVERS: This variable allows you to specify the list of machines in the ZooKeeper ensemble. Each entry should be specified as follows: server.id=<address1>:<port1>:<port2>
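The entry format can be unpacked as follows; the helper below is a hypothetical sketch that parses the exact ZOO_SERVERS value used in this deployment into (address, peer port, election port) tuples:

```python
# Hypothetical parser for ZOO_SERVERS entries of the form
# server.<id>=<address>:<peer-port>:<election-port>, space-separated.
def parse_zoo_servers(value: str) -> dict[int, tuple[str, int, int]]:
    servers = {}
    for entry in value.split():
        key, rest = entry.split("=", 1)
        server_id = int(key.split(".", 1)[1])
        address, peer_port, election_port = rest.rsplit(":", 2)
        servers[server_id] = (address, int(peer_port), int(election_port))
    return servers

value = ("server.0=zoo-0.zoo.default.svc.cluster.local:2888:3888 "
         "server.1=zoo-1.zoo.default.svc.cluster.local:2888:3888")
parsed = parse_zoo_servers(value)
assert parsed[0] == ("zoo-0.zoo.default.svc.cluster.local", 2888, 3888)
assert parsed[1] == ("zoo-1.zoo.default.svc.cluster.local", 2888, 3888)
```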

ZOO_4LW_COMMANDS_WHITELIST: A comma-separated list of Four Letter Word commands that the user wants to use. A valid command must be put in this list or the ZooKeeper server will not enable it. Defaults to srvr.

ZOO_MAX_SESSION_TIMEOUT:Maximum session timeout in milliseconds that the server will allow the client to negotiate

ZOO_TICK_TIME:Basic time unit in milliseconds used by Apache ZooKeeper for heartbeats

ports:
- name: broker
  containerPort: 9092

Kafka listens on one port: 9092, the default port used by Kafka.

ports:
- name: client
  containerPort: 2181
- name: peer
  containerPort: 2888
- name: leader-election
  containerPort: 3888

ZooKeeper listens on three ports: 2181 for client connections; 2888 for follower connections, if the server is the leader; and 3888 for connections between servers during the leader election phase.

Kafka and Zookeeper Service

Let’s create services for Kafka and ZooKeeper.

The YAML files that create them are located in the project at:

deploy/k8s/kafka/kafka-svc.yaml

deploy/k8s/kafka/zookeeper-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: kafka
  labels:
    app: kafka

This specification creates a new Service object named “kafka”.

apiVersion: v1
kind: Service
metadata:
  name: broker

This specification creates a new headless Service object named “broker”. A headless service has no cluster IP; instead of load-balancing, DNS resolution returns the IPs of the associated Pods. This allows us to interact directly with the Pods instead of going through a proxy, and it is required for configuring Kafka in cluster mode.

ports:
- name: "broker"
  targetPort: 9092
  port: 9092

targetPort: the container port. The default port used by Kafka is 9092.

port: the Kubernetes service port.

apiVersion: v1
kind: Service
metadata:
  name: zookeeper
  labels:
    app: zookeeper

This specification creates a new Service object named “zookeeper”.

apiVersion: v1
kind: Service
metadata:
  name: zoo
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: zookeeper

This specification creates a new headless Service object named “zoo”, which is required for configuring ZooKeeper in cluster mode.

ports:
- name: "peer"
  targetPort: 2888
  port: 2888
- name: "leader-election"
  targetPort: 3888
  port: 3888

ZooKeeper’s default peer and leader-election ports are exposed via the headless service.

ports:
- name: client
  protocol: TCP
  targetPort: 2181
  port: 2181

ZooKeeper’s default client port is exposed via the zookeeper service.

Deploy Kafka and Zookeeper on Kubernetes

Let’s connect to the kubernetes master node virtual machine with the vagrant ssh command.

$ vagrant ssh k8smaster

Let’s go to the directory where the Kubernetes installation scripts are located. This directory is the same as the deploy/k8s folder in the project; with Vagrant, it is synchronized to the virtual machine.

$ cd /vagrant/k8s

Deploy the persistent volumes for ZooKeeper and Kafka.

$ kubectl apply -f pv/kafka-pv.yaml
$ kubectl apply -f pv/zookeeper-pv.yaml

After the PVs are created, let’s create the PVCs.

Deploy the persistent volume claims for ZooKeeper and Kafka.

$ kubectl apply -f pvc/kafka-pvc.yaml
$ kubectl apply -f pvc/zookeeper-pvc.yaml

Deploy the StatefulSets and Services for ZooKeeper and Kafka.

$ kubectl apply -f kafka/

Wait for the Kafka and ZooKeeper pods to become ready:

$ kubectl wait --for condition=ready --timeout=300s pod -l "app in (zookeeper,kafka)"

Kafka and ZooKeeper were created successfully.

Finally, let’s check the state of the pods from the Lens IDE.

The Kafka and ZooKeeper pods appear to be in the Running state.

My article ends here. In summary, I introduced Kafka and ZooKeeper and explained how to deploy these tools on Kubernetes.

See you in the next articles.

Project Links

Spring Boot Hlf Starter Project details and installation can be accessed via this link.

Asset Transfer Project details and installation can be accessed via this link.
