Encrypt Communication Between Kafka And ZooKeeper With TLS
- Download and unpack zookeeper+tls.tgz.
- Run README.sh for a fully automated example of the presented setup.
Copy and paste to execute the two steps on Linux:
curl -sc - https://juplo.de/wp-uploads/zookeeper+tls.tgz | tar -xzv && cd zookeeper+tls && ./README.sh
A German translation of this article can be found on http://trion.de.
Current Kafka Cannot Encrypt ZooKeeper-Communication
Up until now (version 2.3.0 of Apache Kafka), it is not possible to encrypt the communication between the Kafka-Brokers and their ZooKeeper-ensemble. This is not possible because ZooKeeper 3.4.13, which is shipped with Apache Kafka 2.3.0, lacks support for TLS-encryption.
The documentation de-emphasizes this with the observation that usually only non-sensitive data (configuration data and status information) is stored in ZooKeeper, and that it does not matter if this data is world-readable, as long as it is protected against manipulation, which can be done through proper authentication and ACLs for znodes:
The rationale behind this decision is that the data stored in ZooKeeper is not sensitive, but inappropriate manipulation of znodes can cause cluster disruption. (Kafka-Documentation)
This quote obscures the fact, mentioned elsewhere in the documentation, that there are use cases that store sensitive data in ZooKeeper, for example if authentication via SASL/SCRAM or Delegation Tokens is used. Accordingly, the documentation often stresses that there is usually no need to make ZooKeeper accessible to normal clients. Nowadays, only admin tools need direct access to the ZooKeeper-ensemble. Hence, it is stated as a best practice to make the ensemble available only on a local network, hidden behind a firewall or the like.
In plain terms: One must not run a Kafka-Cluster that spans more than one data center, or one must at least make sure that all communication is tunneled through a virtual private network.
ZooKeeper 3.5.5 To The Rescue
On May 20, 2019, version 3.5.5 of ZooKeeper was released. Version 3.5.5 is the first stable release of the 3.5.x branch, which introduces the support for TLS-encryption that the community has yearned for so long. It supports the encryption of all communication between the nodes of a ZooKeeper-ensemble and between ZooKeeper-Servers and -Clients.
Part of ZooKeeper is a sophisticated client-API that provides a convenient abstraction for the communication between clients and servers over the Atomic Broadcast Protocol. The TLS-encryption is applied by this API transparently. Because of that, all client-implementations can profit from this new feature through a simple library-upgrade from 3.4.13 to 3.5.5. This article will walk you through an example that shows how to carry out such a library-upgrade for Apache Kafka 2.3.0 and how to configure a cluster to use TLS-encryption when communicating with a standalone ZooKeeper.
The presented setup is meant for evaluation only!
It fiddles with the libraries used by Kafka, which might cause unforeseen issues.
Furthermore, using TLS-encryption in ZooKeeper requires one to switch from the battle-tested NIOServerCnxnFactory, which uses the NIO-API directly, to the newly introduced NettyServerCnxnFactory, which is built on top of Netty.
Recipe To Enable TLS Between Broker And ZooKeeper
The article will now walk you step by step through the setup. If you just want to evaluate the example, you can jump to the download-links.
All commands must be executed in the same directory. We recommend creating a new directory for that purpose.
Download Kafka and ZooKeeper
First of all: Download version 2.3.0 of Apache Kafka and version 3.5.5 of Apache ZooKeeper:
curl -sc - http://ftp.fau.de/apache/zookeeper/zookeeper-3.5.5/apache-zookeeper-3.5.5-bin.tar.gz | tar -xzv
curl -sc - http://ftp.fau.de/apache/kafka/2.3.0/kafka_2.12-2.3.0.tgz | tar -xzv
Switch Kafka 2.3.0 from ZooKeeper 3.4.13 to ZooKeeper 3.5.5
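Remove the old 3.4.x version of the ZooKeeper-JAR from the libs-directory of Apache Kafka (the command assumes that the file is named zookeeper-3.4.14.jar; adapt it if your download contains a different 3.4.x version):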
rm -v kafka_2.12-2.3.0/libs/zookeeper-3.4.14.jar
Then copy the JARs of the new version of Apache ZooKeeper into that directory. (The last JAR is only needed for CLI-clients.)
cp -av apache-zookeeper-3.5.5-bin/lib/zookeeper-3.5.5.jar kafka_2.12-2.3.0/libs/
cp -av apache-zookeeper-3.5.5-bin/lib/zookeeper-jute-3.5.5.jar kafka_2.12-2.3.0/libs/
cp -av apache-zookeeper-3.5.5-bin/lib/netty-all-4.1.29.Final.jar kafka_2.12-2.3.0/libs/
cp -av apache-zookeeper-3.5.5-bin/lib/commons-cli-1.2.jar kafka_2.12-2.3.0/libs/
That is all there is to do to upgrade ZooKeeper. If you run one of the Kafka-commands, it will use ZooKeeper 3.5.5 from now on.
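To double-check the library-upgrade, one can list the ZooKeeper-JARs that are left in the libs-directory. The following is only a minimal sketch; the function name list_zookeeper_jars is made up for this illustration and is not part of Kafka or ZooKeeper.

```shell
# Hypothetical helper (not part of Kafka or ZooKeeper):
# print the file names of all ZooKeeper-JARs in a libs-directory, one per line.
list_zookeeper_jars() {
  libs_dir=$1
  for jar in "$libs_dir"/zookeeper*.jar; do
    # If nothing matches, the shell hands over the unexpanded pattern: skip it.
    [ -e "$jar" ] || continue
    basename "$jar"
  done
}
```

After the steps above, calling list_zookeeper_jars kafka_2.12-2.3.0/libs should only show JARs of version 3.5.5. If a 3.4.x JAR still shows up, both versions would end up on the classpath.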
Create A Private CA And The Needed Certificates
Create the root-certificate for the CA and store it in a Java-truststore:
openssl req -new -x509 -days 365 -keyout ca-key -out ca-cert -subj "/C=DE/ST=NRW/L=MS/O=juplo/OU=kafka/CN=Root-CA" -passout pass:superconfidential
keytool -keystore truststore.jks -storepass confidential -import -alias ca-root -file ca-cert -noprompt
The following commands will create a self-signed certificate in
What happens is:
- Create a new key-pair and certificate for
- Generate a certificate-signing-request for that certificate
- Sign the request with the key of the private CA and also add a SAN-extension, so that the signed certificate is also valid for
- Import the root-certificate of the private CA into the keystore
- Import the signed certificate for zookeeper into the keystore
NAME=zookeeper
keytool -keystore $NAME.jks -storepass confidential -alias $NAME -validity 365 -genkey -keypass confidential -dname "CN=$NAME,OU=kafka,O=juplo,L=MS,ST=NRW,C=DE"
keytool -keystore $NAME.jks -storepass confidential -alias $NAME -certreq -file cert-file
openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file -out $NAME.pem -days 365 -CAcreateserial -passin pass:superconfidential -extensions SAN -extfile <(printf "\n[SAN]\nsubjectAltName=DNS:$NAME,DNS:localhost")
keytool -keystore $NAME.jks -storepass confidential -import -alias ca-root -file ca-cert -noprompt
keytool -keystore $NAME.jks -storepass confidential -import -alias $NAME -file $NAME.pem
Repeat this with:
Now we have signed certificates for all participants in our small example. They are stored in separate keystores, each with a Chain-of-Trust that is rooted in our private CA. We also have a truststore that will validate all these certificates, because it contains the root-certificate of the Chain-of-Trust: the certificate of our private CA.
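Before moving on, it may be worth checking that a signed certificate really validates against the private CA and carries the expected SAN-entries. The following sketch wraps two standard openssl calls; the function name verify_signed_cert is hypothetical and only chosen for this illustration.

```shell
# Hypothetical helper: validate a signed certificate against the CA certificate
# and show its SAN-entries. Expects two PEM files as arguments.
verify_signed_cert() {
  ca_cert=$1
  signed_cert=$2
  # Returns a non-zero exit status if the chain cannot be validated.
  openssl verify -CAfile "$ca_cert" "$signed_cert" || return 1
  # Show the SAN-entries; prints nothing if the certificate has no SAN-extension.
  openssl x509 -in "$signed_cert" -noout -text \
    | grep -A1 "Subject Alternative Name" || true
}
```

With the files created above, the call would be verify_signed_cert ca-cert zookeeper.pem. To inspect what ended up in a keystore, keytool -list -keystore zookeeper.jks -storepass confidential can be used in addition.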
Configure And Start ZooKeeper
We only highlight and explain the configuration-options here that are needed for TLS-encryption!
In our setup, the standalone ZooKeeper essentially needs two specially tweaked configuration files to use encryption.
Create the file
SERVER_JVMFLAGS="-Xms512m -Xmx512m -Dzookeeper.serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory"
ZOO_LOG_DIR=.
- The Java system property zookeeper.serverCnxnFactory switches the connection-factory to use the Netty-Framework.
Without this, TLS is not possible!
Create the file
dataDir=/tmp/zookeeper
secureClientPort=2182
maxClientCnxns=0
authProvider.1=org.apache.zookeeper.server.auth.X509AuthenticationProvider
ssl.keyStore.location=zookeeper.jks
ssl.keyStore.password=confidential
ssl.trustStore.location=truststore.jks
ssl.trustStore.password=confidential
secureClientPort: We only allow encrypted connections!
(If we want to allow unencrypted connections too, we can just specify
authProvider.1: Selects authentication through client certificates
ssl.keyStore.*: Specifies the path to and password of the keystore, with the
ssl.trustStore.*: Specifies the path to and password of the common truststore with the root-certificate of our private CA
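Since a single missing option is enough to break the encrypted port, a quick sanity check of the configuration file can save some debugging. The following is a minimal sketch: the function name missing_tls_settings is hypothetical, and the list of keys is simply taken from the options explained above.

```shell
# Hypothetical helper: print every TLS-related key from the list above
# that is not set in the given ZooKeeper configuration file.
missing_tls_settings() {
  config_file=$1
  for key in secureClientPort authProvider.1 \
             ssl.keyStore.location ssl.keyStore.password \
             ssl.trustStore.location ssl.trustStore.password; do
    # Compare the part before the first "=" of each line literally with the key.
    cut -d= -f1 "$config_file" | grep -qxF "$key" || echo "$key"
  done
}
```

Assuming that the configuration is stored as zoo.cfg in the current directory (an assumption; this is the default name that zkServer.sh looks for), missing_tls_settings zoo.cfg should print nothing for the configuration shown above.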
Copy the file
log4j.properties into the current working directory, to enable logging for ZooKeeper (see also
cp -av apache-zookeeper-3.5.5-bin/conf/log4j.properties .
Start the ZooKeeper-Server:
apache-zookeeper-3.5.5-bin/bin/zkServer.sh --config . start
--config .: The script should search in the current directory for the configuration data and certificates.
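The start-script may return before the server actually accepts connections, so it can be handy to wait for the secure port before starting the brokers. The following sketch assumes bash (it relies on the /dev/tcp feature of bash); the function name wait_for_port is hypothetical. Note that it only checks that the TCP-port is open, not that the TLS-handshake works.

```shell
# Hypothetical helper, requires bash: /dev/tcp is a bash feature, not a real device.
# Waits until a TCP-port accepts connections or the number of tries is used up.
wait_for_port() {
  host=$1
  port=$2
  tries=${3:-30}
  while [ "$tries" -gt 0 ]; do
    # The subshell opens a TCP-connection and closes it again when it exits.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    tries=$((tries - 1))
    # Pause between two attempts, but not after the last one.
    [ "$tries" -gt 0 ] && sleep 1
  done
  return 1
}
```

For the setup above, wait_for_port localhost 2182 would wait up to roughly 30 seconds for the port configured as secureClientPort.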
Configure And Start The Brokers
We only highlight and explain the configuration-options and start-parameters here that are needed to encrypt the communication between the Kafka-Brokers and the ZooKeeper-Server!
The other parameters shown here that are concerned with SSL are only needed for securing the communication between the Brokers themselves and between Brokers and Clients. You can read all about them in the standard documentation. In short: This example is set up to use SSL for authentication between the brokers and SASL/PLAIN for client-authentication. Both channels are encrypted with TLS.
TLS for the ZooKeeper Client-API is configured through Java system properties. Hence, most of the SSL-configuration for connecting to ZooKeeper has to be specified when starting the broker. Only the address and port for the connection itself are specified in the configuration-file.
Create the file
broker.id=1
zookeeper.connect=zookeeper:2182
listeners=SSL://kafka-1:9193,SASL_SSL://kafka-1:9194
security.inter.broker.protocol=SSL
ssl.client.auth=required
ssl.keystore.location=kafka-1.jks
ssl.keystore.password=confidential
ssl.key.password=confidential
ssl.truststore.location=truststore.jks
ssl.truststore.password=confidential
listener.name.sasl_ssl.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required user_consumer="pw4consumer" user_producer="pw4producer";
sasl.enabled.mechanisms=PLAIN
log.dirs=/tmp/kafka-1-logs
offsets.topic.replication.factor=2
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=2
zookeeper.connect: If you allow insecure connections too, be sure to specify the right port here!
- All other options are not relevant for encrypting the connections to ZooKeeper
Start the broker in the background and remember its PID in the file
(
  export KAFKA_OPTS="
    -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
    -Dzookeeper.client.secure=true
    -Dzookeeper.ssl.keyStore.location=kafka-1.jks
    -Dzookeeper.ssl.keyStore.password=confidential
    -Dzookeeper.ssl.trustStore.location=truststore.jks
    -Dzookeeper.ssl.trustStore.password=confidential
  "
  kafka_2.12-2.3.0/bin/kafka-server-start.sh kafka-1.properties &
  echo $! > KAFKA-1
) > kafka-1.log &
Check the logfile kafka-1.log to confirm that the broker starts without errors!
zookeeper.clientCnxnSocket: Switches from NIO to the Netty-Framework.
Without this, the ZooKeeper Client-API (just like the ZooKeeper-Server) cannot use TLS!
zookeeper.client.secure=true: Switches on TLS-encryption for all connections to any ZooKeeper-Server
zookeeper.ssl.keyStore.*: Specifies the path to and password of the keystore, with the
zookeeper.ssl.trustStore.*: Specifies the path to and password of the common truststore with the root-certificate of our private CA
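Because the same set of properties is needed for every broker and every CLI-client that talks to ZooKeeper, it can be convenient to generate them with a small helper. This is only a sketch: the function name zk_tls_opts is hypothetical, and it assumes that keystore and truststore share one password, as they do in this example.

```shell
# Hypothetical helper: print the Java system properties explained above on one line.
# Assumption: keystore and truststore use the same password.
zk_tls_opts() {
  keystore=$1
  truststore=$2
  password=$3
  printf '%s ' \
    "-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty" \
    "-Dzookeeper.client.secure=true" \
    "-Dzookeeper.ssl.keyStore.location=$keystore" \
    "-Dzookeeper.ssl.keyStore.password=$password" \
    "-Dzookeeper.ssl.trustStore.location=$truststore" \
    "-Dzookeeper.ssl.trustStore.password=$password"
  printf '\n'
}
```

For the first broker, the export from the start-command above could then be written as export KAFKA_OPTS="$(zk_tls_opts kafka-1.jks truststore.jks confidential)". For other participants, only the name of the keystore changes.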
Do the same for
And do not forget to adapt the config-file accordingly, or better: just download a copy...
Configure And Execute The CLI-Clients
All scripts from the Apache-Kafka-Distribution that connect to ZooKeeper are configured in the same way as seen for
For example, to create a topic, you will run:
export KAFKA_OPTS="
  -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
  -Dzookeeper.client.secure=true
  -Dzookeeper.ssl.keyStore.location=client.jks
  -Dzookeeper.ssl.keyStore.password=confidential
  -Dzookeeper.ssl.trustStore.location=truststore.jks
  -Dzookeeper.ssl.trustStore.password=confidential
"
kafka_2.12-2.3.0/bin/kafka-topics.sh \
  --zookeeper zookeeper:2182 \
  --create --topic test \
  --partitions 1 --replication-factor 2
Note: A different keystore is used here (
CLI-clients that connect to the brokers can be called as usual.
In this example, they use an encrypted listener on port 9194 (for kafka-1) and are authenticated using SASL/PLAIN.
The client-configuration is kept in the files
Take a look at those files and compare them with the broker-configuration above.
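For orientation, a client-configuration for the SASL_SSL-listener might look roughly like the following. This is only a sketch derived from the broker-configuration above, not a copy of the files from the download: the file name consumer-sketch.properties is made up, and the credentials are the ones defined for the user consumer in the JAAS-configuration of the broker.

```shell
# Writes a hypothetical client-configuration (the file name is an assumption).
cat > consumer-sketch.properties <<'EOF'
# Connect through the SASL_SSL-listener (port 9194 on kafka-1 in this example).
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
# User and password as defined via user_consumer in the broker-configuration.
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="consumer" password="pw4consumer";
# The truststore lets the client validate the certificate of the broker.
ssl.truststore.location=truststore.jks
ssl.truststore.password=confidential
EOF
```

A console-consumer could then be started with kafka_2.12-2.3.0/bin/kafka-console-consumer.sh --bootstrap-server kafka-1:9194 --consumer.config consumer-sketch.properties --topic test --from-beginning. Should the broker request a client certificate on this listener, keystore settings would have to be added as well.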
If you want to learn more about securing broker/client-communication, we refer you to the official documentation.
If you have trouble starting these clients, download the scripts and take a look at the examples in README.sh
TBD: Further Steps To Take...
This recipe only activates TLS-encryption between Kafka-Brokers and a standalone ZooKeeper. It does not show how to enable TLS between ZooKeeper-Nodes (which should be easy), nor whether it is possible to authenticate Kafka-Brokers via TLS-certificates. These topics will be covered in future articles...
Fully Automated Example Of The Presented Setup
Download and unpack zookeeper+tls.tgz for an evaluation of the presented setup:
curl -sc - https://juplo.de/wp-uploads/zookeeper+tls.tgz | tar -xzv
The archive contains a fully automated example. Just run README.sh in the unpacked directory.
It downloads the required software, carries out the library-upgrade, creates the required certificates and starts a standalone ZooKeeper and two Kafka-Brokers that use TLS to encrypt all communication. It also executes a console-consumer and a console-producer, which read from and write to a topic, and a zookeeper-shell, which communicates directly with the ZooKeeper-node, to prove that the setup is working. The ZooKeeper- and Broker-instances are left running to enable the evaluation of the fully encrypted cluster.
- README.sh, to execute the automated example
- After running README.sh, the Kafka-Cluster will still be running, so that one can experiment with commands from
- README.sh can be executed repeatedly: it automatically skips all setup-steps that are already done
- README.sh stop, to stop the Kafka-Cluster (it can be restarted by re-running the script)
- README.sh cleanup, to stop the Cluster and remove all created files and data (only the downloaded packages will be left untouched)