Confluent Kafka: Clustered Deployment on Docker for Windows

I've recently had the opportunity to tinker with Kafka. What a great tool it is! There's such potential there and I'm only getting started with it. I can't wait to see how we use it at work!

One issue I keep running into with the documentation and tutorials I'm following is that they're usually written for Unix systems, but the company I work for is a Windows shop. Fortunately we have Docker, but many of the commands in the tutorials assume a shell like Bash rather than PowerShell. This has led to some frustration and head scratching, mainly because I'm much more comfortable with Bash than I am with PowerShell. I bounced around a few tutorials and different ways of running Kafka before landing on Confluent's platform. I don't know if I'll use this one long-term, but I wanted to share the adaptations I needed to make to the tutorial I followed.

The tutorial I followed is called "Clustered Deployment on Docker." Before you get started there, make sure to install Docker for Windows and have Hyper-V enabled. Once you have that done, there's one more setup item you need to take care of. You'll need to set up a new external network switch which you can do following the instructions in the Docker Docs about Hyper-V. With that, you're now ready to start the tutorial. I won't be reposting the tutorial here. Rather, I'll be referring to the steps in the tutorial and posting the changes I needed to make in this article.

Docker for Windows Client: Setting Up a Three Node Kafka Cluster

1.

The first thing the tutorial instructs you to do is "create and configure the Docker machine." To do this, it provides the following command:

docker-machine create --driver virtualbox --virtualbox-memory 6000 confluent

Because you're using Docker for Windows, you have Hyper-V enabled. Unfortunately VirtualBox cannot run when Hyper-V is turned on. So, if you want to use VirtualBox, you need to turn off Hyper-V... but then Docker won't work which means you won't be able to create this machine unless you turn Hyper-V on. But then you can't create the machine without VirtualBox working... And back and forth you go.

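Fortunately, you can create machines with Hyper-V instead of VirtualBox! Here's the command I used instead. If you're curious about the details, it's worth checking out the docs for the Hyper-V driver; for example, its --hyperv-virtual-switch option lets you name the virtual switch to use if you have more than one.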

docker-machine create --driver hyperv --hyperv-memory 6000 confluent

After creating the machine, the tutorial instructs you to "configure your terminal window to attach it to your new Docker Machine" with the following command:

eval $(docker-machine env confluent)

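Here's the PowerShell way of doing that. The output of docker-machine env is a set of environment variable assignments, so it has to be piped into Invoke-Expression to be evaluated in the current session (passing the command to Invoke-Expression as a string only prints those assignments):

docker-machine env confluent | Invoke-Expression

Note: docker-machine env also prints the exact command to run as a comment at the end of its output. For me, it was something like this: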

& "C:\Path\to\Docker\Docker\Resources\bin\docker-machine.exe" env confluent | Invoke-Expression

2.

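Next up is to "Start Up a 3-node ZooKeeper Ensemble." They provide three docker run -d commands, each split across several lines with Bash backslash continuations. PowerShell doesn't treat a trailing backslash as a line continuation, so use these single-line versions instead: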

docker run -d --net=host --name=zk-1 -e ZOOKEEPER_SERVER_ID=1 -e ZOOKEEPER_CLIENT_PORT=22181 -e ZOOKEEPER_TICK_TIME=2000 -e ZOOKEEPER_INIT_LIMIT=5 -e ZOOKEEPER_SYNC_LIMIT=2 -e ZOOKEEPER_SERVERS="localhost:22888:23888;localhost:32888:33888;localhost:42888:43888" confluentinc/cp-zookeeper:5.0.0
docker run -d --net=host --name=zk-2 -e ZOOKEEPER_SERVER_ID=2 -e ZOOKEEPER_CLIENT_PORT=32181 -e ZOOKEEPER_TICK_TIME=2000 -e ZOOKEEPER_INIT_LIMIT=5 -e ZOOKEEPER_SYNC_LIMIT=2 -e ZOOKEEPER_SERVERS="localhost:22888:23888;localhost:32888:33888;localhost:42888:43888" confluentinc/cp-zookeeper:5.0.0
docker run -d --net=host --name=zk-3 -e ZOOKEEPER_SERVER_ID=3 -e ZOOKEEPER_CLIENT_PORT=42181 -e ZOOKEEPER_TICK_TIME=2000 -e ZOOKEEPER_INIT_LIMIT=5 -e ZOOKEEPER_SYNC_LIMIT=2 -e ZOOKEEPER_SERVERS="localhost:22888:23888;localhost:32888:33888;localhost:42888:43888" confluentinc/cp-zookeeper:5.0.0
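
The three commands above differ only in the server ID, the container name, and the client port. If retyping them feels error-prone, the pattern can be sketched in a few lines of Python, which behaves the same whether you launch it from PowerShell or Bash. This is only an illustration: the helper names `zookeeper_run_args` and `start_ensemble` and the port formula are my own reading of the values above, not something the tutorial provides.

```python
import subprocess

ZK_IMAGE = "confluentinc/cp-zookeeper:5.0.0"
# The same peer list is given to every node, copied from the commands above.
ZK_SERVERS = "localhost:22888:23888;localhost:32888:33888;localhost:42888:43888"


def zookeeper_run_args(server_id):
    """Build the `docker run` argument list for ZooKeeper node 1, 2 or 3.

    Hypothetical helper: the client ports 22181, 32181 and 42181 follow the
    pattern (server_id + 1) * 10000 + 2181, inferred from the commands above.
    """
    client_port = (server_id + 1) * 10000 + 2181
    env = {
        "ZOOKEEPER_SERVER_ID": server_id,
        "ZOOKEEPER_CLIENT_PORT": client_port,
        "ZOOKEEPER_TICK_TIME": 2000,
        "ZOOKEEPER_INIT_LIMIT": 5,
        "ZOOKEEPER_SYNC_LIMIT": 2,
        "ZOOKEEPER_SERVERS": ZK_SERVERS,
    }
    args = ["docker", "run", "-d", "--net=host", f"--name=zk-{server_id}"]
    for key, value in env.items():
        args += ["-e", f"{key}={value}"]
    args.append(ZK_IMAGE)
    return args


def start_ensemble():
    """Start all three nodes; assumes the Docker CLI is on PATH."""
    for server_id in (1, 2, 3):
        # A list (not a string) is passed, so the ';' in ZOOKEEPER_SERVERS
        # never goes through a shell and needs no quoting.
        subprocess.run(zookeeper_run_args(server_id), check=True)
```

Calling `start_ensemble()` would issue the same three commands, while `zookeeper_run_args(2)` on its own just returns the argument list so you can inspect it without starting anything.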

Next up is to check that the ZooKeeper nodes have started up successfully. To do that, try checking the logs of one of them. For this one, you don't need to make any changes to the command they provide:

docker logs zk-1

To check the other two nodes, simply replace zk-1 with zk-2 or zk-3.

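Now we need to ensure the ZK ensemble is ready. The tutorial provides a simple Bash for loop. Unfortunately, PowerShell doesn't use the same syntax, so we need to change this one. Because the bash -c argument is in double quotes, PowerShell substitutes $i before the string reaches the container. Here's the loop I used: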

foreach ($i in 22181, 32181, 42181) {
 docker run --net=host --rm confluentinc/cp-zookeeper:5.0.0 bash -c "echo stat | nc localhost $i | grep Mode"
}

You should see one leader and two followers. If that's what you have, you've done everything correctly so far!
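
With `grep Mode` in the pipeline, each iteration prints a single line such as `Mode: leader` or `Mode: follower`. If you'd rather not eyeball that, a rough check for "one leader, two followers" might look like this in Python. `ensemble_is_healthy` is a hypothetical helper, and it assumes you have already collected the three lines yourself.

```python
from collections import Counter


def ensemble_is_healthy(mode_lines):
    """Return True when the lines show exactly one leader and two followers.

    `mode_lines` is assumed to hold the `Mode: ...` lines printed by the
    loop above, one per ZooKeeper node. Other lines are ignored.
    """
    modes = Counter(
        line.split(":", 1)[1].strip()
        for line in mode_lines
        if line.strip().startswith("Mode:")
    )
    return modes == Counter({"leader": 1, "follower": 2})
```

For example, `ensemble_is_healthy(["Mode: follower", "Mode: leader", "Mode: follower"])` is true, while three followers (no leader elected yet) is not.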

3.

Next we'll be spinning up a 3-node Kafka cluster. The tutorial again provides three docker run commands. Here are the three I used. The only difference is that each command is kept on a single line instead of being split with backslash line continuations. In fact, this is the only change I made to the rest of the commands in the tutorial.

docker run -d --net=host --name=kafka-1 -e KAFKA_ZOOKEEPER_CONNECT=localhost:22181,localhost:32181,localhost:42181 -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:29092 confluentinc/cp-kafka:5.0.0
docker run -d --net=host --name=kafka-2 -e KAFKA_ZOOKEEPER_CONNECT=localhost:22181,localhost:32181,localhost:42181 -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:39092 confluentinc/cp-kafka:5.0.0
docker run -d --net=host --name=kafka-3 -e KAFKA_ZOOKEEPER_CONNECT=localhost:22181,localhost:32181,localhost:42181 -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:49092 confluentinc/cp-kafka:5.0.0
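
What actually varies between the three brokers is small: the container name and the advertised port, while every broker points at all three ZooKeeper client ports. As a hedged illustration of that pattern (the helper name `kafka_run_args` and the port formula are my own assumptions drawn from the commands above):

```python
KAFKA_IMAGE = "confluentinc/cp-kafka:5.0.0"
# Client ports of the three ZooKeeper nodes started earlier.
ZK_CLIENT_PORTS = (22181, 32181, 42181)


def kafka_run_args(broker_number):
    """Build the `docker run` argument list for broker 1, 2 or 3.

    Hypothetical helper: the advertised ports 29092, 39092 and 49092 follow
    the pattern (broker_number + 1) * 10000 + 9092, inferred from the commands.
    """
    zk_connect = ",".join(f"localhost:{port}" for port in ZK_CLIENT_PORTS)
    listener_port = (broker_number + 1) * 10000 + 9092
    return [
        "docker", "run", "-d", "--net=host", f"--name=kafka-{broker_number}",
        "-e", f"KAFKA_ZOOKEEPER_CONNECT={zk_connect}",
        "-e", f"KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:{listener_port}",
        KAFKA_IMAGE,
    ]
```

The returned list can be handed to `subprocess.run(..., check=True)` if you want Python to launch the container, or simply printed to compare against the commands above.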

You can again check that these are working by looking at the logs (replace kafka-1 with kafka-2 or kafka-3 to check the others):

docker logs kafka-1

4.

Next up we'll be checking that the brokers are working as expected by creating a topic. Here's the command I used for this:

docker run --net=host --rm confluentinc/cp-kafka:5.0.0 kafka-topics --create --topic bar --partitions 3 --replication-factor 3 --if-not-exists --zookeeper localhost:32181

Now let's verify the topic was created successfully by describing the topic:

docker run --net=host --rm confluentinc/cp-kafka:5.0.0 kafka-topics --describe --topic bar --zookeeper localhost:32181

Now that we have a topic, let's put some content in it:

docker run --net=host --rm confluentinc/cp-kafka:5.0.0 bash -c "seq 42 | kafka-console-producer --broker-list localhost:29092 --topic bar && echo 'Produced 42 messages.'"

Finally, let's read the data from that topic:

docker run --net=host --rm confluentinc/cp-kafka:5.0.0 kafka-console-consumer --bootstrap-server localhost:29092 --topic bar --from-beginning --max-messages 42
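
One thing worth knowing when you look at the consumer output: the topic has three partitions, and Kafka only guarantees ordering within a partition, so the numbers 1 through 42 may not come back in order. Here's a small sketch of an order-insensitive check, assuming you've captured the consumer's output as lines of text (`consumed_all` is a hypothetical helper, not part of the tutorial):

```python
def consumed_all(output_lines, expected_count=42):
    """Return True if the output contains each of 1..expected_count exactly once.

    Lines that are not plain integers are ignored, so extra status text in
    the captured output does not affect the result.
    """
    numbers = sorted(
        int(line) for line in output_lines if line.strip().isdigit()
    )
    return numbers == list(range(1, expected_count + 1))
```

Sorting before comparing is what makes the check independent of the order in which the partitions were read.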

And there you have it! You've now done their tutorial in PowerShell rather than a Unix shell. The biggest differences came in the first few commands. After that, the commands were nearly identical except for removing the backslash line continuations. I spent longer than I should have on the whole VirtualBox/Hyper-V catch-22. Once I realized I could use Hyper-V instead, things went pretty quickly. Hopefully this write-up helps you avoid the same problems and get up and running on Windows.
