Producer fails when old brokers are replaced by new

Hi all,

I ran into a problem with the Kafka producer when attempting to replace all
the nodes in a 0.8.0 Beta1 Release Kafka cluster with 0.8.0 Release nodes.
I started a producer/consumer test program to measure the cluster's
performance during the process, then added the new brokers, ran
kafka-reassign-partitions.sh, and removed the old brokers. When I removed
the old brokers, my producer failed.

The simplest scenario I could come up with that still shows this behavior
is the following. Using the 0.8.0 Release, we have a 1-partition topic with
2 replicas on 2 brokers, broker A and broker B.

1. Broker A is taken down.
2. A producer is started with only broker B in metadata.broker.list.
3. Broker A is brought back up.
4. We let topic.metadata.refresh.interval.ms amount of time pass.
5. Broker B is taken down, and the producer throws
kafka.common.FailedToSendMessageException after all the (many) retries have
failed.
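
For reference, here is a rough sketch of the kind of producer I am running
to reproduce this (host name, port, and topic are placeholders, and my
actual test program differs in the details). Note that broker B is the
only entry in metadata.broker.list:

import java.util.Properties;

import kafka.common.FailedToSendMessageException;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class ReproProducer {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        // Bootstrap with broker B only; broker A is down when the producer starts.
        props.put("metadata.broker.list", "broker-b.example.com:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

        // Keep sending while broker A is restarted and broker B is later taken down.
        while (true) {
            try {
                producer.send(new KeyedMessage<String, String>("test-topic", "hello"));
            } catch (FailedToSendMessageException e) {
                // Thrown only after all retries have failed.
                e.printStackTrace();
                break;
            }
            Thread.sleep(1000);
        }
        producer.close();
    }
}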

During my experimentation I have made sure that the producer fetches
metadata before the old broker is taken down, and that enough retries with
enough backoff time are used for the producer to not give up prematurely.
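
Concretely, these are the settings I mean, added to the Properties in the
sketch above (the exact values are just what I happened to test with, not
recommendations):

        // Refresh topic metadata regularly even when nothing fails, so the
        // producer has a chance to pick up the restarted broker A.
        props.put("topic.metadata.refresh.interval.ms", "60000");

        // Plenty of retries with a generous backoff, so the producer should
        // not give up before it has had a chance to re-fetch metadata.
        props.put("message.send.max.retries", "10");
        props.put("retry.backoff.ms", "1000");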

The documentation for the producer config metadata.broker.list suggests to
me that this list of brokers is only used at startup: "This is for
bootstrapping and the producer will only use it for getting metadata
(topics, partitions and replicas)". And when I read about
topic.metadata.refresh.interval.ms and retry.backoff.ms I learn that
metadata is indeed fetched again at later times. Based on this
documentation, I assume that the producer would learn about any new
brokers when new metadata is fetched.

I also want to point out that the cluster itself seems to work just fine
during this process; the problem only seems to be with the producer.
Between all these steps I run kafka-list-topic.sh and try the console
producer and consumer, and everything is as expected.

I also found another interesting thing when experimenting with running
kafka-preferred-replica-election.sh before taking down the old broker. This
script only causes changes when the leader and the preferred replica are
different. In the scenario where they are in fact different, and the new
broker takes over the role of leader from the old broker, the producer does
NOT fail. This makes me think that perhaps the producer only keeps metadata
about topic leaders, and not about all replicas as the documentation
suggests to me.

It is clear that I am making a lot of assumptions here, and I am
relatively new to Kafka, so I could very well be missing something
important. The way I see it, there are a few possibilities.

1. Broker discovery is supposed to be a producer feature, and it has a bug.
2. Broker discovery is not a producer feature, in which case I think many
people might benefit from clearer documentation.
3. I am doing something dumb, e.g. forgetting about some important
configuration.

Please let me know what you make of this.

Thanks,
Christofer Hedbrandh
