Hi ZooKeeper users,
When the leader ZooKeeper instance is killed, in theory all followers
should restart and elect a new leader gracefully. But fairly often I
observe that none of the followers can start, failing with
"Failed to process transaction type: 1 error: KeeperErrorCode = NoNode
for..." "Unable to load database on disk"
So I have had to manually copy the file from the observer to each instance
and restart them.
What could be the root cause? Even if a few clients are sending heavy
write traffic (delete, create, setData), ZooKeeper shouldn't end up with
corrupted data, correct?
Is this a disk I/O performance or write cache problem? We're running
ZooKeeper on AWS EC2, so we use the same disk for the snapshot directory
and the log directory, and we haven't turned off the disk write cache.
Thank you
Best, Jae