When the leader is shut down, a new leader is not elected

Hi,
I’ve installed a 3-node cluster for testing purposes.
It works fine until I reboot the leader node.
During the reboot, only followers are left, and no new leader gets elected.
Why is that?
According to High availability | YugabyteDB Docs, after a few seconds, one of the followers should become the new leader (with 2 of the 3 masters still up there is still a majority, so an election should succeed).

What’s wrong?
Thanks

Hi @fabrice.nizet

How did you deploy the cluster?
How are you checking that only followers are left?
Can you paste a screenshot of “http://yb-master-ip:7000/” and “http://yb-master-ip:7000/tablet-servers” a couple of minutes after one of the servers is down?
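Even if the UI complains that it can’t find a leader, the pages themselves are usually still served, so fetching them with curl from any node works too (same URLs as above, substitute the real master IP):

    $ curl -s http://yb-master-ip:7000/
    $ curl -s http://yb-master-ip:7000/tablet-servers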

Hi @dorian_yugabyte,
I deployed the cluster following the instructions on the YB website: deployed 3 VMs, installed YB from the archive, configured master.conf and tserver.conf, and started the master and tserver on all 3 nodes using these config files.
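For reference, the master.conf on each node follows the documented template, roughly like this (shown for node 1; the hostnames are the wqyb031x machines below, and the data directory is just an example path):

    --master_addresses=wqyb0311:7100,wqyb0312:7100,wqyb0313:7100
    --rpc_bind_addresses=wqyb0311:7100
    --fs_data_dirs=/data/yb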
I check the status using yb-admin list_all_masters and it says only 2 nodes are left and they are both FOLLOWERs.
I can paste a screenshot, but it won’t be of much use, since the web UI says “can’t find leader to route request”.

Can you paste the screenshots?
Are there any logs spewed by the remaining masters during this time? (how to get logs)
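They normally end up under the master’s data directory, so assuming the default layout, something like this on each remaining node should show the most recent ones:

    $ tail -n 100 <fs_data_dirs>/yb-data/master/logs/yb-master.INFO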

Steps to reproduce:

  • SSH to every node
  • Check that processes are up (ps aux | grep yb) => OK
  • Node 1: stop yb-master
  • Node 2: run list_all_masters
    $ /yugabyte-2.7.1.1/bin/yb-admin --flagfile /yugabyte-2.7.1.1/master.conf list_all_masters
    Timed out (yb/rpc/rpc.cc:211): Unable to establish connection to leader master at [wqyb0311:7100,wqyb0312:7100,wqyb0313:7100]. Please verify the addresses.
    : Could not locate the leader master: GetLeaderMasterRpc(addrs: [wqyb0311:7100, wqyb0312:7100, wqyb0313:7100], num_attempts: 2) passed its deadline 1885727.603s (passed: 0.019s): Network error (yb/util/net/socket.cc:537): recvmsg error: Connection refused (system error 111)

Same behavior on node 3

Node 1:

  • Restart yb-master and run list_all_masters again
    $ /yugabyte-2.7.1.1/bin/yb-admin --flagfile /yugabyte-2.7.1.1/master.conf list_all_masters
    Master UUID                       RPC Host/Port  State  Role
    644ba5e5827444cea0bd372f6378355d  wqyb0311:7100  ALIVE  FOLLOWER
    a78ec444dde64b978a827f1560264c2c  wqyb0313:7100  ALIVE  LEADER
    b035ca4cf90140d49d1054403142a565  wqyb0312:7100  ALIVE  FOLLOWER

=> node 3 is the new leader

Reboot node 3
Same behavior!
(screenshot attached: screen1)

What about any logs from the remaining yb-master processes after the leader is down?
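If there is any election activity it should show up in those files; something like this on both remaining masters (path as above, assuming the default layout) would help:

    $ grep -i election <fs_data_dirs>/yb-data/master/logs/yb-master.INFO | tail -n 50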