Port 7000 not open, UI connection times out

Hi, I manually set up a three-node YugabyteDB cluster on three Docker containers, and at first everything worked fine. Then the UI suddenly became inaccessible, and so did port 7000. The nodes can still connect to each other and everything else seems to work. There are ERROR entries in my tserver logs, but I don't think they tell me anything: the timestamps show the errors occurred while the nodes were being created, which makes sense because the Docker containers are created sequentially and the nodes are not all up at the same time. Just in case, here is my error log:

Log file created at: 2019/10/04 17:25:31
Running on machine: localhost.localdomain
Application fingerprint: version 2.0.0.0 build 16 revision 0b543e8ec9f16ae989ba29873e2d1d7977551b23 build_type RELEASE built at 14 Sep 2019 18:43:22 UTC
Running duration (h:mm:ss): 0:00:02
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E1004 17:25:31.947553  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:31.947891  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:31.999066  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:35.458593  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:25:35.458647  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:25:37.964661  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:40.974356  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:40.974361  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:41.473245  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:44.481426  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:44.481449  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:46.991566  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:25:49.998819  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:25:49.998823  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:25:50.492696  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:25:53.500528  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:53.500576  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:25:56.014961  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:25:59.019337  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:25:59.021250  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:25:59.519343  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:02.521001  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:02.525748  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:05.030326  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:26:08.042172  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:08.043351  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:08.532222  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:11.545009  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:11.545018  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:14.059631  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:17.065300  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:26:17.065323  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:26:17.568434  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:20.567822  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:20.567883  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:23.080592  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:26:26.088017  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:26.089774  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:26.585371  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:29.590523  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:29.591670  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:32.104843  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:26:35.110388  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:26:35.110456  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:26:38.116392  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:38.612962  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:38.613051  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:41.620666  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.71:7100 timed out after 2.500s
E1004 17:26:44.124897  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:44.128276  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:47.134356  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:47.629350  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:47.631616  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:50.636878  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:53.141919  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:53.144878  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:56.153575  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:59.153812  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:26:59.154757  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:27:02.172544  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:27:02.656159  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:27:02.657336  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:27:08.171052  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:27:08.172397  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:27:08.174511  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/outbound_call.cc:510): Could not locate the leader master: GetMasterRegistration RPC to 172.25.1.72:7100 timed out after 2.500s
E1004 17:27:14.202728  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 97) passed its deadline 319813.896s (passed: 5.027s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)
E1004 17:27:14.210960  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 97) passed its deadline 319813.893s (passed: 5.038s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)
E1004 17:27:20.224540  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 97) passed its deadline 319819.938s (passed: 5.008s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)
E1004 17:27:20.253351  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 97) passed its deadline 319819.926s (passed: 5.049s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)
E1004 17:27:26.306573  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 97) passed its deadline 319825.987s (passed: 5.041s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)
E1004 17:27:26.321313  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 98) passed its deadline 319825.957s (passed: 5.086s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)
E1004 17:27:32.378455  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 97) passed its deadline 319832.046s (passed: 5.054s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)
E1004 17:27:32.406399  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 98) passed its deadline 319832.032s (passed: 5.096s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)
E1004 17:27:38.471698  4676 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 98) passed its deadline 319838.100s (passed: 5.093s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)
E1004 17:27:38.490912  4710 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 98) passed its deadline 319838.129s (passed: 5.083s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)
E1004 17:27:39.254457  4678 async_initializer.cc:83] Failed to initialize client: Timed out (yb/rpc/rpc.cc:199): Could not locate the leader master: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 242) passed its deadline 319838.893s (passed: 30.082s): Not found (yb/master/master_rpc.cc:279): no leader found: GetLeaderMasterRpc(addrs: [172.25.1.70:7100, 172.25.1.71:7100, 172.25.1.72:7100], num_attempts: 1)

Thank you!

What about the yb-master.INFO files or stdout/stderr for yb-master – any clues from that?
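For anyone following along: the log header above gives the line format as `[IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg`, so one way to pull out only the warning, error and fatal lines is to filter on the leading severity letter rather than searching for a word. This is a minimal sketch; `glog_problems` is a made-up helper name and the file path in the example is an assumption about where your logs live.

```shell
# Print only W (warning), E (error) and F (fatal) lines from a glog-style file.
# Each glog line starts with a severity letter followed by mmdd, e.g. "E1004 17:25:31...".
glog_problems() {
    grep -E '^[WEF][0-9]{4} ' "$1"
}

# Example (path is an assumption; adjust to your log directory):
#   glog_problems yb-master.INFO | tail -n 20
```

Filtering on severity also catches messages such as "Timed out" that contain no obvious keyword, and it skips the file header lines.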

Thank you for your reply! No, there is no stdout/stderr output for either yb-master or yb-tserver.
The INFO logs seem fine to me. The only errors I could find with grep are the network errors from when it first started, when it could not locate the other nodes (port 7000 worked in the beginning, by the way). I also looked through the whole log and everything seems normal.

This is what I got from grep:

[root@localhost logs]# cat yb-master.INFO | grep err
I1004 17:25:29.351004 4650 server_base.cc:438] Could not load existing FS layout: Not found (yb/util/env_posix.cc:1405): /opt/yugabyteDB/data/yb-data/master/instance: No such file or directory (system error 2)
I1004 17:25:32.455368 4662 tcp_stream.cc:293] { local: 172.25.1.70:47444 remote: 172.25.1.71:7100 }: Recv failed: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:25:32.455616 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:25:35.462429 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:25:38.469288 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:25:41.475363 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:25:44.481546 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:25:47.487790 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:25:50.492579 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:25:53.498737 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:25:56.504977 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:26:02.518165 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:26:11.543656 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:26:23.570665 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:26:35.599573 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: No route to host (system error 113)
W1004 17:26:45.600355 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:26:55.607508 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:05.625592 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.71:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:15.631244 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:15.647925 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:15.690619 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:15.783140 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:15.913816 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:16.194615 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:16.741837 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:17.783450 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:19.867595 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:24.019454 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:32.240165 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)
W1004 17:27:42.372052 4683 leader_election.cc:275] T 00000000000000000000000000000000 P c54bde60541844fcbd09de310c77d110 [CANDIDATE]: Term 1 pre-election: RPC error from VoteRequest() call to peer b8b170155b454b8e833043878e9a4c36: Remote error (yb/rpc/outbound_call.cc:440): Service unavailable (yb/master/catalog_manager.cc:4717): CatalogManager is not yet initialized
W1004 17:27:42.407253 4686 leader_election.cc:275] T 00000000000000000000000000000000 P c54bde60541844fcbd09de310c77d110 [CANDIDATE]: Term 1 election: RPC error from VoteRequest() call to peer b8b170155b454b8e833043878e9a4c36: Remote error (yb/rpc/outbound_call.cc:440): Service unavailable (yb/master/catalog_manager.cc:4717): CatalogManager is not yet initialized
W1004 17:27:42.429343 4686 consensus_peers.cc:433] T 00000000000000000000000000000000 P c54bde60541844fcbd09de310c77d110 -> Peer b8b170155b454b8e833043878e9a4c36 ([host: "172.25.1.71" port: 7100], ): Couldn't send request. Status: Remote error (yb/rpc/outbound_call.cc:440): Service unavailable (yb/master/catalog_manager.cc:4717): CatalogManager is not yet initialized. Retrying in the next heartbeat period. Already tried 1 times. State: 2

Thank you!

An update: I redeployed the cluster and the UI was up for a short time, showing a message that it was not able to load leader information, and then it became inaccessible again.

@arnav, @dorian_yugabyte: can you please follow up with @AndrewLiuRM?

If need be, and if @AndrewLiuRM is open to it – please schedule a debug session over a conference call.

Also @AndrewLiuRM, can you post the output of:

% ps auxx | grep yb- 

on each of your nodes/docker containers?

Container 1

ps auxx | grep yb-
root      4649  0.7  0.1 973832 44908 ?        Sl   14:39   1:17 ./bin/yb-master --flagfile /opt/yugabyteDB/conf/master.conf
root      4650  0.3  0.1 741824 58396 ?        Sl   14:39   0:37 ./bin/yb-tserver --flagfile /opt/yugabyteDB/conf/tserver.conf
root      6185  0.0  0.0  11084   672 pts/0    S+   17:21   0:00 grep --color=auto yb-

Container 2

ps auxx | grep yb-
root      4649  0.2  0.1 560340 43032 ?        Sl   14:39   0:28 ./bin/yb-master --flagfile /opt/yugabyteDB/conf/master.conf
root      4650  0.3  0.1 707980 59296 ?        Sl   14:39   0:37 ./bin/yb-tserver --flagfile /opt/yugabyteDB/conf/tserver.conf
root      4743  0.0  0.0  11084   672 pts/0    S+   17:22   0:00 grep --color=auto yb-

Container 3

ps auxx | grep yb-
root      4641  0.2  0.1 534476 40568 ?        Sl   14:40   0:28 ./bin/yb-master --flagfile /opt/yugabyteDB/conf/master.conf
root      4642  0.3  0.1 705676 54164 ?        Sl   14:40   0:35 ./bin/yb-tserver --flagfile /opt/yugabyteDB/conf/tserver.conf
root      4737  0.0  0.0  11084   672 pts/0    S+   17:22   0:00 grep --color=auto yb-

I would love to set up a debug session if it is necessary! Thank you!

Nice that you are using --flagfile (rather than passing all the options on the command line). To help further, it would be useful to have the contents of one node's master.conf and tserver.conf files. Can you please share them?

Master Config:

--master_addresses=172.25.1.70:7100,172.25.1.71:7100,172.25.1.72:7100
--rpc_bind_addresses=172.25.1.70
--fs_data_dirs=/opt/yugabyteDB/data
--replication_factor=3

TServer Config:

--tserver_master_addrs=172.25.1.70:7100,172.25.1.71:7100,172.25.1.72:7100
--rpc_bind_addresses=172.25.1.70
--cql_proxy_bind_address=172.25.1.70:9042
--fs_data_dirs=/opt/yugabyteDB/data

Thank you!


Hey @AndrewLiuRM! Do you mind posting the conf files for all 3 masters and all 3 tservers? I want to confirm that the bind addresses across the containers are different. Also, can you confirm that the respective IPs match the actual container IPs?

In particular:

W1004 17:27:24.019454 4680 consensus_peers.cc:609] Error getting permanent uuid from config peer [172.25.1.72:7100]: Network error (yb/util/net/socket.cc:538): recvmsg error: Connection refused (system error 111)

^ This error suggests that the masters are actually not able to talk to each other on port 7100, which is the port they need to use for RPC communications. Can you confirm that network connectivity between your containers is also set up correctly?
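One quick way to check this from inside each container is to probe the other nodes' ports directly. The sketch below assumes an `nc` (netcat) that supports `-z` and `-w` is installed in the container, which may not be true for a minimal image. `check_ports` is an illustrative name, and the addresses in the example are the ones from this thread.

```shell
# Probe each host:port pair and report whether a TCP connection succeeds.
# Assumes an nc (netcat) that supports -z (connect only, send nothing)
# and -w (timeout in seconds).
check_ports() {
    for target in "$@"; do
        host=${target%:*}     # everything before the last colon
        port=${target##*:}    # everything after the last colon
        if nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
            echo "$target open"
        else
            echo "$target failed"
        fi
    done
}

# Example: master RPC port (7100) and master web UI port (7000) on every node.
#   check_ports 172.25.1.70:7100 172.25.1.71:7100 172.25.1.72:7100 \
#               172.25.1.70:7000 172.25.1.71:7000 172.25.1.72:7000
```

If 7100 reports open everywhere but 7000 fails, that would point at the web server binding rather than at container networking.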

I can happily jump on a quick conf call with you sometime tomorrow, if that will help!


Hi @bogdan, thank you for your reply!

--master_addresses=172.25.1.70:7100,172.25.1.71:7100,172.25.1.72:7100
--rpc_bind_addresses=172.25.1.71
--fs_data_dirs=/opt/yugabyteDB/data
--replication_factor=3

--tserver_master_addrs=172.25.1.70:7100,172.25.1.71:7100,172.25.1.72:7100
--rpc_bind_addresses=172.25.1.71
--cql_proxy_bind_address=172.25.1.71:9042
--fs_data_dirs=/opt/yugabyteDB/data



--master_addresses=172.25.1.70:7100,172.25.1.71:7100,172.25.1.72:7100
--rpc_bind_addresses=172.25.1.72
--fs_data_dirs=/opt/yugabyteDB/data
--replication_factor=3

--tserver_master_addrs=172.25.1.70:7100,172.25.1.71:7100,172.25.1.72:7100
--rpc_bind_addresses=172.25.1.72
--cql_proxy_bind_address=172.25.1.72:9042
--fs_data_dirs=/opt/yugabyteDB/data

These are the config files I have; I hope they help. The IPs match their own container IPs, and I'm sure the containers are connected: I created a keyspace in one container and can see it from another container.

The specific error you mention may be because I created the containers one by one, and the time lag between their creation meant they could not find each other, since not all of them existed at that point.

I’m free to set up a conf call any time today!

Thank you!

@AndrewLiuRM Ok, those look fine. Perhaps this is just a webserver issue, if you’re saying RPCs were working just fine. Can you try to add one more flag to both masters and tservers, --webserver_interface=$IP, where $IP is the respective IP of each of the containers.

I think by default that goes to 0.0.0.0, which might pick up the hostname instead of IP, potentially resulting in connectivity issues.
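In flagfile terms, the suggestion amounts to one extra line in each config file. Below is a minimal sketch that appends the line only if it is not already there. `add_webserver_flag` is an illustrative helper, the paths and IP in the example are the ones posted for the first container, and the yb-master and yb-tserver processes would presumably need a restart to pick up the change.

```shell
# Append --webserver_interface=<ip> to a flagfile unless the flag is already set.
add_webserver_flag() {
    conf=$1
    ip=$2
    if grep -q '^--webserver_interface=' "$conf"; then
        echo "$conf: flag already present, left unchanged"
    else
        # Make sure the file ends with a newline so the new flag gets its own line.
        if [ -n "$(tail -c 1 "$conf")" ]; then
            echo >> "$conf"
        fi
        printf '%s\n' "--webserver_interface=$ip" >> "$conf"
        echo "$conf: added --webserver_interface=$ip"
    fi
}

# Example for the first container (repeat on each node with its own IP):
#   add_webserver_flag /opt/yugabyteDB/conf/master.conf  172.25.1.70
#   add_webserver_flag /opt/yugabyteDB/conf/tserver.conf 172.25.1.70
```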

If that does not work, I’m happy to hop on a call. How does 2pm PST sound? Also, if you are ok with Slack, we can chat in more real time there: Join yugabyte-db on Slack - Community Inviter


It works! Thank you!
