Hi Dorian
Maybe it is blocked; I could not open it from the browser. Also, in the Yuga UI I could not find any cluster config option. The only one I can see is the xCluster config option.
How can it be blocked? You already opened it with curl. If it somehow is, just save the curl response to a .HTML file, open it in Firefox, and use the “Fireshot” plugin to take a full-page screenshot (or upload the .HTML response here).
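If curl is awkward to use on that host, the same "save the response to a file" step can be sketched in Python. This is only an illustration: the function name `save_page` is made up here, and the URL is whatever you already passed to curl.

```python
import urllib.request
from pathlib import Path


def save_page(url: str, out_path: str, timeout: float = 10.0) -> int:
    """Fetch `url` and write the raw response body to `out_path`.

    Returns the number of bytes written. `save_page` is a hypothetical
    helper for this thread, not part of any YugabyteDB tooling.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read()
    Path(out_path).write_bytes(body)
    return len(body)


# Example call (placeholder URL, replace with the one used with curl):
# save_page("http://<master-host>:<port>/<page>", "response.html")
```

The saved file can then be opened locally in a browser or attached to the thread.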
Some servers have more tablet peers/leaders than others. Insertion is skewed, but the cluster load has stayed unbalanced for too long.
I also noticed the transactions table listed under Leaderless Tablets in the replica info. Is that something to worry about?
Thanks @subh14 for sharing the screenshots. I can see that the load balancer is active and attempting some operations, but I can’t infer from the screenshots why it isn’t able to converge. Would it be possible to share the log of the master leader?
If that is not an option, could we hop on a call together to debug further? Without the logs, it would be hard to tell what the load balancer is stuck on.
Hi Sandeep
Thanks for your reply. I have checked the yb-master log of the leader and found the log lines below repeating.
I0416 18:18:25.131608 1406131 cluster_balance.cc:705] tablet server 25901c23d9684e338dc2029f47f5a2bc has a pending delete for tablets [c2d5cfb588144417950446e5ebf4ff6d]
W0416 18:17:39.883428 1406461 catalog_manager.cc:13051] ProcessTabletReplicaFullCompactionStatus: Not found (yb/master/catalog_manager.cc:3115): Tablet 84fed376dd814b55a326fb70ff2d93fa not found
W0416 18:17:39.883441 1406461 catalog_manager.cc:13051] ProcessTabletReplicaFullCompactionStatus: Not found (yb/master/catalog_manager.cc:3115): Tablet 816ac2ecb46e4969a51f6125c8890f64 not found
W0416 18:17:39.883452 1406461 catalog_manager.cc:13051] ProcessTabletReplicaFullCompactionStatus: Not found (yb/master/catalog_manager.cc:3115): Tablet e5abae4c5dc24404b868a5b517b157ac not found
W0416 18:17:39.883477 1406461 catalog_manager.cc:13051] ProcessTabletReplicaFullCompactionStatus: Not found (yb/master/catalog_manager.cc:3115): Tablet 9cd34d9cd68c4b72ace16c2e3a187174 not found
W0416 18:17:39.883495 1406461 catalog_manager.cc:13051] ProcessTabletReplicaFullCompactionStatus: Not found (yb/master/catalog_manager.cc:3115): Tablet 2887965d7fd14c39865f11524fe70888 not found
W0416 18:17:39.883507 1406461 catalog_manager.cc:13051] ProcessTabletReplicaFullCompactionStatus: Not found (yb/master/catalog_manager.cc:3115): Tablet a980ea3d2fc8440386766fdfc6135460 not found
I0416 18:17:40.507510 1406131 cluster_balance.cc:438] Total pending adds=1, total pending removals=0, total pending leader stepdowns=0
I0416 18:17:40.507557 1406131 cluster_balance.cc:705] tablet server 25901c23d9684e338dc2029f47f5a2bc has a pending delete for tablets [c2d5cfb588144417950446e5ebf4ff6d]
I0416 18:17:41.522347 1406131 cluster_balance.cc:438] Total pending adds=1, total pending removals=0, total pending leader stepdowns=0
I0416 18:17:41.522392 1406131 cluster_balance.cc:705] tablet server 25901c23d9684e338dc2029f47f5a2bc has a pending delete for tablets [c2d5cfb588144417950446
W0416 18:18:25.595906 1445955 catalog_manager.cc:13051] ProcessTabletReplicaFullCompactionStatus: Not found (yb/master/catalog_manager.cc:3115): Tablet 85cea09d797340029c73fa2c85ea1135 not found
I0416 18:18:26.144498 1406131 cluster_balance.cc:438] Total pending adds=1, total pending removals=0, total pending leader stepdowns=0
I0416 18:18:26.144531 1406131 cluster_balance.cc:705] tablet server 25901c23d9684e338dc2029f47f5a2bc has a pending delete for tablets [c2d5cfb588144417950446e5ebf4ff6d]
I0416 18:18:27.159514 1406131 cluster_balance.cc:438] Total pending adds=1, total pending removals=0, total pending leader stepdowns=0
I0416 18:18:27.159595 1406131 cluster_balance.cc:705] tablet server 25901c23d9684e338dc2029f47f5a2bc has a pending delete for tablets [c2d5cfb588144417950446e5ebf4ff6d]
I0416 18:18:27.171049 1406131 catalog_manager.cc:3135] Got tablet to split: c22cc12c730d47e9a1bb863d3a5aa8be, is manual split: 0
I0416 18:18:27.171327 1406131 tablet_split_manager.cc:442] Scheduled split for tablet_id: c22cc12c730d47e9a1bb863d3a5aa8be.
W0416 18:18:27.172641 1758785 async_rpc_tasks.cc:1482] Get Tablet Split Key RPC for tablet 0x000017d1bab3e000 → c22cc12c730d47e9a1bb863d3a5aa8be (table _rmtmmidmobile_202404_idx [id=00004300000030008000000000004f17]) (_rmtmmidmobile_202404_idx [id=00004300000030008000000000004f17]) (task=0x000017d1be627820, state=kRunning): TS 112babe077c44d4897c4337dcbe13541: GetSplitKey (attempt 1) failed for tablet c22cc12c730d47e9a1bb863d3a5aa8be with error code TABLET_SPLIT_KEY_RANGE_TOO_SMALL: Illegal state (yb/tablet/tablet.cc:3946): Failed to detect middle key for tablet c22cc12c730d47e9a1bb863d3a5aa8be (key_bounds: “47E149” - “”): got “47E149”.: TABLET_SPLIT_KEY_RANGE_TOO_SMALL (tablet server error 31)
I0416 18:18:27.172710 1758785 catalog_manager.cc:3162] Tablet key range is too small to split, disabling splitting temporarily.
I0416 18:18:27.172735 1758785 async_rpc_tasks.cc:387] Get Tablet Split Key RPC for tablet 0x000017d1bab3e000 → c22cc12c730d47e9a1bb863d3a5aa8be (table _rmtmmidmobile_202404_idx [id=00004300000030008000000000004f17]) (_rmtmmidmobile_202404_idx [id=00004300000030008000000000004f17]) (task=0x000017d1be627820, state=kFailed): No reschedule for this task: kFailed
The load balancer status has also been showing as rebalancing for the last 8-10 days.
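To see at a glance which messages repeat in the master log and which tablet the load balancer keeps waiting on, the lines can be tallied with a small script. This is a minimal sketch in Python, assuming the line layout seen in the excerpt above (severity letter, MMDD date, time, thread id, source file:line, then the message). The function name `summarize` and the `SAMPLE` list are made up for illustration; the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Layout assumed from the excerpt: "I0416 18:17:40.507510 1406131 file.cc:438] message"
LOG_LINE_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[\w./]+:\d+)\] (?P<msg>.*)$"
)
PENDING_DELETE_RE = re.compile(
    r"tablet server (?P<ts>[0-9a-f]+) has a pending delete "
    r"for tablets \[(?P<tablets>[^\]]*)\]"
)


def summarize(lines):
    """Count log lines per (severity, source location) and per pending delete.

    Lines that do not start with a log header (for example wrapped
    continuation fragments) are skipped.
    """
    by_source = Counter()
    pending_deletes = Counter()
    for line in lines:
        m = LOG_LINE_RE.match(line.strip())
        if not m:
            continue
        by_source[(m["sev"], m["src"])] += 1
        pd = PENDING_DELETE_RE.search(m["msg"])
        if pd:
            for tablet in pd["tablets"].split(","):
                pending_deletes[(pd["ts"], tablet.strip())] += 1
    return by_source, pending_deletes


SAMPLE = [
    "I0416 18:18:25.131608 1406131 cluster_balance.cc:705] tablet server 25901c23d9684e338dc2029f47f5a2bc has a pending delete for tablets [c2d5cfb588144417950446e5ebf4ff6d]",
    "W0416 18:17:39.883428 1406461 catalog_manager.cc:13051] ProcessTabletReplicaFullCompactionStatus: Not found (yb/master/catalog_manager.cc:3115): Tablet 84fed376dd814b55a326fb70ff2d93fa not found",
    "I0416 18:17:40.507510 1406131 cluster_balance.cc:438] Total pending adds=1, total pending removals=0, total pending leader stepdowns=0",
    "I0416 18:17:40.507557 1406131 cluster_balance.cc:705] tablet server 25901c23d9684e338dc2029f47f5a2bc has a pending delete for tablets [c2d5cfb588144417950446e5ebf4ff6d]",
]

if __name__ == "__main__":
    sources, deletes = summarize(SAMPLE)
    for key, count in sources.most_common():
        print(key, count)
    for key, count in deletes.most_common():
        print("pending delete:", key, count)
```

Run against the full master log (for example by passing the lines of an open file to `summarize`), this shows whether the same tablet server and tablet pair accounts for all of the pending-delete messages.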
Hi Dorian
I have a replication factor of 3 in master.conf. When I check the system transaction table from the UI, as below, it says the tablet has a leader, but the leader is not shown.
I have a separate mount point for WALs configured in tserver.conf. It was getting full, so I deleted some of the older WAL files a few days ago.
Did something go wrong because I deleted the WALs?
Is it safe to delete older WAL files?
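To see which directories under the WAL mount are taking up the space, a read-only sketch like the following may help. It only reports sizes and deletes nothing. The function name `dir_sizes` and the path in the example call are hypothetical.

```python
import os
from typing import Dict


def dir_sizes(root: str) -> Dict[str, int]:
    """Return total bytes of files under each immediate subdirectory of `root`.

    Read-only: nothing is modified or deleted. Files sitting directly in
    `root` are ignored.
    """
    sizes: Dict[str, int] = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                try:
                    total += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    pass  # file disappeared between listing and stat
        sizes[entry.name] = total
    return sizes


# Example call (hypothetical path, use your own WAL mount point):
# for name, size in sorted(dir_sizes("/mnt/wal").items(), key=lambda kv: -kv[1]):
#     print(f"{size / 2**30:8.2f} GiB  {name}")
```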
Additionally, I have now used the modify_placement_info command. We have 12 TServers and 3 Masters, with replication factor 3 in master.conf.
Since the cluster load had not been balanced for a long time, I ran modify_placement_info as below to check.
Will it impact inserts or updates on the tables?
I have automatic_tablet_splitting enabled; will this command affect it?
Is there a way to get back to the state before running this command? I am not sure what the previous state was, either.
Also, when I added 3 more TServers to bring the total to 12, I see that System Tablet-peers/Leaders on the newly added servers are 0/0. Is there a way to balance this, or is it fine if it stays this way?