YB Data Migration

I have a question about data migration. We’re moving from Data Center ‘A’ to Data Center ‘B’, and we have a 3-node YugabyteDB cluster set up in Data Center ‘A’. We want to migrate the data from the cluster in Data Center ‘A’ to the cluster (yet to be set up) in Data Center ‘B’. Could you please recommend the best approach for this migration? Thank you!

Hi @parvez

The best way is described in Change cluster configuration | YugabyteDB Docs.

You add all new nodes in DC-B, move data there, and remove DC-A nodes.
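At a high level, the sequence looks like this (a rough sketch with placeholder DC-A hostnames and an unqualified yb-admin path; the detailed, tested commands are further down this thread):

# 1. Start new nodes in DC-B and let them join the existing cluster.
# 2. Blacklist the DC-A tablet servers so their data drains to DC-B:
export MASTERS=dc-a-node1:7100,dc-a-node2:7100,dc-a-node3:7100
yb-admin -master_addresses $MASTERS change_blacklist ADD dc-a-node1:9100 dc-a-node2:9100 dc-a-node3:9100
# 3. Poll until the data move reports 100%:
yb-admin -master_addresses $MASTERS get_load_move_completion
# 4. Move the master quorum to DC-B one server at a time, then decommission the DC-A nodes.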

Hi @parvez

I wanted to check how the existing cluster is deployed. Do you use the yugabyted CLI or the yb-master/yb-tserver CLIs? Depending on this, some of the steps for starting the 2nd cluster with the right flags may differ.

Can you please provide us with details of your current deployment?

Thanks,
Nikhil Chandrappa
yugabyted Product Lead

Hi @nmalladi, We are using yugabyted for start, stop, and status operations.

Hi @dorian_yugabyte, Thanks for sharing the link. Let me go through it.

Hi @parvez ,

Thanks for the details.

From the docs that Dorian shared,

  • For steps 2-3, you need to follow the yugabyted steps for starting the nodes. You don’t need to rerun the yugabyted configure command after the new nodes in DC2 are started; that is handled in the step below.

  • For steps 4-7, you can follow the documented steps.

Hi @nmalladi, Got it. Thank you!

I also came across these documents: Back up and restore data | YugabyteDB Docs and Distributed snapshots for YSQL | YugabyteDB Docs.

Any thoughts?

Snapshot + restore works too, but it will be an offline migration.
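For reference, the distributed-snapshot flow looks roughly like this (a sketch with placeholder snapshot IDs, file names, and $OLD_MASTERS/$NEW_MASTERS variables; see the linked docs for the full procedure, which also involves copying the tablet snapshot files between clusters):

# On the source cluster: snapshot a YSQL database and export its metadata
yb-admin -master_addresses $OLD_MASTERS create_database_snapshot ysql.yugabyte
yb-admin -master_addresses $OLD_MASTERS list_snapshots
yb-admin -master_addresses $OLD_MASTERS export_snapshot <snapshot_id> my_db.snapshot
# On the destination cluster: import the metadata and restore
yb-admin -master_addresses $NEW_MASTERS import_snapshot my_db.snapshot
yb-admin -master_addresses $NEW_MASTERS restore_snapshot <new_snapshot_id>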

Hi @nmalladi, would you be able to share the ‘yugabyted’ commands for steps 2-3 (just want to make sure I got it right)? Change cluster configuration | YugabyteDB Docs

Or is there any way to start only yb-master or yb-tserver using yugabyted?

There is not at the moment. After the RF is satisfied, yugabyted spawns dormant yb-masters that aren’t joined to the cluster but will join if you lose the current ones.
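You can check which yb-masters are actually part of the quorum (the dormant ones won't be listed) with, for example:

yb-admin -master_addresses node1:7100,node2:7100,node3:7100 list_all_masters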

Hi @dorian_yugabyte / @nmalladi – On a new node, at step 2, when starting the yb-master process, I’m seeing the errors in the log attached below.

Application fingerprint: version 2.20.5.0 build 72 revision cebde5e50c0865614b4de917dd365e65d272499b build_type RELEASE built at 03 Jul 2024 22:41:00 UTC
Running duration (h:mm:ss): 0:00:00
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0210 22:36:36.524729 1171930 server_main_util.cc:68] NumCPUs determined to be: 4
I0210 22:36:36.524901 1171930 mem_tracker.cc:255] Creating root MemTracker with garbage collection threshold 8019726 bytes
I0210 22:36:36.524911 1171930 mem_tracker.cc:259] Root memory limit is 801972633
I0210 22:36:36.524922 1171930 tcmalloc_util.cc:231] Setting tcmalloc max thread cache bytes to: 33554432
I0210 22:36:36.524935 1171930 tcmalloc_util.cc:264] Setting TCMalloc profiler sampling frequency to 1048576 bytes
I0210 22:36:36.524943 1171930 mem_tracker.cc:213] TCMalloc per cpu caches active: 1
I0210 22:36:36.524950 1171930 mem_tracker.cc:215] TCMalloc max per cpu cache size: 1572864
I0210 22:36:36.524956 1171930 mem_tracker.cc:217] TCMalloc max total thread cache bytes: 33554432
I0210 22:36:36.525018 1171930 server_base_options.cc:176] Updating master addrs to
I0210 22:36:36.525027 1171930 server_base_options.cc:176] Updating master addrs to
I0210 22:36:36.525043 1171930 server_base_options.cc:176] Updating master addrs to
I0210 22:36:36.525272 1171930 shared_mem.cc:139] Using memfd_create as a shared memory provider
I0210 22:36:36.525358 1171930 mem_tracker.cc:795] MemTracker: hard memory limit is 0.746895 GB
I0210 22:36:36.525384 1171930 mem_tracker.cc:797] MemTracker: soft memory limit is 0.634861 GB
I0210 22:36:36.525480 1171930 thread_pool.cc:165] Starting thread pool { name: raft_notifications max_workers: 18446744073709551615 }
I0210 22:36:36.526129 1171930 server_base_options.cc:176] Updating master addrs to
I0210 22:36:36.526275 1171930 rpc_server.cc:84] yb::server::RpcServer created at 0x17cebfc3bc20
I0210 22:36:36.526283 1171930 master.cc:185] yb::master::Master created at 0x7ffd7d547ec0
I0210 22:36:36.526288 1171930 master.cc:186] yb::master::TSManager created at 0x17cebfc13b60
I0210 22:36:36.526293 1171930 master.cc:187] yb::master::CatalogManager created at 0x17cebf607b00
I0210 22:36:36.526327 1171930 master_main.cc:139] Initializing master server...
I0210 22:36:36.526845 1171930 server_base.cc:526] Could not load existing FS layout: Not found (yb/fs/fs_manager.cc:402): Metadata wasn't found
I0210 22:36:36.526865 1171930 server_base.cc:527] Creating new FS layout
I0210 22:36:36.529985 1171930 fs_manager.cc:636] Generated new instance metadata in path /storage/yugabyte_data/data/yb-data/master/instance:
uuid: "621316ca7aca4ee89acbacf467bac27f"
format_stamp: "Formatted at 2025-02-10 22:36:36 on 127.0.0.1"
initdb_done_set_after_sys_catalog_restore: true
I0210 22:36:36.531095 1171930 fs_manager.cc:415] Opened local filesystem: /storage/yugabyte_data/data
uuid: "621316ca7aca4ee89acbacf467bac27f"
format_stamp: "Formatted at 2025-02-10 22:36:36 on 127.0.0.1"
initdb_done_set_after_sys_catalog_restore: true
I0210 22:36:36.531140 1171930 secure.cc:149] SetupSecureContext: kInternal, 0
I0210 22:36:36.531816 1171930 thread_pool.cc:165] Starting thread pool { name: auto_flags_client max_workers: 1024 }
I0210 22:36:36.532438 1171930 master.cc:246] AutoFlags initialization delayed as master is in Shell mode.
I0210 22:36:36.532644 1171930 server_base.cc:286] Auto setting FLAGS_num_reactor_threads to 4
I0210 22:36:36.532655 1171930 secure.cc:149] SetupSecureContext: kInternal, 0
I0210 22:36:36.533114 1171930 thread_pool.cc:165] Starting thread pool { name: Master max_workers: 1024 }
I0210 22:36:36.533905 1171930 server_base.cc:269] Running on host: localhost
I0210 22:36:36.533998 1171952 async_initializer.cc:82] Starting to init ybclient
I0210 22:36:36.534032 1171930 master_main.cc:142] Starting Master server...
I0210 22:36:36.534044 1171930 ulimit_util.cc:212] Configured soft limit for cpu time is already larger than specified min value (unlimited vs. unlimited). Skipping.
I0210 22:36:36.534055 1171930 ulimit_util.cc:212] Configured soft limit for file size is already larger than specified min value (unlimited vs. unlimited). Skipping.
I0210 22:36:36.534060 1171930 ulimit_util.cc:212] Configured soft limit for data seg size is already larger than specified min value (unlimited vs. unlimited). Skipping.
I0210 22:36:36.534065 1171930 ulimit_util.cc:212] Configured soft limit for stack size is already larger than specified min value (8388608 vs. 8388608). Skipping.
I0210 22:36:36.534070 1171930 ulimit_util.cc:212] Configured soft limit for max user processes is already larger than specified min value (30411 vs. 12000). Skipping.
I0210 22:36:36.534082 1171930 ulimit_util.cc:212] Configured soft limit for max locked memory is already larger than specified min value (65536 vs. 65536). Skipping.
I0210 22:36:36.534086 1171930 ulimit_util.cc:212] Configured soft limit for max memory size is already larger than specified min value (unlimited vs. unlimited). Skipping.
I0210 22:36:36.534091 1171930 master_main.cc:144] ulimit cur(max)...
ulimit: core file size 0(0) blks
ulimit: data seg size unlimited(unlimited) kb
ulimit: open files 262144(262144)
ulimit: file size unlimited(unlimited) blks
ulimit: pending signals 30411(30411)
ulimit: file locks unlimited(unlimited)
ulimit: max locked memory 64(64) kb
ulimit: max memory size unlimited(unlimited) kb
ulimit: stack size 8192(unlimited) kb
ulimit: cpu time unlimited(unlimited) secs
ulimit: max user processes 30411(30411)
I0210 22:36:36.534276 1171930 service_pool.cc:147] yb.master.MasterBackup: yb::rpc::ServicePoolImpl created at 0x17cebfc62480
I0210 22:36:36.534335 1171930 service_pool.cc:147] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0x17cebfc62900
I0210 22:36:36.534386 1171930 service_pool.cc:147] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0x17cebfc63200
I0210 22:36:36.534473 1171930 service_pool.cc:147] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0x17cebfc626c0
I0210 22:36:36.534515 1171930 service_pool.cc:147] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0x17cebfc638c0
I0210 22:36:36.534752 1171930 service_pool.cc:147] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0x17cebfc63440
I0210 22:36:36.534773 1171930 service_pool.cc:147] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0x17cebfc63680
I0210 22:36:36.534783 1171930 service_pool.cc:147] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0x17cebf4b8000
I0210 22:36:36.534857 1171930 service_pool.cc:147] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0x17cebfc63b00
I0210 22:36:36.534868 1171930 service_pool.cc:147] yb.master.MasterService: yb::rpc::ServicePoolImpl created at 0x17cebfc63d40
W0210 22:36:36.534917 1171954 catalog_manager.cc:1705] Failed to get current config: Illegal state (yb/master/catalog_manager.cc:12215): Node 621316ca7aca4ee89acbacf467bac27f peer not initialized.
I0210 22:36:36.535130 1171954 client-internal.cc:2619] New master addresses: []
I0210 22:36:36.535166 1171930 service_pool.cc:147] yb.tserver.TabletServerService: yb::rpc::ServicePoolImpl created at 0x17cebf4b8d80
I0210 22:36:36.535215 1171930 thread_pool.cc:165] Starting thread pool { name: Master-high-pri max_workers: 1024 }
I0210 22:36:36.535226 1171930 service_pool.cc:147] yb.consensus.ConsensusService: yb::rpc::ServicePoolImpl created at 0x17cebf4b8b40
E0210 22:36:36.535269 1171952 async_initializer.cc:95] Failed to initialize client: Illegal state (yb/client/client-internal.cc:2622): Could not locate the leader master: Unable to determine master addresses
I0210 22:36:36.535421 1171930 service_pool.cc:147] yb.tserver.RemoteBootstrapService: yb::rpc::ServicePoolImpl created at 0x17cebf4b9b00
I0210 22:36:36.536218 1171930 service_pool.cc:147] yb.tserver.PgClientService: yb::rpc::ServicePoolImpl created at 0x17cebf372240
I0210 22:36:36.536284 1171930 webserver.cc:338] Starting webserver on 127.0.0.1:7000
I0210 22:36:36.536290 1171930 webserver.cc:347] Document root: /opt/yugabyte/yugabyte-2.20.5.0/www
I0210 22:36:36.536303 1171930 webserver.cc:333] Webserver listen spec is 127.0.0.1:7000
I0210 22:36:36.536422 1171930 webserver.cc:453] Webserver started. Bound to: http://127.0.0.1:7000/
I0210 22:36:36.536460 1171930 service_pool.cc:147] yb.server.GenericService: yb::rpc::ServicePoolImpl created at 0x17cebf4b9d40
I0210 22:36:36.536649 1171930 rpc_server.cc:167] RPC server started. Bound to: 127.0.0.1:7100
I0210 22:36:36.536916 1171930 server_base.cc:375] Dumped server information to /storage/yugabyte_data/data/master-info
I0210 22:36:36.537009 1171958 async_initializer.cc:82] Starting to init ybclient
I0210 22:36:36.537056 1171930 db_server_base.cc:68] Node information: { hostname: 'localhost', rpc_ip: '127.0.0.1', webserver_ip: '127.0.0.1', uuid: '621316ca7aca4ee89acbacf467bac27f' }
I0210 22:36:36.537067 1171930 server_base.cc:587] Using private rpc address 127.0.0.1
I0210 22:36:36.537073 1171930 server_base.cc:609] Using http address 127.0.0.1
I0210 22:36:36.537220 1171959 sys_catalog.cc:269] Trying to load previous SysCatalogTable data from disk
I0210 22:36:36.537262 1171959 catalog_manager.cc:2172] Starting master in shell mode.
I0210 22:36:36.537276 1171959 server_base.cc:587] Using private rpc address 127.0.0.1
I0210 22:36:36.537320 1171930 master_main.cc:147] Master server successfully started.
I0210 22:36:36.537627 1171961 total_mem_watcher.cc:76] Root memtracker limit: 801972633 (764 MiB); this server will stop if memory usage exceeds 200% of that: 1603945266 bytes (1529 MiB).
W0210 22:36:36.537758 1171962 catalog_manager.cc:1705] Failed to get current config: Illegal state (yb/master/catalog_manager.cc:12215): Node 621316ca7aca4ee89acbacf467bac27f peer not initialized.
I0210 22:36:36.537782 1171962 client-internal.cc:2619] New master addresses: []
E0210 22:36:36.537833 1171958 async_initializer.cc:95] Failed to initialize client: Illegal state (yb/client/client-internal.cc:2622): Could not locate the leader master: Unable to determine master addresses
W0210 22:36:37.535727 1171963 catalog_manager.cc:1705] Failed to get current config: Illegal state (yb/master/catalog_manager.cc:12215): Node 621316ca7aca4ee89acbacf467bac27f peer not initialized.
I0210 22:36:37.535782 1171963 client-internal.cc:2619] New master addresses: []

I0210 22:40:38.654357 1172885 client-internal.cc:2619] New master addresses: []
E0210 22:40:38.654500 1171952 async_initializer.cc:95] Failed to initialize client: Illegal state (yb/client/client-internal.cc:2622): Could not locate the leader master: Unable to determine master addresses

When executing step 3, the yb-tserver process did not start. I’m attaching the logs below:

I0211 01:49:38.039788 1300692 client-internal.cc:2619] New master addresses: [node1:7100,node2:7100,node3:7100,node4:7100,node5:7100,node6:7100]
W0211 01:50:38.217491 1294688 auto_flags_manager.cc:218] Loading AutoFlags from master Leader failed: 'Timed out (yb/rpc/rpc.cc:220): Could not locate the leader master: GetLeaderMasterRpc(addrs: [node1:7100,node2:7100,node3:7100,node4:7100,node5:7100,node6:7100], num_attempts: 293) passed its deadline 429985.308s (passed: 60.177s): Not found (yb/master/master_rpc.cc:286): no leader found: GetLeaderMasterRpc(addrs: [node1:7100,node2:7100,node3:7100,node4:7100,node5:7100,node6:7100], num_attempts: 1)'. Attempts: 32, Total Time: 1973684ms. Retrying...

I checked that cluster-1 is up and running, and the ports are open as well. Any leads on this would be helpful. Thanks!

Please describe exactly which commands you ran.

Hi @dorian_yugabyte, Thanks for your response! Please find the details below. The commands were run on the new node.

To start yb-master, I ran the command below:

yugabyte-2.20.5.0/bin/yb-master  --fs_data_dirs=/storage/yugabyte_data/data --webserver_interface=127.0.0.1  --rpc_bind_addresses=127.0.0.1:7100 --server_broadcast_addresses=127.0.0.1:7100 --server_dump_info_path=/storage/yugabyte_data/data/master-info --webserver_port=7000  --ysql_conn_mgr_port=5433

To start yb-tserver, I ran the command below:

export MASTERS=node1:7100,node2:7100,node3:7100,node4:7100,node5:7100,node6:7100

yugabyte-2.20.5.0/bin/yb-tserver --fs_data_dirs=/storage/yugabyte_data/data --webserver_interface=127.0.0.1 --rpc_bind_addresses=127.0.0.1:9100 --server_broadcast_addresses=127.0.0.1:9100 --cql_proxy_bind_address=127.0.0.1:9042 --server_dump_info_path=/storage/yugabyte_data/data/tserver-info --start_pgsql_proxy --start_redis_proxy=false --pgsql_proxy_bind_address=127.0.0.1:5433 --webserver_port=9000 --tserver_master_addrs=$MASTERS --certs_dir=/etc/conf/yugabyte/certs/

It’s a bit unclear: are you using yugabyted or not?

I ran the commands provided above on the new nodes, following this document: Change cluster configuration | YugabyteDB Docs.
I’m not using yugabyted to start the processes on the new nodes. Would you suggest using yugabyted?

@parvez Here are the steps we tried out internally. You can use these steps by updating the advertise_address, cloud_location, and base_dir to match your setup.

  • Please let us know at which of the steps below you are facing the issue.
  • If you started the original cluster with yugabyted, you cannot use the yb-master and yb-tserver CLIs to start the new set of nodes. It is not supported.

Started an RF-3 cluster

./bin/yugabyted start --base_dir=/tmp/ybd1 --advertise_address=127.0.0.1 --cloud_location=aws.us-east1.us-east-2a
./bin/yugabyted start --base_dir=/tmp/ybd2 --advertise_address=127.0.0.2 --join=127.0.0.1 --cloud_location=aws.us-east1.us-east-2a
./bin/yugabyted start --base_dir=/tmp/ybd3 --advertise_address=127.0.0.3 --join=127.0.0.1 --cloud_location=aws.us-east1.us-east-2a

Added some dummy data to the cluster

Add 3 nodes

./bin/yugabyted start --base_dir=/tmp/ybd4 --advertise_address=127.0.0.4 --join=127.0.0.1 --cloud_location=aws.us-east2.us-east-2a
./bin/yugabyted start --base_dir=/tmp/ybd5 --advertise_address=127.0.0.5 --join=127.0.0.1 --cloud_location=aws.us-east2.us-east-2a
./bin/yugabyted start --base_dir=/tmp/ybd6 --advertise_address=127.0.0.6 --join=127.0.0.1 --cloud_location=aws.us-east2.us-east-2a

Blacklist old nodes

export MASTERS=127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100
./build/latest/bin/yb-admin -master_addresses $MASTERS change_blacklist ADD 127.0.0.1:9100 127.0.0.2:9100 127.0.0.3:9100

Wait until this command gives 100% (all data moved to new nodes)

./build/latest/bin/yb-admin -master_addresses $MASTERS get_load_move_completion
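If you would rather poll than rerun it by hand, a simple loop works (a sketch; the exact output text of get_load_move_completion may differ by version, so adjust the grep pattern accordingly):

# Poll the load-move progress every 10 seconds until it reports 100%
while ! ./build/latest/bin/yb-admin -master_addresses $MASTERS get_load_move_completion | grep -q '100'; do
  sleep 10
done
echo "Data move complete"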

Run these one at a time, and check the master UI to make sure masters are added/removed

export MASTERS=127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100,127.0.0.4:7100,127.0.0.5:7100,127.0.0.6:7100
./build/latest/bin/yb-admin -master_addresses $MASTERS change_master_config ADD_SERVER 127.0.0.4 7100
./build/latest/bin/yb-admin -master_addresses $MASTERS change_master_config REMOVE_SERVER 127.0.0.1 7100
./build/latest/bin/yb-admin -master_addresses $MASTERS change_master_config ADD_SERVER 127.0.0.5 7100
./build/latest/bin/yb-admin -master_addresses $MASTERS change_master_config REMOVE_SERVER 127.0.0.2 7100
./build/latest/bin/yb-admin -master_addresses $MASTERS change_master_config ADD_SERVER 127.0.0.6 7100
./build/latest/bin/yb-admin -master_addresses $MASTERS change_master_config REMOVE_SERVER 127.0.0.3 7100

Confirm list of masters only contains new nodes

export MASTERS=127.0.0.4:7100,127.0.0.5:7100,127.0.0.6:7100
./build/latest/bin/yb-admin -master_addresses $MASTERS list_all_masters

Restart nodes to update the master addrs on each tserver

./bin/yugabyted stop --base_dir=/tmp/ybd4
./bin/yugabyted start --base_dir=/tmp/ybd4
./bin/yugabyted stop --base_dir=/tmp/ybd5
./bin/yugabyted start --base_dir=/tmp/ybd5
./bin/yugabyted stop --base_dir=/tmp/ybd6
./bin/yugabyted start --base_dir=/tmp/ybd6

Once everything is OK on the master/tserver UI, we can delete the old nodes and remove them from the blacklist

./bin/yugabyted destroy --base_dir=/tmp/ybd1
./bin/yugabyted destroy --base_dir=/tmp/ybd2
./bin/yugabyted destroy --base_dir=/tmp/ybd3

Remove blacklist

./build/latest/bin/yb-admin -master_addresses $MASTERS change_blacklist REMOVE 127.0.0.1:9100 127.0.0.2:9100 127.0.0.3:9100

Hi @nmalladi, Thank you for sharing the detailed steps. I’ll give them a try and let you know how it goes!

In our setup, Cluster-1 is started with ‘yugabyted’, while for Cluster-2, I’m trying to start it with ‘yb-master’ and ‘yb-tserver’. I appreciate your confirmation that this approach is not supported.

Hi @nmalladi, I have a few questions about the above process. Please educate me here:

  1. At the ‘Add 3 nodes’ step: the new set of nodes (node4, node5, node6) will join cluster-1 (node1, node2, node3). Since RF=3 is already met, the new nodes will have the yb-tserver process running, right? The master leader will then see all 6 tablet servers (node1 to node6), correct?
    1a) In this case, what happens on the new set of nodes (node4 to node6)? Is data being re-distributed (rebalanced)?
    1b) Will the new set of nodes be in an active-active state (accepting read and write operations)?

  2. At the ‘Blacklist old nodes’ step:
    a) What is the blacklist? What happens when the blacklist is executed?
    b) When data is moving from the old cluster (node1-node3) to the new cluster (node4-node6):

  • How can I monitor the process? Is it logged?
  • If the data move is not progressing, or is stalled/stuck, is there any way to re-trigger the process? Any troubleshooting documentation?
  • During the data move, what will happen to the application queries, since the data is still in the migrating phase?
  • Is there a way to selectively migrate databases?

  3. Is there a rollback procedure?
  4. Does this process require downtime at any stage?

Thank you!

Yes. They will also have a “sleeping” yb-master that is ready to join the cluster if you lose one.

All yb-tservers join the cluster. The metadata will be in the yb-master leader & replicated to yb-master peers.

Yes. By default, the cluster tries to load balance tablet-leaders & peers across all yb-tservers. So if you lose any yb-tserver you have as little downtime as possible (only 1/n of the tablet leaders are offline until leaders are migrated to other nodes).
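If you want to watch this from the CLI in addition to the UI, the per-server tablet-peer and leader counts are visible with (placeholder master addresses; the exact columns vary slightly by version):

yb-admin -master_addresses $MASTERS list_all_tablet_servers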

All nodes will accept (& redirect) writes & reads. By default, all writes/reads are served by the tablet-leader where the row resides.

See yb-admin - command line tool for advanced YugabyteDB administration | YugabyteDB Docs. You are adding the yb-tserver to a blacklist so it doesn’t host data and starts migrating its tablets to other yb-tservers.
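The blacklist is part of the cluster config, so you can inspect it with (a sketch; the blacklisted tservers appear as a section of the printed config):

yb-admin -master_addresses $MASTERS get_universe_config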

The cluster is the same; you’re just adding new nodes. First it balances all tablets across all nodes. Then you add the old nodes to the blacklist, and it will start to balance tablets evenly across all non-blacklisted servers.

See yb-admin - command line tool for advanced YugabyteDB administration | YugabyteDB Docs

The load balancer runs all the time; the trigger is automatic. The UI will show “Cluster not balanced”. We would have to look at the logs to see why it isn’t balancing.
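Besides the UI, you can ask the master directly whether the load balancer still has work pending (assuming a reasonably recent yb-admin):

yb-admin -master_addresses $MASTERS get_is_load_balancer_idle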

Worst case, you might get a failing query that can be retried, but it should be transparent to the app. See Transaction retries in YSQL | YugabyteDB Docs for the general cases.
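If you want a crude safety net during the move, something like this purely illustrative wrapper (ysqlsh against a placeholder node and a hypothetical my_table) retries a statement a few times:

# Illustrative only: retry a statement up to 3 times if it fails mid-move
for attempt in 1 2 3; do
  ysqlsh -h 127.0.0.4 -p 5433 -c "SELECT count(*) FROM my_table;" && break
  sleep 1
done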

Migration happens on a per-tablet basis, and it is throttled so it doesn’t go too slowly or too fast. There might be a way with a complex tablespaces setup, but why do you think you might need this?

There’s no “rollback”: you added the new servers, verified they look fine, and removed the old servers.
You can do the reverse using the same steps: add the old servers back and remove the new servers.
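In other words, the “rollback” is just the same procedure in reverse (a sketch mirroring the commands above, assuming the old nodes are still running and have been removed from the tserver blacklist first):

yb-admin -master_addresses $MASTERS change_blacklist ADD 127.0.0.4:9100 127.0.0.5:9100 127.0.0.6:9100
yb-admin -master_addresses $MASTERS change_master_config ADD_SERVER 127.0.0.1 7100
# ...then REMOVE_SERVER the new masters one at a time, as in the forward steps.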

No. Downtime happens when a server goes down unexpectedly. Here, we remove all data from the yb-tservers (by adding them to the blacklist) before turning them off.