Right now, we want to start with a single data center Yugabyte deployment.
In the future we plan to support multiple physical data centers that are geo-replicated.
But we are not sure how to move seamlessly from a single-DC setup to a multi-DC setup with geo-replication. Is it possible to simply add an existing cluster to a geo-replicated setup and have Yugabyte automatically spread the data into the new DCs? Or what would be the best way to do so?
You can add new nodes from the other DC to the existing cluster. Data will be replicated to them automatically, and all nodes (from both DCs) will accept reads and writes. As long as all nodes can communicate with each other at the network level, it should just work. Data replication is synchronous in this case, so query time will include the network latency between the DCs.
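To make the latency cost of synchronous replication concrete, here is a minimal sketch of how commit latency can be estimated under generic majority-quorum (Raft-style) replication. The function name, parameters, and round-trip numbers are hypothetical and for illustration only; this is not a Yugabyte API, and real latencies depend on where the replicas and leaders are placed.

```python
def quorum_commit_latency_ms(follower_rtts_ms, local_ms=0.0):
    """Estimate write commit latency for a leader replicating to followers.

    follower_rtts_ms: round-trip times (ms) from the leader to each follower.
    local_ms: hypothetical fixed local processing cost on the leader.
    Assumes a write commits once a majority of replicas (leader included)
    has acknowledged it.
    """
    replicas = len(follower_rtts_ms) + 1      # followers plus the leader
    majority = replicas // 2 + 1
    acks_needed = majority - 1                # the leader counts toward the majority
    if acks_needed == 0:
        return local_ms
    # The commit waits for the acks_needed-th fastest follower.
    return local_ms + sorted(follower_rtts_ms)[acks_needed - 1]


# Three replicas, one follower in the leader's DC (0.5 ms), one remote (40 ms):
same_dc_quorum = quorum_commit_latency_ms([0.5, 40.0])
# Three replicas, both followers in remote DCs (40 ms each):
cross_dc_quorum = quorum_commit_latency_ms([40.0, 40.0])
```

In this sketch the first layout can commit without waiting for the remote DC, while the second pays the full inter-DC round trip on every write. That difference is the latency the answer refers to.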
If the elevated transaction time is not acceptable because of high latency between the DCs (for example, >2ms), you can consider async replication. This is done by creating a new cluster in the new DC (DC2) and setting up one-way or two-way replication between the two clusters.
Depending on your application/site requirements (Hot/Hot, Hot/Warm, etc.; query time requirements; RPO/RTO requirements), you can adopt different deployment topologies.
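The choice described above can be sketched as a small decision helper. Everything here is an assumption for illustration: the `Requirements` fields, the latency budget, and the mapping of Hot/Hot to two-way and Hot/Warm to one-way async replication are not taken from Yugabyte documentation, and RPO/RTO requirements are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    inter_dc_rtt_ms: float        # measured round trip between the DCs
    max_added_latency_ms: float   # hypothetical budget for extra query latency
    both_dcs_take_writes: bool    # True for Hot/Hot, False for Hot/Warm


def suggest_topology(req: Requirements) -> str:
    """Return a rough topology suggestion; a sketch, not a sizing tool."""
    # If the inter-DC latency fits the budget, one synchronous cluster suffices.
    if req.inter_dc_rtt_ms <= req.max_added_latency_ms:
        return "single stretched cluster (synchronous replication)"
    # Otherwise fall back to separate clusters with async replication.
    if req.both_dcs_take_writes:
        return "two clusters with two-way async replication"
    return "two clusters with one-way async replication"


# Example: 30 ms between DCs, a 2 ms budget, only DC1 takes writes.
suggestion = suggest_topology(Requirements(30.0, 2.0, False))
```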