Hi, I have a 3-node YugabyteDB cluster on AWS with the i3.4xlarge instance type and a replication factor of 3.
I have been trying to do a simple COPY from a CSV file, but it hangs every time and I am not sure what I am doing wrong.
The CSV is 6.6 GB in size and has 58,566,067 rows.
Also, there are no error logs, but the COPY command did exit unexpectedly with this error:
You are now connected to database "pdns" as user "yugabyte".
ysqlsh:copy.sql:2: WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
ysqlsh:copy.sql:2: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
I retried but no luck. I know the writes are not happening because I checked the UI and write ops/sec stays at 0. If anyone could help, that would be great. Thanks!
Since the COPY command is transactional, it tries to import all 58M rows in one large transaction, and that is leading to the issue you're seeing. This is something we are actively looking to fix in the 2.1 release.
In the meantime, there are two options to work around this issue:
If your schema does not have any indexes, you can restart yb-tserver with the gflag ysql_non_txn_copy=true and import the data using COPY.
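A rough sketch of that first workaround, assuming you manage yb-tserver yourself (the master address and data directory below are placeholders, not from this thread):

```
# Restart the tserver with non-transactional COPY enabled.
# --tserver_master_addrs and --fs_data_dirs are placeholders for your setup.
./bin/yb-tserver \
  --tserver_master_addrs 127.0.0.1:7100 \
  --fs_data_dirs /mnt/d0 \
  --ysql_non_txn_copy=true
```

Keep in mind that with this flag the load is no longer atomic, so if the COPY fails partway, the table can be left partially loaded and you may need to truncate and retry.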
If your schema has secondary indexes, the non-transactional copy above won't work. In that case, you can use multiple INSERT statements instead of COPY to load the data. How did you create the CSV file? If you have this data in another PostgreSQL cluster, you can use ysql_dump to dump the DB contents into SQL statements and then import those into YugabyteDB.
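One way to turn an existing CSV into individual INSERT statements is a small awk one-liner. This is just a sketch: the table name `records` and its columns are made up for illustration, and it assumes simple values with no embedded commas or quotes (a real CSV with quoting would need a proper parser).

```shell
# Create a tiny sample CSV (stand-in for the real 6.6 GB file).
printf 'a.example.com,A,1.2.3.4\nb.example.com,A,5.6.7.8\n' > records.csv

# Convert each CSV row into an INSERT statement.
# -v q="'" passes a single-quote character in, to avoid shell quoting gymnastics.
awk -F',' -v q="'" \
  '{printf "INSERT INTO records (name, type, content) VALUES (%s%s%s, %s%s%s, %s%s%s);\n", q, $1, q, q, $2, q, q, $3, q}' \
  records.csv > inserts.sql

cat inserts.sql
```

You could then load the generated file with something like `ysqlsh -d pdns -f inserts.sql`, ideally split into batches rather than one giant file, so no single transaction has to cover all 58M rows.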