Thank you so much for your responses.
I created a very "simple" table.
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE TABLE IF NOT EXISTS test_table (
id UUID default gen_random_uuid(),
var1 character varying(255),
var2 character varying(255),
var3 character varying(255),
var4 character varying(255),
var5 character varying(255),
var6 character varying(255),
var7 character varying(255),
var8 character varying(255),
var9 character varying(255),
var10 character varying(255),
var11 character varying(255),
var12 character varying(255),
var13 character varying(255),
val1 integer,
val2 integer,
val3 integer,
val4 integer,
val5 integer,
val6 integer);
Adding PRIMARY KEY (id) made little difference, since I'm only measuring insert rates…
I hardcoded a row into a Python script that generates the same row repeatedly (the id will be different for each row inserted) and writes a CSV file:
row = "var1 | var2 | var3 | var4 | var5 | var6 | var7 | var8 | var9 | var10 | var11 | var12 | var13 | 1 | 2 | 3 | 4 | 5 | 6"
The script creates a CSV file with the desired number of rows, then calls the copy_from function to insert that file (a rough sketch of both steps follows).
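Roughly, those two steps look like this (a minimal sketch, not my exact script; the file name, row count, connection string, and the stripped spaces around the '|' delimiter are placeholders/assumptions):

# Minimal sketch of the CSV generation and a single copy_from load.
# Assumptions: psycopg2, '|' delimiter without surrounding spaces,
# placeholder file name, row count, and connection string.
import psycopg2

ROW = "var1|var2|var3|var4|var5|var6|var7|var8|var9|var10|var11|var12|var13|1|2|3|4|5|6"
NUM_ROWS = 1500
CSV_FILE = "test_rows.csv"
COLUMNS = [f"var{i}" for i in range(1, 14)] + [f"val{i}" for i in range(1, 7)]

# Write the same hardcoded row NUM_ROWS times; id is not in the file,
# so the gen_random_uuid() default produces a fresh value for every row.
with open(CSV_FILE, "w") as f:
    for _ in range(NUM_ROWS):
        f.write(ROW + "\n")

# Load the whole file in one COPY via copy_from.
conn = psycopg2.connect("host=localhost port=5433 dbname=yugabyte user=yugabyte")  # placeholder DSN
with conn, conn.cursor() as cur, open(CSV_FILE) as f:
    cur.copy_from(f, "test_table", sep="|", columns=COLUMNS)
conn.close()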
I created another Python script that runs copy_from on the same CSV file from multiple threads (using the threading module). It captures the start and stop times around the copy_from calls and, once they have all completed, calculates the total rate (sketch below).
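The threaded driver has roughly this shape (again a sketch; the DSN, the one-connection-per-thread layout, and the bytes/elapsed rate formula are assumptions for illustration):

# Sketch of the threaded driver: N threads each copy_from the same CSV file,
# wrapped in a wall-clock timer to compute an overall MB/s figure.
import os
import threading
import time

import psycopg2

DSN = "host=localhost port=5433 dbname=yugabyte user=yugabyte"  # placeholder
CSV_FILE = "test_rows.csv"
NUM_THREADS = 8
COLUMNS = [f"var{i}" for i in range(1, 14)] + [f"val{i}" for i in range(1, 7)]

def load_file():
    # Each thread uses its own connection and copies the whole file once.
    conn = psycopg2.connect(DSN)
    with conn, conn.cursor() as cur, open(CSV_FILE) as f:
        cur.copy_from(f, "test_table", sep="|", columns=COLUMNS)
    conn.close()

threads = [threading.Thread(target=load_file) for _ in range(NUM_THREADS)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

total_bytes = os.path.getsize(CSV_FILE) * NUM_THREADS
print(f"{total_bytes} bytes in {elapsed:.2f} s -> {total_bytes / elapsed / 1e6:.1f} MB/s")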
I ran the same test script against both YugabyteDB and PostgreSQL, each set up as a single node with a single drive (the same SSD, /data01). The YugabyteDB rate is far lower: for a run of 1500 rows (5,928,200 bytes), YugabyteDB got about 4 MB/s while PostgreSQL got about 31 MB/s. Both tests used 8 threads loading the same CSV file, and for every test I deleted all the data in the table before rerunning.
Next I will try a very, very small file with many, many copy_from calls (roughly the variant sketched below)…
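That variant would look roughly like this (same assumptions as the sketches above; the small-file name and repeat count are placeholders):

# Sketch of the planned variant: a tiny CSV file, copied many times per thread.
import threading
import time

import psycopg2

DSN = "host=localhost port=5433 dbname=yugabyte user=yugabyte"  # placeholder
SMALL_FILE = "small_rows.csv"   # e.g. only a handful of rows
COPIES_PER_THREAD = 100
NUM_THREADS = 8
COLUMNS = [f"var{i}" for i in range(1, 14)] + [f"val{i}" for i in range(1, 7)]

def load_many_times():
    conn = psycopg2.connect(DSN)
    with conn, conn.cursor() as cur:
        # Reuse one connection per thread but issue many small COPY statements.
        for _ in range(COPIES_PER_THREAD):
            with open(SMALL_FILE) as f:
                cur.copy_from(f, "test_table", sep="|", columns=COLUMNS)
    conn.close()

threads = [threading.Thread(target=load_many_times) for _ in range(NUM_THREADS)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"elapsed: {time.time() - start:.2f} s")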