SELECT with yb_hash_code() and DELETE

We have below schema in postgres (yugabyte DB 2.8.3) using YSQL:

CREATE TABLE IF NOT EXISTS public.table1

(

customer_id uuid NOT NULL ,

item_id uuid NOT NULL ,

kind character varying(100) NOT NULL ,

details character varying(100) NOT NULL ,
created_date timestamp without time zone NOT NULL,
modified_date timestamp without time zone NOT NULL,

CONSTRAINT table1_pkey PRIMARY KEY (customer_id, kind, item_id)

);

CREATE UNIQUE INDEX IF NOT EXISTS unique_item_id ON table1(item_id);

CREATE UNIQUE INDEX IF NOT EXISTS unique_item ON table1(customer_id, kind) WHERE kind='NEW' OR kind='BACKUP';


CREATE TABLE  IF NOT EXISTS public.item_data
(
    item_id uuid NOT NULL,
    id2 integer NOT NULL,
    create_date timestamp without time zone NOT NULL,
    modified_date timestamp without time zone NOT NULL,
        CONSTRAINT item_data_pkey PRIMARY KEY (item_id, id2)
);


Goal:
Step 1) Select item_id’s from table1 WHERE modified_date < someDate
Step 2) DELETE FROM table item_data WHERE item_id = any of those item_id’s from step 1


Currently we use query SELECT item_id FROM table1 WHERE modified_date < $1 ,

Can the SELECT query apply yb_hash_code(item_id) with SELECT query? Because table1 is indexed on item_id ? to enhance performance of the SELECT query

Currently we perform,
DELETE FROM item_data x WHERE x.item_id IN the listOfItemIds(provided in Step1 above).

With the given listOfItemIds, can we use yb_hash_code(item_id) to enhance performance of DELETE operation? Syntax…

Hi Sham,

Why not directly delete from public.table1 where kind='OLD' ?

@FranckPachot

My bad…
it is modified_date but not kind in WHERE clause

Updated question

Yes, it should work out. Something like:

SELECT item_id FROM item_data WHERE yb_hash_code(customer_id, kind, item_id) <= 128 AND yb_hash_code(customer_id, kind, item_id) >= 0 AND modified_date < x;

While you can combine the SELECT + DELETE in 1 query (like a subselect), this is probably better because it will result in smaller transactions.

No need to use yb_hash_code. The db should be able to find the correct rows since you’re sending the columns that are used for partitioning.

1 Like