Avoiding hot-spots on timestamp range indexes?

[Question posted by a user on YugabyteDB Community Slack ]

I’m looking for help speeding up writes to an timestamp based index. We’re using a table like the following:

CREATE TABLE demo (id UUID PRIMARY KEY, ts TIMESTAMP NOT NULL);
CREATE INDEX demo_ts_idx ON demo (ts ASC);

This causes us to have all of our writes go to the same tablet and performance takes a big hit.

A way to spread the writes into different tablets, is to implement app-level sharding by adding a ts_shard INT column which will help to spread the writes & reads. We hash the primary key into, say, 16-32 virtual partitions, and store the number on this column to use it as the partitioning key of the index.

When reading, we also have to query for all partitions and the db will gather+merge+return rows.

See example below:

CREATE TABLE demo (id UUID PRIMARY KEY, ts TIMESTAMP NOT NULL, ts_shard INT NOT NULL);
CREATE INDEX demo_ts_idx ON demo (ts_shard, ts ASC);

yugabyte=# EXPLAIN SELECT * FROM demo WHERE ts_shard IN(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15) AND ts < now();
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using demo_ts_idx on demo  (cost=0.00..5.32 rows=10 width=28)
   Index Cond: ((ts_shard = ANY ('{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}'::integer[])) AND (ts < now()))
(2 rows)

Currently we miss an optimization if we want to use OR when querying the ts_shard values:

-- not optimized

yugabyte=# EXPLAIN SELECT * FROM demo WHERE ts_shard BETWEEN 0 and 16 AND ts < now();
                            QUERY PLAN
-------------------------------------------------------------------
 Foreign Scan on demo  (cost=0.00..110.00 rows=1000 width=28)
   Filter: ((ts_shard >= 0) AND (ts_shard <= 16) AND (ts < now()))
(2 rows)

We are tracking that issue here: https://github.com/yugabyte/yugabyte-db/issues/3131