16 nodes in a region
Application in central region is configured to talk to master(leader) in central region
Application is using YCL driver(gocql), so the table is currently hash based partitioned
Given the requirement with huge traffic of SELECT query(below) and less traffic of INSERT & UPDATE queries:
SELECT userID, col3, col4, col5, col6, col7
FROM table1
WHERE domainID = :domain_id AND userID IN :user_id_set
Database currently has schema(below) with hash based partition:
CREATE TABLE keySpaceName.table1 (
domainID VARCHAR,
userID VARCHAR,
col3 VARCHAR,
col4 VARCHAR,
col5 VARCHAR,
col6 VARCHAR,
col7 VARCHAR,
PRIMARY KEY((userID, domainID), col3, col4));
Note: One domain can have multiple users. Thousands of users in a domain.
For some historical reasons, our schema was badly partitioned using PRIMARY KEY((userID, domainID), col3, col4))
, which lead to SELECT queries getting huge timeouts with 16 node cluster(hosting hash based partitioned table1
). Need to regularly restart application.
So, refactoring PRIMARY KEY((userID, domainID), col3, col4))
to PRIMARY KEY((domainID), userID) WITH CLUSTERING ORDER BY (userID ASC);
. Table gets created with hash partitioning using key domainID
. Clustering key userID
would be ordered ascending.
Partition key domainID
would enhance performance of search domainID = :domain_id
and
clustering key userID
would enhance performance of search userID IN :user_id_set
-
Please let me know, if this refactoring make sense, to enhance WHERE clause performance in SELECT query
-
How do INSERT query(to add a new row) & UPDATE query(to update an existing row) perform on this refactoring?