Hi @ivan_s,
It might be easier to get on a call to discuss your application, I am trying to piece the details together but it seems a little hard. But here’s what I understand so far.
You have a chat application, which has:
- users
- Lets say these are u1, u2, …
- There can be many users (millions)
- categories
- Each user can chat with other users in multiple categories. As an example, users u1 and u2 can have separate chat histories in categories c1 and c2.
- There are a handful of categories, say 100 or so.
- messages
- Each chat in a category will have a list of messages. For example, (user u1, category c1) can have a chat history for (u1-u2) which (user u1, category c2) can have a separate chat history for (u1-u2)
- Each chat history and each category would have an unread count for a user, which you are trying to make efficient.
If the above is correct, here is a proposal. Note that the following proposal uses hash and range sharding features of YugabyteDB, which helps you achieve scale and performance respectively.
Schema
Let’s go with 2 tables:
user_category_overview
which contains the unread message counts for each user per category
user_category_messages
which contains the list of messages, etc for each user per category
CREATE TABLE user_category_overview (
user_id UUID,
category_id INT,
unread_count INT DEFAULT 0,
PRIMARY KEY (user_id HASH, category_id ASC)
);
CREATE TABLE user_category_messages (
user_id UUID,
category_id INT,
message_id BIGINT, # use epoch time as the message id
is_unread INT DEFAULT 1, # 0 if message is read and 1 for unread
... # message_body, etc.
PRIMARY KEY (user_id HASH, category_id ASC, message_id DESC)
);
Note that the tables above have one column (user_id
) hash sharded for scalability purposes, while the other columns are sorted for ordered retrieval which brings the optimal balance between scale and perf.
Inserts
Inserting a new message into this table, you would insert a row into two tables:
BEGIN TRANSACTION
UPDATE user_category_overview
SET unread_count = unread_count + 1
WHERE user_id = '...' AND category_id = '...';
INSERT INTO user_category_messages (...) VALUES (...);
COMMIT
If a message is read, you would perform the following:
BEGIN TRANSACTION
UPDATE user_category_overview
SET unread_count = unread_count - 1
WHERE user_id = '...' AND category_id = '...';
UPDATE user_category_messages
SET is_unread = 0
WHERE user_id = '...' AND category_id = '...';
COMMIT
If messages can be deleted when unread in your application, you would have to perform that as a two table transaction when deleting an unread message.
Selects
For unread count per category simply do:
SELECT category_id, unread_count FROM user_category_overview
WHERE user_id = '...';
The above will fetch a small number of rows (only the categories where the user has messages).
For total unread counts, you should add the above in the client side, or you can also do:
SELECT SUM(unread_count) FROM user_category_overview
WHERE user_id = '...';
Which API to use
Note that all of the above is possible in both the YSQL and the YCQL API. Pick the API depending on your needs (app needs like joins, reading from nearest datacenter/region, latency you want, etc). This section will help you decide: FAQs about YugabyteDB API compatibility | YugabyteDB Docs. When in doubt, please pick YSQL.
Happy to talk more if you need help.