Why are there already 8 tablets in the wals sub-directory under the data directory after a database cluster is initialized?

Hi
I have a question for you: after a database cluster is initialized, there are already 8 tablets in the wals sub-directory under the data directory, even though I haven’t created any tables, and these tablets appear to be system-level. What are they for?

They are system tables: the “transactions” table (needed for distributed transactions), “metrics”, “sequences”, and soon pg_cron, etc.


I got it now, thanks.
I have another question to ask here.

Are all the columns of a table stored together? When querying, if I only need a specific column, does it still have to read all the columns?
e.g.
Execute the following SQL statements:
CREATE TABLE hash (k1 int, k2 int, k3 int, v1 int, v2 int, PRIMARY KEY ((k1, k2) hash, k3 asc));
INSERT INTO hash VALUES (1, 2, 3, 4, 5), (6, 7, 8, 9, 10);

The data is as follows:
(hash1, [1,2], 3), system_column_id, T1 → [NULL]
(hash1, [1,2], 3), v1_column_id, T1 → 4
(hash1, [1,2], 3), v2_column_id, T1 → 5

(hash1, [6,7], 8), system_column_id, T1 → [NULL]
(hash1, [6,7], 8), v1_column_id, T1 → 9
(hash1, [6,7], 8), v2_column_id, T1 → 10

I only need the data from the v2 column; when I query it, will it also read v1 and then filter it out?
If it can’t read v2 alone, could I do this at storage time:
Store v2 separately, but keep this transparent at the user level. The reason is that sometimes I need to pull the data from column v2 on its own, without reading the other columns together and filtering them out.
The final storage model is probably like this:
(hash1, [1,2], 3), system_column_id, T1 → [NULL]
(hash1, [1,2], 3), v1_column_id, T1 → 4
(specific_col_hash1, [1,2], 3), v2_column_id, T1 → 5

(hash1, [6,7], 8), system_column_id, T1 → [NULL]
(hash1, [6,7], 8), v1_column_id, T1 → 9
(specific_col_hash1, [6,7], 8), v2_column_id, T1 → 10
Would there be any problem storing it like this? Could the specific_col_hash1 rows then be queried independently?
Thanks a lot.

Yes. For more details read:

They are stored together, and cached in blocks together. But the unneeded columns will be filtered out in docdb/rocksdb.

You need to store them in separate tables. This may be an unneeded optimization, though.
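For illustration, a hypothetical vertical split might look like this (the table names hash_main and hash_v2 are made up; both tables repeat the same primary key):

```sql
-- Hypothetical split: the heavy column lives in its own table that
-- shares the primary key definition of the main table.
CREATE TABLE hash_main (k1 int, k2 int, k3 int, v1 int,
                        PRIMARY KEY ((k1, k2) HASH, k3 ASC));
CREATE TABLE hash_v2   (k1 int, k2 int, k3 int, v2 int,
                        PRIMARY KEY ((k1, k2) HASH, k3 ASC));

-- Reading only v2 touches just hash_v2:
SELECT v2 FROM hash_v2 WHERE k1 = 1 AND k2 = 2;

-- Reading the full row now needs a join:
SELECT m.v1, s.v2
FROM hash_main m
JOIN hash_v2 s
  ON m.k1 = s.k1 AND m.k2 = s.k2 AND m.k3 = s.k3;
```

Writes also have to hit both tables, which is part of the overhead of this approach.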

Why is the optimization not necessary? I’d like to hear your suggestions, thanks a lot.
If they are stored in separate tables, then every query may have to do a join on both tables, right?
That said, in the current storage model all the columns are stored together and cached in blocks together, but they are filtered in docdb/rocksdb. So compared with splitting a column out and storing it separately, wouldn’t the query performance be a little worse?
That is to say, as with columnar storage, if I only need to read the independently stored column, the other columns aren’t involved at all, so maybe the cache doesn’t need to hold them?

Because usually there are other things that need optimization before this (in a typical app).
And if it’s needed (after measuring), then you can create a separate table for the heavy column(s).

That or separate queries.

Depends on what exactly you mean by “splitting”. Having a separate table also has its overheads, so again, it depends on the exact case. A benchmark can be done to support both cases by changing the frequency of accessing or modifying the “big” column(s).

Yes.
Columnar engines (storing each column separately) are good for OLAP but not generally for OLTP.
In OLTP you most often need the full row, and small selects of a few rows, for which this type of storage is better.


Yes.
If it’s a vector column used for ANN (approximate nearest neighbor) scenarios, that’s effectively OLAP, so would it be better to store it independently?

I need to know the query patterns to answer. If it’s large scans and queried mostly separately then maybe yes.

If it’s using an index, then maybe the index is acting as a separate table.

@ZhenNan2016 When there’s a need to improve performance when getting a small subset of columns for a large row table, an index that covers only the required columns may help as it doesn’t have to read the table (Index Only Scan)

Zhen Nan -

If you turn off packed rows via the GFlag (ysql_enable_packed_row), DocDB reverts to storing each column as a separate record, since the column number is part of the DocKey. We built the packed rows feature as an optimization to reduce the number of seeks and puts when reading from and writing to the database. In fact, JSONB in packed rows continues to be stored separately from the non-JSONB columns. When writes (updates) are done to a packed row, we actually write the columns separately in DocDB (unless all non-key columns are updated), and they are later “re-packed” during compaction.
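For reference, this is a yb-tserver flag; a sketch of how it might be set (verify the default for your version):

```
--ysql_enable_packed_row=false
```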
So as Dorian points out, if you want to limit the amount of data retrieved and have it “separated” when making a read, then you could use a covering index (one that uses INCLUDE columns) to read that specific column via an Index Only Scan.
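For the example table above, such a covering index might look like this (the index name is made up):

```sql
-- Hypothetical covering index: v2 is stored in the index entries via
-- INCLUDE, so a query that reads only v2 (filtered on the key columns)
-- can be answered without visiting the base table.
CREATE INDEX hash_v2_idx ON hash ((k1, k2) HASH, k3 ASC) INCLUDE (v2);

EXPLAIN SELECT v2 FROM hash WHERE k1 = 1 AND k2 = 2;
-- The plan should show an Index Only Scan using hash_v2_idx.
```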

Hope this helps.
–Alan

Hi @Alan_Caldera
When reading data, if you only need a certain column, is it possible to read just that column from DocDB? It doesn’t need to read all the columns, right?
By the way, do we support partial fields for packed rows?

Also, if I store it according to the model shown below:
(hash1, [1,2], 3), system_column_id, T1 → [NULL]
(hash1, [1,2], 3), v1_column_id, T1 → 4
(specific_col_hash1, [1,2], 3), v2_column_id, T1 → 5

(hash1, [6,7], 8), system_column_id, T1 → [NULL]
(hash1, [6,7], 8), v1_column_id, T1 → 9
(specific_col_hash1, [6,7], 8), v2_column_id, T1 → 10
So, the data with the key prefix “specific_col_hash1” will eventually be “re-packed” during compaction?

Hi @FranckPachot
Yes, I think the solution is great. However, my leaders would like to implement it in a different way.

Hi
What does the actual data storage model look like with packed rows?
Is there a demo of the actual data?
Is the model below what the data looks like after packing?
e.g.
(hash1, [1,2], 3), system_column_id, T1 → [NULL]
(hash1, [1,2], 3), v1_column_id, T1 → 4
(hash1, [1,2], 3), v2_column_id, T1 → 5
packed row: <(hash1, [1,2], 3), packed {[NULL], 4, 5}>, right?
(hash1, [6,7], 8), system_column_id, T1 → [NULL]
(hash1, [6,7], 8), v1_column_id, T1 → 9
(hash1, [6,7], 8), v2_column_id, T1 → 10
packed row: <(hash1, [6,7], 8), packed {[NULL], 9, 10}>, right?

Data is stored as blocks in rocksdb/docdb, so it will read/decompress the whole “block” from disk, and only return the needed column over RPC.

Yes, updates to packed rows are merged into the packed rows on compaction.

Did you read Packed rows in DocDB | YugabyteDB Docs?

  • Inserts: Entire row is stored as a single key-value pair.

Yes.

The covering index scenario doesn’t work for the “big column, rarely accessed” case, because it duplicates the data (and it’s a big column).

Hi
Thanks for your reply.
By the way, do we support partial fields for packed rows?

What do you mean by partial fields?

For example, there are four columns A, B, C, and D. I only pack A, B, and C, and column D is stored separately.
e.g.
(hash1, [6,7], 8), system_column_id, T1 → [NULL]
(hash1, [6,7], 8), v1_column_id, T1 → 9
(hash1, [6,7], 8), v2_column_id, T1 → 10

packed row: <(hash1, [6,7], 8), packed { [NULL],9}>
and non-packed: (hash1, [6,7], 8), v2_column_id, T1 → 10
Thanks a lot.

  • Updates: If some column(s) are updated, then each such column update is stored as a key-value pair in DocDB (same as without packed rows). However, if all non-key columns are updated, then the row is stored in the packed format as one single key-value pair. This scheme adopts both efficient updates and efficient storage.

Does this mean that if there are four columns A, B, C, and D, where B, C, and D are non-primary-key columns, and only column B is updated, then the row is not stored in the packed format as a single key-value pair?

The original row (INSERT) is packed, so you have 1 kv.

  • Case 1: Updating ONLY column B writes another kv, so now you have 2 kv.
  • Case 2: Updating columns B and C writes 2 kv, so now you have 3 kv.
  • Case 3: Updating all of columns B, C, and D writes 1 packed kv, so now you have 2 kv.

On compaction, in all cases, they merge into 1 kv.
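In SQL terms, the cases might look like this (each UPDATE shown as an independent case against the freshly inserted row; the table and column names are made up):

```sql
CREATE TABLE t (k int PRIMARY KEY, b int, c int, d int);
INSERT INTO t VALUES (1, 2, 3, 4);        -- packed insert: 1 kv

UPDATE t SET b = 20 WHERE k = 1;          -- case 1: +1 kv → 2 kv
UPDATE t SET b = 20, c = 30 WHERE k = 1;  -- case 2: +2 kv → 3 kv
UPDATE t SET b = 20, c = 30, d = 40
WHERE k = 1;                              -- case 3: all non-key columns
                                          -- updated, written packed:
                                          -- +1 kv → 2 kv
```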
