I would like to understand how YugabyteDB stores data at a low level. As I understand it, YugabyteDB is built on DocDB and RocksDB, so in the end it stores data as key-value pairs. In short, the key is built from the primary key columns (the doc key).
The value is stored as a document with subkeys and corresponding values.
Do I understand correctly that YugabyteDB stores each cell separately, so that each column is a separate key-value pair?
Suppose I have this table:
CREATE TABLE sample
(
id INT NOT NULL,
name INT NOT NULL,
doc jsonb NOT NULL,
creation_date timestamp,
PRIMARY KEY (id)
);
One record will result in the following key-value pairs:
DocumentKey1, T10 -> {} // This is an init marker
DocumentKey1, name, T10 -> Value1
DocumentKey1, doc, T10 -> Value2
DocumentKey1, creation_date, T10 -> Value3
As I understand it, it is not possible to define column families on my own. Is that correct?
Hi, yes, having one key for each column value was how it worked until we implemented packed rows. Now, when you insert, each SQL row corresponds to a single key-value pair in RocksDB. When you update a column, it creates an additional key-value pair holding the new column value, because the version timestamp is part of the key. After compaction, these are repacked into a single key-value pair per row.
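For anyone who wants to see the difference side by side, here is a rough Python sketch of the two layouts. It is only a toy model: real DocDB keys and values are binary-encoded, and all function names, the tuple key shape, and the sample values are made up for illustration. It also assumes compaction keeps only the merged row at the latest timestamp, which ignores history retention.

```python
# Toy model only: real DocDB keys and values are binary-encoded, not Python tuples.
from typing import Any, Dict, Optional, Tuple

# Hypothetical key shape: (document key, column name or None, version timestamp).
# A column of None stands for the init marker (per-column layout)
# or for the whole packed row (packed-row layout).
Key = Tuple[str, Optional[str], int]
Store = Dict[Key, Any]


def insert_per_column(store: Store, doc_key: str, row: Dict[str, Any], ts: int) -> None:
    """Layout before packed rows: an init marker plus one entry per non-key column."""
    store[(doc_key, None, ts)] = {}
    for column, value in row.items():
        store[(doc_key, column, ts)] = value


def insert_packed(store: Store, doc_key: str, row: Dict[str, Any], ts: int) -> None:
    """Packed-row layout: the whole row becomes a single entry."""
    store[(doc_key, None, ts)] = dict(row)


def update_column(store: Store, doc_key: str, column: str, value: Any, ts: int) -> None:
    """An update adds a new entry; the timestamp in the key keeps it distinct."""
    store[(doc_key, column, ts)] = value


def compact_packed(store: Store, doc_key: str) -> None:
    """Simplified compaction: fold every entry of a row into one packed entry."""
    entries = sorted((k for k in store if k[0] == doc_key), key=lambda k: k[2])
    if not entries:
        return
    merged: Dict[str, Any] = {}
    for key in entries:  # oldest first, so newer values overwrite older ones
        value = store.pop(key)
        if key[1] is None:
            merged.update(value)
        else:
            merged[key[1]] = value
    store[(doc_key, None, entries[-1][2])] = merged


row = {"name": 1, "doc": '{"a": 1}', "creation_date": "2024-01-01"}

per_column: Store = {}
insert_per_column(per_column, "DocumentKey1", row, ts=10)
print(len(per_column))  # 4 entries: init marker + 3 columns

packed: Store = {}
insert_packed(packed, "DocumentKey1", row, ts=10)
print(len(packed))  # 1 entry for the whole row
update_column(packed, "DocumentKey1", "name", 2, ts=20)
print(len(packed))  # 2 entries: packed row + updated column
compact_packed(packed, "DocumentKey1")
print(len(packed))  # 1 entry again after compaction
```

The point of the sketch is just the entry counts: four entries per inserted row in the old layout versus one with packed rows, with updates temporarily adding entries until compaction merges them.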
Hi, thanks for the explanation, it is clear now. I've also just run into the page "Packed rows in DocDB | YugabyteDB Docs", where it is clearly described.