On-disk size differences between jsonb structures

I am experimenting with different types of models to find the best balance between performance and maintenance, so I started playing around with jsonb. I am trying to simulate storing measurements from sensors.

I have tested two schemas:

CREATE TABLE yuga_json_1
(
    country       INT    NOT NULL,
    province      INT    NOT NULL,
    id            BIGINT NOT NULL,
    yearmonth     INT    NOT NULL,
    measurement   jsonb  NOT NULL,
    creation_date timestamp,
    CONSTRAINT pk PRIMARY KEY (country ASC, province ASC, id ASC, yearmonth ASC)
)

where the measurement structure looks as follows:

{
  "<day_of_month": [
    {
      "p": 12,
      "q": 2,
      "value": 924.08
    },
collection of various measurements...

For this structure, I have stored around 291,463 records with 108,911,008 measurements in total.

Then I tried more records but a less nested jsonb structure.

CREATE TABLE yuga_json_2
(
    country       INT    NOT NULL,
    province      INT    NOT NULL,
    id            BIGINT NOT NULL,
    yearmonthday  INT    NOT NULL,
    measurement   jsonb  NOT NULL,
    creation_date timestamp,
    CONSTRAINT pk PRIMARY KEY (country ASC, province ASC, id ASC, yearmonthday ASC)
)

where the measurement structure looks as follows:

 [
    {
      "p": 12,
      "q": 2,
      "value": 924.08
    },
collection of various measurements...

For this structure, I have stored around 8,452,427 records with the same number of measurements in total.
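
To make the difference concrete, this is roughly how a single day of measurements ends up in each table (the values are made up, and only one measurement per day is shown):

-- yuga_json_1: one row per (sensor, month); the day of month is a key inside the JSONB
INSERT INTO yuga_json_1 (country, province, id, yearmonth, measurement, creation_date)
VALUES (1, 10, 12345, 202401,
        '{"15": [{"p": 12, "q": 2, "value": 924.08}]}',
        now());

-- yuga_json_2: one row per (sensor, day); the JSONB is just the array of measurements
INSERT INTO yuga_json_2 (country, province, id, yearmonthday, measurement, creation_date)
VALUES (1, 10, 12345, 20240115,
        '[{"p": 12, "q": 2, "value": 924.08}]',
        now());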

What is the source of such a difference? Am I measuring the space needed to store this data correctly?

Yugi -

Thanks for your question. There is a fairly simple explanation for the behavior.

When we first introduced JSONB into YugabyteDB, we wrote each attribute as a separate row in DocDB. So if you had a table with 50 columns, we would write 50 “rows” into DocDB. This led to a fair amount of write amplification in the database, and poorer performance (for both reads and writes) as the number of columns increased. The advantage of this format was the low cost of writing and querying a single column.
A couple of years ago, we introduced a new storage feature called “packed rows” (https://docs.yugabyte.com/stable/architecture/docdb/packed-rows/#overview). When “packed rows” is enabled (starting in version 2.20), the entire row is represented as a single key-value pair inside DocDB. The limit on the size of a row that we can “pack” is a single SST block, which is 32 KB. Anything beyond 32 KB will be written in the traditional storage format, where each attribute of the JSONB will be represented as a separate key-value in DocDB.
So, that is where the write amplification that you are noting is coming from.
Using the second schema where the row sizes are smaller (less than an SSTBlock) results in a much more compact format.
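
One rough way to check whether your monthly JSONB values are crossing that 32 KB threshold is to look at the serialized size of the column. pg_column_size reports the datum size rather than the exact DocDB footprint, but it is a reasonable proxy:

-- show the largest monthly JSONB values, in bytes
SELECT country, province, id, yearmonth,
       pg_column_size(measurement) AS jsonb_bytes
FROM yuga_json_1
ORDER BY jsonb_bytes DESC
LIMIT 10;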

Hope this helps.
–Alan


Hi Yugi,

Would you share the script you used to generate the data? I tried reproducing the issue you encountered, but both tables are the same size for me. Here is a link to the script I used: https://gist.github.com/kai-franz/1a70655617e075aa0b3ca40a52684077

Thanks,
Kai


Same here, I can’t reproduce it and would expect a similar size. What @Alan_Caldera described would be the case with many JSON columns, not with one large one.

I will try to check it one more time tomorrow and hopefully provide a simple script to reproduce it.

Looking at your script, my approach is a little bit different. I insert day by day into monthly records. I use INSERT ... ON CONFLICT to append the next day to the existing JSON when the row already exists.
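
Simplified, a daily load for the first table looks roughly like this (made-up values again):

INSERT INTO yuga_json_1 (country, province, id, yearmonth, measurement, creation_date)
VALUES (1, 10, 12345, 202401,
        '{"16": [{"p": 12, "q": 2, "value": 918.31}]}',
        now())
ON CONFLICT ON CONSTRAINT pk
-- || on two jsonb objects merges their top-level keys, so each new day gets appended
DO UPDATE SET measurement = yuga_json_1.measurement || EXCLUDED.measurement;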

Aah… so in the first case, where you are storing a month’s worth of measurements in a single JSONB, you are updating the same row/JSONB record 30 times (once for each day).

  • This approach won’t be very run-time efficient, since you are doing a read-modify-write of an ever-increasing JSONB each time. If each day’s JSONB is about x bytes, then you end up writing x + 2x + 3x + … + 30x = 465x bytes, instead of just 30x bytes (when storing each day separately).

  • Also, now that it is an update-heavy workload (instead of an insert-heavy workload), and YugabyteDB (like most other databases) keeps multiple versions of a row around for some time (both because it uses Log Structured Merge Tree based storage and because it does multi-version concurrency control to avoid read locks), the older versions aren’t immediately reclaimed (from a space perspective).

  • However, in order to compare apples to apples, you can force a manual compact_table to reclaim the space of older versions of the rows.

./yb-admin --master_addresses <ip1:7100>,<ip2:7100>,<ip3:7100> compact_table ysql.<db_name> <table_name>

Can you give this command a try and confirm if the space utilized by the two approaches is similar?


You nailed it perfectly.

This command helped to reduce the size significantly. The only thing that confuses me is how it behaves over time.

Initial state

I executed the following command a few times:
./yb-admin --master_addresses 172.17.0.2:7100 compact_table ysql.yugabyte yuga_json_1
After each execution, I waited a few minutes, and after another few minutes I finally noticed that the WAL files size started to decrease.

The state after WAL Files decreased:

After another few minutes and another execution of the compact command, I noticed that the SST files size decreased.

Final state:

For comparison, the on-disk size after loading the JSONs without updates:

I tested it several times; sometimes it worked instantly, and sometimes I had to wait quite a long time.

Could you elaborate on how the compact command works? Is there any place where I can monitor the progress? Should I run it once and just wait?

@yugi:

See the TServer control knob timestamp_history_retention_interval_sec, whose default is 15 minutes (900 seconds).

So if you try to compact too soon after the data load, the space used by older versions will not be reclaimed, because YugabyteDB allows queries (e.g., a long-running query) to read the database as of an older point in time.
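
If you want to experiment, you can check the current value on a yb-tserver and lower it temporarily before triggering the manual compaction. This sketch assumes the default web UI port 9000 and RPC port 9100, and that yb-ts-cli is available; a value changed with set_flag reverts on tserver restart:

# check the current value via the tserver web UI
curl -s http://172.17.0.2:9000/varz | grep timestamp_history_retention_interval_sec

# lower the retention to 60 seconds at runtime (add -force if the flag is not marked runtime-settable)
./yb-ts-cli --server_address=172.17.0.2:9100 set_flag timestamp_history_retention_interval_sec 60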


Thanks for the explanation! Everything is clear now.