How to find the hash ranges for each tablet server?

We have a 6-node cluster running YugabyteDB version 2.4.4:

4 nodes in central region
2 nodes in east region

So, this implies 4 tablet servers in the central region.

We have a single large YSQL table in YugabyteDB:


CREATE TABLE sample (
    supplier_id INT,
    item_id INT,
    supplier_name TEXT,
    item_name TEXT,
    PRIMARY KEY((supplier_id, item_id) HASH)
);

My understanding is that the hash range (the range of yb_hash_code() values) for each tablet server is decided based on the number of tablet servers.

  1. How can I find the hash range (as numbers) for each tablet server in the central region?

For the YSQL table shown above, the goal is to fetch all rows from each tablet in parallel, based on yb_hash_code() >= low AND yb_hash_code() < high.
What does the SELECT query look like to fetch all rows from each tablet server?

@sham_yuga -

So in the case where you have 2 nodes in one of the central regions, 2 nodes in the other, and 2 nodes in the east region, you have 3 complete copies of your data. If your tablet leaders are pinned to the central region, then there is no need to worry about the location.

The simplest way to parallelize this is to do the following calculation:

65536/number_of_parallel_threads

In this case, 65536 is the number of hash values possible in the system (yb_hash_code() returns values from 0 to 65535). So if we used 16 threads, then each range would cover 4096 values. So your query would be:

1st thread:

SELECT * FROM sample WHERE yb_hash_code(supplier_id, item_id) >= 0 AND yb_hash_code(supplier_id, item_id) < 4096;

2nd thread:

SELECT * FROM sample WHERE yb_hash_code(supplier_id, item_id) >= 4096 AND yb_hash_code(supplier_id, item_id) < 8192;

Continue until the final range, whose upper bound is 65536 (exclusive), so that hash value 65535 is included.
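
To avoid writing each query by hand, the per-thread ranges can be generated. This is only a minimal sketch, assuming the hash space is 0 to 65535 (65536 values) and that the ranges are split evenly; the helper names `hash_ranges` and `range_queries` are made up for illustration and are not part of YugabyteDB.

```python
HASH_SPACE = 65536  # assumption: yb_hash_code() returns values in 0..65535


def hash_ranges(num_chunks):
    """Split [0, HASH_SPACE) into num_chunks contiguous half-open ranges."""
    if not 1 <= num_chunks <= HASH_SPACE:
        raise ValueError("num_chunks must be between 1 and 65536")
    # Integer arithmetic keeps the ranges gap-free even when
    # num_chunks does not divide 65536 evenly.
    bounds = [i * HASH_SPACE // num_chunks for i in range(num_chunks + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def range_queries(num_chunks, table="sample",
                  key_cols=("supplier_id", "item_id")):
    """Build one SELECT per range; each can be run by its own thread."""
    expr = f"yb_hash_code({', '.join(key_cols)})"
    return [
        f"SELECT * FROM {table} WHERE {expr} >= {lo} AND {expr} < {hi};"
        for lo, hi in hash_ranges(num_chunks)
    ]


if __name__ == "__main__":
    for query in range_queries(16):
        print(query)
```

The last range ends at 65536 (exclusive), so no hash value is skipped regardless of the thread count.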

Hope this helps.
Alan

@Alan_Caldera

In my setup, we have 4 tablet servers in the central region. The leaders are pinned to the central region.

My understanding is:

Tablet 1 has hash range [0,16384)
Tablet 2 has hash range [16384, 32768)
Tablet 3 has hash range [32768, 49152)
Tablet 4 has hash range [49152, 65535]

If my understanding above is correct, then I would launch 4 threads, each running one SELECT query, because each query would then hit only one tablet, based on the four hash ranges. Is that right?

Similarly, if there are 16 tablets in the central region, then, as per this documentation:

Tablet 1 has range [0x0000, 0x1000),
Tablet 2 has range [0x1000, 0x2000),
:
:
Tablet 16 has range [0xF000, 0xFFFF]

Correct me if I'm wrong.
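
The even split described above can be written out as a small calculation. This is a sketch only, assuming the tablets divide the 16-bit hash space evenly; the real boundaries depend on how many tablets the table actually has, which is not necessarily the number of tablet servers. The helper names are hypothetical.

```python
from bisect import bisect_right

HASH_SPACE = 0x10000  # assumption: 16-bit hash space, 0x0000..0xFFFF


def even_split_bounds(num_tablets):
    """Boundaries of num_tablets evenly sized ranges over the hash space."""
    return [i * HASH_SPACE // num_tablets for i in range(num_tablets + 1)]


def tablet_index(hash_code, num_tablets):
    """1-based index of the evenly split range that contains hash_code."""
    if not 0 <= hash_code < HASH_SPACE:
        raise ValueError("hash_code must be between 0 and 65535")
    return bisect_right(even_split_bounds(num_tablets), hash_code)


def describe(num_tablets):
    """Human-readable ranges with inclusive upper bounds, in hex."""
    bounds = even_split_bounds(num_tablets)
    return [
        f"Tablet {i + 1}: [0x{lo:04X}, 0x{hi - 1:04X}]"
        for i, (lo, hi) in enumerate(zip(bounds[:-1], bounds[1:]))
    ]


if __name__ == "__main__":
    print("\n".join(describe(4)))
    print("\n".join(describe(16)))
```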

@sham_yuga - That’s correct.

Alan

@Alan_Caldera

OK. I think the concept of hash range distribution across tablets is the same, irrespective of whether YCQL or YSQL is used on YugabyteDB.

Why do the hash ranges in this querylink look different from the ranges below?

In the querylink, I was expecting only four rows in the system.partitions table, because there are four tablet servers in the central region, as shown below:

Tablet 1 hash range [0,16384)
Tablet 2 hash range [16384, 32768)
Tablet 3 hash range [32768, 49152)
Tablet 4 hash range [49152, 65535]

@sham_yuga

You have 24 tablets because you have 6 nodes total. It looks like your system is set up to create 4 tablets per tablet server. You would only see that kind of display if you had 6 nodes and you specified WITH TABLETS = 4 in the CREATE TABLE syntax. See this link in our documentation for a longer explanation: SpecifyTablesTableCreateTime

Alan

@Alan_Caldera

I think there are 48 tablets, not 24, because there are 48 rows in the system.partitions table.
Correct me if I'm wrong.
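
One way to check the tablet count and the actual ranges is to convert the rows of system.partitions into integer ranges and count them. The sketch below makes several assumptions: that each row exposes a start key and an end key as hex strings such as 0x5555, that an empty key (0x) marks the open start or end of the hash space, and that the sample rows are purely hypothetical, not output from a real cluster.

```python
HASH_SPACE = 65536  # assumption: hash codes run from 0 to 65535


def key_to_int(hex_key, default):
    """Convert a key such as '0x5555' to an int; an empty key gives default."""
    digits = hex_key[2:] if hex_key.lower().startswith("0x") else hex_key
    return int(digits, 16) if digits else default


def partition_ranges(rows):
    """Turn (start_key, end_key) pairs into sorted half-open int ranges."""
    return sorted(
        (key_to_int(start, 0), key_to_int(end, HASH_SPACE))
        for start, end in rows
    )


def covers_hash_space(ranges):
    """True if the sorted ranges tile [0, HASH_SPACE) with no gap or overlap."""
    expected = 0
    for lo, hi in ranges:
        if lo != expected or hi <= lo:
            return False
        expected = hi
    return expected == HASH_SPACE


if __name__ == "__main__":
    # Hypothetical rows, standing in for what a partitions query might return.
    rows = [("0x5555", "0xaaaa"), ("0x", "0x5555"), ("0xaaaa", "0x")]
    ranges = partition_ranges(rows)
    print(len(ranges), "tablets:", ranges)
    print("covers full hash space:", covers_hash_space(ranges))
```

The number of ranges is the number of tablets for the table, and each (low, high) pair can be used directly as the bounds in a yb_hash_code() >= low AND yb_hash_code() < high query.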