I am trying to find aggregates by joining couple of smaller tables which are 100k records each. I have necessary indexes created on the columns which i am trying to query, but execution time of the query seems to be very odd. Please find the table definitions and query execution plan as well, i have tried with queryhint of hash join as well but no help.
Can you please help here?
staging=# \d users;
Table "public.users"
Column | Type | Collation | Nullable | Default
-------------------+------------------------------+-----------+----------+------------------------------------------
id | uuid | | not null |
email | character varying(255) | | not null |
phone | character varying(255) | | |
password | character varying(255) | | not null |
registered_from | character varying(255) | | not null |
status | enum_users_status | | | 'inactive'::enum_users_status
is_email_verified | enum_users_is_email_verified | | | 'inactive'::enum_users_is_email_verified
is_phone_verified | enum_users_is_phone_verified | | | 'inactive'::enum_users_is_phone_verified
is_deleted | timestamp with time zone | | |
created_at | timestamp with time zone | | | now()
updated_at | timestamp with time zone | | | now()
status_changed_at | timestamp with time zone | | |
status_changed_by | uuid | | |
Indexes:
"users_pkey" PRIMARY KEY, lsm (id HASH)
"users_email" UNIQUE, lsm (email HASH)
"users_phone" UNIQUE, lsm (phone HASH)
"ix_users_email" lsm (email HASH) WHERE status = 'active'::enum_users_status AND is_deleted IS NULL
"user_status" lsm (status HASH, id ASC)
staging=# \d profiles;
Table "public.profiles"
Column | Type | Collation | Nullable | Default
---------------+--------------------------+-----------+----------+----------------------------------------------------------------------
user_id | uuid | | not null |
first_name | character varying(255) | | not null |
middle_name | character varying(255) | | |
last_name | character varying(255) | | not null |
date_of_birth | timestamp with time zone | | |
gender | enum_profiles_gender | | |
address_info | json | | | '{"street":"","city":"","state":"","country":"","zipcode":""}'::json
country_code | character varying(255) | | not null |
created_at | timestamp with time zone | | | now()
updated_at | timestamp with time zone | | | now()
picture | integer | | |
Indexes:
"profiles_pkey" PRIMARY KEY, lsm (user_id HASH)
staging=# select count(1) from users;
count
-------
99552
(1 row)
staging=# select count(1) from profiles;
count
-------
99552
(1 row)
staging=# /*+HashJoin("profile","user")*/
staging-# EXPLAIN ANALYZE SELECT count(1) AS "count" FROM "users" AS "user" INNER JOIN "profiles" AS "profile" on "profile"."user_id" = "user"."id" where "user"."status" = 'active';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=30.50..30.51 rows=1 width=8) (actual time=78266.507..78266.507 rows=1 loops=1)
-> Nested Loop (cost=0.00..30.25 rows=100 width=0) (actual time=4.454..78204.609 rows=99535 loops=1)
-> Index Only Scan using user_status on users "user" (cost=0.00..15.25 rows=100 width=16) (actual time=3.390..154.580 rows=99535 loops=1)
Index Cond: (status = 'active'::enum_users_status)
Heap Fetches: 0
-> Index Scan using profiles_pkey on profiles profile (cost=0.00..0.15 rows=1 width=16) (actual time=0.772..0.772 rows=1 loops=99535)
Index Cond: (user_id = "user".id)
Planning Time: 0.217 ms
Execution Time: 78266.585 ms
Peak Memory Usage: 18014398509468096 kB
(10 rows)
staging=# DROP INDEX user_status;
DROP INDEX
staging=# EXPLAIN ANALYZE SELECT count(1) AS "count" FROM "users" AS "user" INNER JOIN "profiles" AS "profile" on "profile"."user_id" = "user"."id" where "user"."status" = 'active';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=218.89..218.90 rows=1 width=8) (actual time=76854.824..76854.824 rows=1 loops=1)
-> Nested Loop (cost=0.00..216.39 rows=1000 width=0) (actual time=6.308..76794.314 rows=99535 loops=1)
-> Seq Scan on users "user" (cost=0.00..102.50 rows=1000 width=16) (actual time=5.088..142.942 rows=99535 loops=1)
Filter: (status = 'active'::enum_users_status)
Rows Removed by Filter: 17
-> Index Scan using profiles_pkey on profiles profile (cost=0.00..0.11 rows=1 width=16) (actual time=0.758..0.758 rows=1 loops=99535)
Index Cond: (user_id = "user".id)
Planning Time: 22.782 ms
Execution Time: 76855.960 ms
Peak Memory Usage: 18014398509468840 kB
(10 rows)