Optimizing Query Performance in YugabyteDB

Hey everyone,

I am working on optimizing queries in YugabyteDB and I would love to get some insights from the community. I have a dataset with millions of rows, and I have noticed that some queries take longer than expected, even with proper indexing.

Query Execution Plans – What's the best way to analyze and optimize execution plans in YugabyteDB? Are there any tools or strategies you recommend?
Sharding Impact – How does the default sharding strategy impact query performance? Would manually defining table partitions help in my case?
Joins vs. Denormalization – For high-performance reads, is it better to normalize and use joins, or should I consider some level of denormalization?
Caching Mechanisms – Does YugabyteDB offer any built-in caching mechanisms, or should I rely on external caching solutions like Redis?

Any best practices or real-world experiences would be really helpful! Looking forward to your expert advice. I have also read this thread https://docs.yugabyte.com/preview/explore/query-1-performance/pg-hint-plan-servicenow but couldn't find a full solution there.

Thanks in advance!

With Regards,
Marcelo Salas

Hi @marcelosalas00

As a general answer, see the best practices: Best practices for YSQL applications | YugabyteDB Docs

Start with a simple explain analyze, then look at Query Tuning | YugabyteDB Docs
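For instance, a minimal sketch of what that first step looks like (the orders table and filter are hypothetical, not from the original post):

```sql
-- Hypothetical table and predicate; substitute your own slow query.
EXPLAIN (ANALYZE, COSTS OFF)
SELECT * FROM orders WHERE customer_id = 42;
```

In the output, compare the actual row counts against the planner's estimates, and check whether the plan uses an Index Scan or falls back to a Seq Scan on the large table.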

Sharding has a big impact.

Partitions usually help with time-series data, where you want to drop old partitions.
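As a rough sketch of that pattern, using PostgreSQL-style declarative range partitioning (table and column names here are made up for illustration):

```sql
-- Hypothetical time-series table partitioned by month on a timestamp.
CREATE TABLE events (
    event_id   bigint,
    created_at timestamptz NOT NULL,
    payload    jsonb,
    PRIMARY KEY (event_id, created_at)
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- Retiring a month of data is a cheap metadata operation:
DROP TABLE events_2024_01;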

Joins vs. denormalization: it depends on the exact case.

Caching: it depends on the exact case.

Please describe one workload (or each) in detail:

  1. table/indexes schema
  2. queries + explain analyze
  3. What is the workload like? How many reads/writes? How big or small are the batches, etc.?
  4. node/cluster hardware (vcpu,memory,network,disk)

Then I'll make recommendations and explain why.

Hi @marcelosalas00, please share the execution plan taken with explain (analyze, dist), as it will give all the info. The sharding impact will be visible in the number of read requests.
Here is an example:
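A minimal sketch of the command (the table and value are hypothetical placeholders, not from the original thread):

```sql
-- Hypothetical query; run the dist option against your own statement.
EXPLAIN (ANALYZE, DIST)
SELECT * FROM orders WHERE customer_id = 42;
```

With dist, the plan also reports distributed-storage metrics per node, such as the number of storage read requests, which is where the sharding impact shows up.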

Note that to get the query planner to find the best plan, you need to use the cost-based optimizer, which means:

  • ANALYZE the tables (there’s no auto-analyze yet)
  • have yb_enable_base_scans_cost_model set to on (it’s the default only when you start with postgres parity enabled)
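Concretely, those two steps can be sketched as follows (the orders table name is a placeholder):

```sql
-- Collect table statistics manually, since there is no auto-analyze yet:
ANALYZE orders;

-- Enable the cost-based optimizer for the current session:
SET yb_enable_base_scans_cost_model = on;
```

Re-run explain (analyze, dist) afterwards to see whether the planner picks a different plan with statistics and the cost model in place.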