YSQL: Transaction errors and slow performance during execution of DML statements


I am running YugabyteDB on 3 node setup, where on each node 1 yb-master and 1 yb-tserver is running. On one of the nodes I create a database via createdb utility and then I run an SQL script containing only DDL statements (e.g., CREATE TABLE, CREATE VIEW, CREATE FUNCTION, etc.) via ysqlsh utility:

ysqlsh -v ON_ERROR_STOP=1 -f my_ddl.sql my_database

Sometimes this script runs successfully to completion, but most of the times it fails with an error related to transactions, such as the following:

  • ERROR: Operation failed. Try again: Unknown transaction, could be recently aborted: 9a3ac51d-494f-42e9-bd5d-48074d401ae7
  • ERROR: Operation expired: Heartbeat: Transaction de8093bb-7b36-4317-82d8-9675665f4ba4 expired or aborted by a conflict: 40001

At the time the above script is running there are no other connections to the database.

Is there anything we can do in our setup or in our script to eliminate such errors? We need to find a way to get this script running reliably, as it is intended to be executed by an automated build process.

Also, is there anything that we can do to speed up the execution of this script? I noticed that it takes about 4-5 seconds to create a table or an index, unless the table primary key or the index uses range partitioning (i.e., not hash), in which case it takes about 1 second. In the case where the script successfully finishes, this means that it takes about a half hour (30 minutes) to complete (it creates 246 tables and 138 indices). We are certainly not expecting the same time required for executing such operations in PostgreSQL (that is on the order of several seconds), but we would appreciate any guidance for improving the execution time. (For example, do we need to start up yb-master and yb-tserver processes with certain flags?)

The yb-master was started on each node using a command of the following form:

nohup yb-master --master_addresses IP1:7100,IP2:7100,IP3:7100 \
    --rpc_bind_addresses LOCAL_NODE_IP:7100 \
    --fs_data_dirs /data/vlst/yugabyte/yugabyte- \
    --max_log_size 100 &> yb-master.out &

and the yb-tserver was brought up on each node with a command of this format:

nohup yb-tserver --tserver_master_addrs IP1:7100,IP2:7100,IP3:7100 \
    –rpc_bind_addresses LOCAL_NODE_IP:9100 \
    –fs_data_dirs /data/vlst/yugabyte/yugabyte- \
    --max_log_size 100 \
    --start_pgsql_proxy \
    --pgsql_proxy_bind_address LOCAL_NODE_IP:5433 \
    --ysql_log_statement all \
    --ysql_timezone LOCAL_TIMEZONE \
    --pg_yb_session_timeout_ms 900000 \
    --cql_proxy_bind_address LOCAL_NODE_IP:9042 \
    --cql_rpc_keepalive_time_ms 0 \
    --ysql_client_read_write_timeout_ms 300000 \
    --yb_client_admin_operation_timeout_sec 300 &> yb-tserver.out &

Your input is greatly appreciated.
Thank you

In an attempt to see if reducing the amount of information logged by logging only at ERROR level and above would improve performance, I restarted the yb-master and yb-tserver processes using “–minloglevel 2” and omitting the “–ysql_log_statement all” flag. I was able to execute our DDL script successfully (although there was no speed improvement with the logging level change). However, several hours later, and with all 3 nodes being idle from the time the DDL script had completed, the yb-tserver process on one of the nodes stopped running. The yb-tserver had logged the following on that node:

E0822 19:55:26.955437 1720402 consensus_queue.cc:688] T 606c5a13cfc64203802aa21404908626 P b1ea77e62e5e44649e2b7759da870048 [LEADER]: Error trying to read ahead of the log while preparing peer request: Incomplete (yb/consensus/log_cache.cc:268): Op with index 31 is ahead of the local log (next sequential op: 31). Destination peer: 8d09c8f5de2b4c788bfb5532924d1f06

I also noticed errors of the following form being logged by the yb-master on that node about a couple of hours before the yb-tserver error:

E0822 18:01:21.021014 938939 catalog_loaders.cc:151] Unexpected Tablet state for 0052fdbbde24411d99b92507dd527d39: REPLACED, unknown table for this tablet: 00004ddb0000300080000000000054d6

There were 90 errors of this form altogether.

Is there anything we can do (e.g., in our setup) to avoid such situation where yb-tserver stops running on a node?

Your input is greatly appreciated.

Thank you

Hi, I see that you had no answer on this. The way you start yb-master and yb-tserver looks good. The errors you show. However, 246 tables and 138 indices is a lot with the default settings which will split them into 8 tablets per servers (see here about the defautl splitting). As you have only 3 servers, this means with replication factor 3 that each server will have 384 tablet peers. This requires large instances.
What you can do is enable auto-splitting so that only one tablet is created, and will be split according to the size. More about sizing: Distributed SQL Sharding: How Many Tablets, and at What Size? (yugabyte.com)

Thank you very much for your reply. I will enable auto-splitting as you suggest, to see if the “Error trying to read ahead of the log while preparing peer request” errors that we still get will be eliminated.

It is a general recommendation (not having too many tablets). But I shouldn’t expect this kind of errors which may be caused by other problems. Please check your NTP or Chrony and be sure clock is ok

Hello Franck,

This turned out to be the crucial point I was missing. Once I followed your advice, everything seems to be working smoothly: we no longer have the errors, the database DDL script now takes about 15 minutes to complete instead of the 30 minutes it took beforehand, and the CPU and memory usage has greatly dropped on each node VM. Once again, thank you very much, I really appreciate it!

Efstathios Vlachogiannis

1 Like