I have a 3-node YugabyteDB cluster set up on AWS with the following characteristics:
- 3 c4.xlarge nodes
- 5 TB EBS volumes attached to each one of them
- Replication factor is 3
While doing some write-intensive performance testing, I noticed that one of the tserver nodes went down. The log files indicate the following:
F0313 23:58:30.174280 19958 ref_cnt_buffer.cc:30] Check failed: data_ != nullptr
Fatal failure details written to /home/ec2-user/yugabyte-db/data/disk0/yb-data/tserver/logs/yb-tserver.FATAL.details.2019-03-13T23_58_30.pid19726.txt
F20190313 23:58:30 ../../src/yb/util/ref_cnt_buffer.cc:30] Check failed: data_ != nullptr
@ 0x7fb8085c7db3 yb::LogFatalHandlerSink::send(int, char const*, char const*, int, tm const*, char const*, unsigned long) (src/yb/util/logging.cc:474)
@ 0x7fb805cf1c05
@ 0x7fb805cef439
@ 0x7fb805cf22ae
@ 0x7fb80862089b yb::RefCntBuffer::RefCntBuffer(unsigned long) (src/yb/util/ref_cnt_buffer.cc:30)
@ 0x7fb80a601f33 yb::rpc::serialization::SerializeHeader(google::protobuf::MessageLite const&, unsigned long, yb::RefCntBuffer*, unsigned long, unsigned long*) (src/yb/rpc/serialization.cc:116)
@ 0x7fb80a5cd566 yb::rpc::OutboundCall::SetRequestParam(google::protobuf::Message const&) (src/yb/rpc/outbound_call.cc:244)
@ 0x7fb80a5d697b yb::rpc::Proxy::AsyncRequest(yb::rpc::RemoteMethod const*, google::protobuf::Message const&, google::protobuf::Message*, yb::rpc::RpcController*, std::function<void ()>) (src/yb/rpc/proxy.cc:125)
@ 0x7fb80cb66b60 yb::consensus::ConsensusServiceProxy::UpdateConsensusAsync(yb::consensus::ConsensusRequestPB const&, yb::consensus::ConsensusResponsePB*, yb::rpc::RpcController*, std::function<void ()>) (src/yb/consensus/consensus.proxy.cc:32)
@ 0x7fb80ef07ae6 yb::consensus::RpcPeerProxy::UpdateAsync(yb::consensus::ConsensusRequestPB const*, yb::consensus::RequestTriggerMode, yb::consensus::ConsensusResponsePB*, yb::rpc::RpcController*, std::function<void ()> const&) (src/yb/consensus/consensus_peers.cc:491)
@ 0x7fb80ef0b6b4 yb::consensus::Peer::SendNextRequest(yb::consensus::RequestTriggerMode) (src/yb/consensus/consensus_peers.cc:300)
@ 0x7fb808639763 yb::ThreadPool::DispatchThread(bool) (src/yb/util/threadpool.cc:608)
@ 0x7fb808635405 std::function<void ()>::operator()() const (gcc/5.5.0_4/include/c++/5.5.0/functional:2267)
@ 0x7fb808635405 yb::Thread::SuperviseThread(void*) (src/yb/util/thread.cc:603)
@ 0x7fb804652693 start_thread (/tmp/glibc-20181130-26094-cs1x60/glibc-2.23/nptl/pthread_create.c:333)
@ 0x7fb803d8f41c (unknown) (sysdeps/unix/sysv/linux/x86_64/clone.S:109)
@ 0xffffffffffffffff
*** Check failure stack trace: ***
@ 0x7fb8085c6b3b DumpStackTraceAndExit (src/yb/util/logging.cc:166)
@ 0x7fb805cef8dc
@ 0x7fb805cf17ec
@ 0x7fb805cef439
@ 0x7fb805cf22ae
@ 0x7fb80862089b yb::RefCntBuffer::RefCntBuffer(unsigned long) (src/yb/util/ref_cnt_buffer.cc:30)
@ 0x7fb80a601f33 yb::rpc::serialization::SerializeHeader(google::protobuf::MessageLite const&, unsigned long, yb::RefCntBuffer*, unsigned long, unsigned long*) (src/yb/rpc/serialization.cc:116)
@ 0x7fb80a5cd566 yb::rpc::OutboundCall::SetRequestParam(google::protobuf::Message const&) (src/yb/rpc/outbound_call.cc:244)
@ 0x7fb80a5d697b yb::rpc::Proxy::AsyncRequest(yb::rpc::RemoteMethod const*, google::protobuf::Message const&, google::protobuf::Message*, yb::rpc::RpcController*, std::function<void ()>) (src/yb/rpc/proxy.cc:125)
@ 0x7fb80cb66b60 yb::consensus::ConsensusServiceProxy::UpdateConsensusAsync(yb::consensus::ConsensusRequestPB const&, yb::consensus::ConsensusResponsePB*, yb::rpc::RpcController*, std::function<void ()>) (src/yb/consensus/consensus.proxy.cc:32)
@ 0x7fb80ef07ae6 yb::consensus::RpcPeerProxy::UpdateAsync(yb::consensus::ConsensusRequestPB const*, yb::consensus::RequestTriggerMode, yb::consensus::ConsensusResponsePB*, yb::rpc::RpcController*, std::function<void ()> const&) (src/yb/consensus/consensus_peers.cc:491)
@ 0x7fb80ef0b6b4 yb::consensus::Peer::SendNextRequest(yb::consensus::RequestTriggerMode) (src/yb/consensus/consensus_peers.cc:300)
@ 0x7fb808639763 yb::ThreadPool::DispatchThread(bool) (src/yb/util/threadpool.cc:608)
@ 0x7fb808635405 std::function<void ()>::operator()() const (gcc/5.5.0_4/include/c++/5.5.0/functional:2267)
@ 0x7fb808635405 yb::Thread::SuperviseThread(void*) (src/yb/util/thread.cc:603)
@ 0x7fb804652693 start_thread (/tmp/glibc-20181130-26094-cs1x60/glibc-2.23/nptl/pthread_create.c:333)
@ 0x7fb803d8f41c (unknown) (sysdeps/unix/sysv/linux/x86_64/clone.S:109)
@ 0xffffffffffffffff
They refer to a file called yb-tserver.FATAL.details.2019-03-13T23_58_30.pid19726.txt, the contents of which are:
F20190313 23:58:30 ../../src/yb/util/ref_cnt_buffer.cc:30] Check failed: data_ != nullptr
@ 0x7fb8085c7db3 yb::LogFatalHandlerSink::send(int, char const*, char const*, int, tm const*, char const*, unsigned long) (src/yb/util/logging.cc:474)
@ 0x7fb805cf1c05
@ 0x7fb805cef439
@ 0x7fb805cf22ae
@ 0x7fb80862089b yb::RefCntBuffer::RefCntBuffer(unsigned long) (src/yb/util/ref_cnt_buffer.cc:30)
@ 0x7fb80a601f33 yb::rpc::serialization::SerializeHeader(google::protobuf::MessageLite const&, unsigned long, yb::RefCntBuffer*, unsigned long, unsigned long*) (src/yb/rpc/serialization.cc:116)
@ 0x7fb80a5cd566 yb::rpc::OutboundCall::SetRequestParam(google::protobuf::Message const&) (src/yb/rpc/outbound_call.cc:244)
@ 0x7fb80a5d697b yb::rpc::Proxy::AsyncRequest(yb::rpc::RemoteMethod const*, google::protobuf::Message const&, google::protobuf::Message*, yb::rpc::RpcController*, std::function<void ()>) (src/yb/rpc/pro
@ 0x7fb80cb66b60 yb::consensus::ConsensusServiceProxy::UpdateConsensusAsync(yb::consensus::ConsensusRequestPB const&, yb::consensus::ConsensusResponsePB*, yb::rpc::RpcController*, std::function<void ()>
@ 0x7fb80ef07ae6 yb::consensus::RpcPeerProxy::UpdateAsync(yb::consensus::ConsensusRequestPB const*, yb::consensus::RequestTriggerMode, yb::consensus::ConsensusResponsePB*, yb::rpc::RpcController*, std::
@ 0x7fb80ef0b6b4 yb::consensus::Peer::SendNextRequest(yb::consensus::RequestTriggerMode) (src/yb/consensus/consensus_peers.cc:300)
@ 0x7fb808639763 yb::ThreadPool::DispatchThread(bool) (src/yb/util/threadpool.cc:608)
@ 0x7fb808635405 std::function<void ()>::operator()() const (gcc/5.5.0_4/include/c++/5.5.0/functional:2267)
@ 0x7fb808635405 yb::Thread::SuperviseThread(void*) (src/yb/util/thread.cc:603)
@ 0x7fb804652693 start_thread (/tmp/glibc-20181130-26094-cs1x60/glibc-2.23/nptl/pthread_create.c:333)
@ 0x7fb803d8f41c (unknown) (sysdeps/unix/sysv/linux/x86_64/clone.S:109)
@ 0xffffffffffffffff
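For anyone triaging a trace like the ones above, here is a minimal Python sketch that keeps only the symbolized frames, so the call path (Peer::SendNextRequest down to the RefCntBuffer constructor) is easier to read. It is not part of YugabyteDB; the function name `symbolized_frames` is hypothetical, and the regular expression assumes the `@ 0xADDR symbol (file:line)` layout shown in this log. Address-only frames and lines cut off by the terminal are skipped.

```python
import re

# Matches frames such as:
#   @ 0x7fb80862089b yb::RefCntBuffer::RefCntBuffer(unsigned long) (src/yb/util/ref_cnt_buffer.cc:30)
# Frames that carry only an address, or whose "(file:line)" suffix is
# truncated, do not match and are skipped.
FRAME_RE = re.compile(r"^\s*@\s+(0x[0-9a-fA-F]+)\s+(.+)\s+\(([^()]+):(\d+)\)\s*$")


def symbolized_frames(log_text):
    """Return (symbol, source_file, line_number) for each symbolized frame."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            _addr, symbol, path, lineno = match.groups()
            frames.append((symbol, path, int(lineno)))
    return frames
```

Passing the contents of the FATAL details file to `symbolized_frames` returns the frames in the order they appear in the trace, innermost call first.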
I am not yet sure whether any data was lost (I hope not!).