CrashLoopBackOff issue on k3s Raspberry Pi cluster

Hi There,

Hoping someone can help. I’ve been trying to get a basic install of YugabyteDB working on my k3s Raspberry Pi cluster, but I keep running into issues that I cannot resolve.

I’ve installed YugabyteDB using the following helm command:

helm install yb-db yugabytedb/yugabyte --version 2.21.1 --set resource.master.requests.cpu=0.5,resource.master.requests.memory=0.5Gi,resource.tserver.requests.cpu=0.5,resource.tserver.requests.memory=0.5Gi,replicas.master=1,replicas.tserver=1 --namespace default

Basically I can see the pods beginning to start, then they go to Error state and CrashLoopBackOff:

yb-master-0 2/3 CrashLoopBackOff 32 (4m37s ago) 21h
yb-tserver-0 2/3 CrashLoopBackOff 32 (4m39s ago) 21h

When I look at the master logs using kubectl logs yb-master-0, I see the following:

2024-07-31T09:38:43.341165149+01:00 stdout F disk check at: Wed Jul 31 08:38:43 UTC 2024
2024-07-31T09:38:43.498846098+01:00 stdout F DNS addr resolve: yb-master-0.yb-masters.default.svc.cluster.local
2024-07-31T09:38:43.509008763+01:00 stdout F DNS addr resolve success.
2024-07-31T09:38:43.509105727+01:00 stdout F Bind ipv4: 10.42.1.189 port 7100
2024-07-31T09:38:43.509231042+01:00 stdout F Bind success.
2024-07-31T09:38:43.753567692+01:00 stdout F DNS addr resolve: yb-master-0.yb-masters.default.svc.cluster.local
2024-07-31T09:38:43.760586563+01:00 stdout F DNS addr resolve success.
2024-07-31T09:38:43.76074786+01:00 stdout F Bind ipv4: 10.42.1.189 port 7100
2024-07-31T09:38:43.760776082+01:00 stdout F Bind success.
2024-07-31T09:38:43.935572437+01:00 stdout F DNS addr resolve: 0.0.0.0
2024-07-31T09:38:43.941075392+01:00 stdout F DNS addr resolve success.
2024-07-31T09:38:43.941158059+01:00 stdout F Bind ipv4: 0.0.0.0 port 7000
2024-07-31T09:38:43.941283763+01:00 stdout F Bind success.
2024-07-31T09:38:44.097750446+01:00 stderr F 2024-07-31 08:38:44,097 [INFO] k8s_parent.py: Core files will be copied to '/mnt/disk0/cores'
2024-07-31T09:38:44.100466813+01:00 stderr F 2024-07-31 08:38:44,100 [INFO] k8s_parent.py: core_pattern is: core
2024-07-31T09:38:44.101132001+01:00 stderr F 2024-07-31 08:38:44,100 [INFO] k8s_parent.py: Executing operation: yb-master-0_pre_debug_hook filepath: /opt/debug_hooks_config/yb-master-0-pre_debug_hook.sh
2024-07-31T09:38:44.111197055+01:00 stderr F 2024-07-31 08:38:44,110 [INFO] k8s_parent.py: Output from hook b'hello-from-pre\n'
2024-07-31T09:38:44.185769914+01:00 stderr F 2024-07-31 08:38:44,185 [INFO] k8s_parent.py: Executing operation: yb-master-0_post_debug_hook filepath: /opt/debug_hooks_config/yb-master-0-post_debug_hook.sh
2024-07-31T09:38:44.212360481+01:00 stderr F 2024-07-31 08:38:44,197 [INFO] k8s_parent.py: Output from hook b'hello-from-post\n'
2024-07-31T09:38:44.2124595+01:00 stderr F 2024-07-31 08:38:44,198 [INFO] k8s_parent.py: core_pattern is: core
2024-07-31T09:38:44.212473204+01:00 stderr F 2024-07-31 08:38:44,199 [INFO] k8s_parent.py: Skipping copy of core files: '/mnt/disk0/cores' and '/mnt/disk0/cores' are the same directories
2024-07-31T09:38:44.212482556+01:00 stderr F 2024-07-31 08:38:44,199 [INFO] k8s_parent.py: Copied 0 core files to '/mnt/disk0/cores'

Then nothing else. When I look at the container logs on the individual nodes, I see the following:

2024-07-31T09:22:59.824522736+01:00 stdout F 2024-07-31T08:22:59.805Z	info	server/main.go:100	main.main	Logger initialized with info level logging
2024-07-31T09:22:59.925528074+01:00 stderr F 2024-07-31T08:22:59.924Z	error	helpers/utils.go:208	apiserver/cmd/server/helpers.(*HelperContainer).AttemptGetRequests	all requests to list of urls failed: [http://yb-master-0.yb-masters.default.svc.cluster.local:7000/api/v1/tablet-servers]
2024-07-31T09:22:59.925551074+01:00 stdout F 2024-07-31T08:22:59.924Z	error	helpers/utils.go:208	apiserver/cmd/server/helpers.(*HelperContainer).AttemptGetRequests	all requests to list of urls failed: [http://yb-master-0.yb-masters.default.svc.cluster.local:7000/api/v1/tablet-servers]
2024-07-31T09:22:59.925639887+01:00 stdout F 2024-07-31T08:22:59.924Z	warn	helpers/utils.go:338	apiserver/cmd/server/helpers.(*HelperContainer).BuildMasterURLsAndAttemptGetRequests	get requests to cached master addresses [http://yb-master-0.yb-masters.default.svc.cluster.local:7000/api/v1/tablet-servers] failed: all requests to list of urls failed: [http://yb-master-0.yb-masters.default.svc.cluster.local:7000/api/v1/tablet-servers]
2024-07-31T09:22:59.926495202+01:00 stdout F 2024-07-31T08:22:59.926Z	warn	helpers/masters_http_request.go:71	apiserver/cmd/server/helpers.(*HelperContainer).GetMastersFuture	failed to get masters list from tserver at yb-master-0.yb-masters.default.svc.cluster.local: Get "http://yb-master-0.yb-masters.default.svc.cluster.local:9000/api/v1/masters": dial tcp 10.42.1.189:9000: connect: connection refused
2024-07-31T09:22:59.927328055+01:00 stdout F 2024-07-31T08:22:59.927Z	error	helpers/utils.go:208	apiserver/cmd/server/helpers.(*HelperContainer).AttemptGetRequests	all requests to list of urls failed: [http://yb-master-0.yb-masters.default.svc.cluster.local:7000/api/v1/masters]
2024-07-31T09:22:59.927632771+01:00 stdout F 2024-07-31T08:22:59.927Z	error	helpers/utils.go:280	apiserver/cmd/server/helpers.(*HelperContainer).GetMasterAddressesFuture	failed to get masters from master and tserver at yb-master-0.yb-masters.default.svc.cluster.local: all requests to list of urls failed: [http://yb-master-0.yb-masters.default.svc.cluster.local:7000/api/v1/masters]
2024-07-31T09:22:59.927785898+01:00 stdout F 2024-07-31T08:22:59.927Z	error	helpers/utils.go:239	apiserver/cmd/server/helpers.(*HelperContainer).BuildMasterURLs	failed to get master addresses
2024-07-31T09:22:59.9278611+01:00 stdout F 2024-07-31T08:22:59.927Z	error	helpers/utils.go:343	apiserver/cmd/server/helpers.(*HelperContainer).BuildMasterURLsAndAttemptGetRequests	failed to build master urls
2024-07-31T09:22:59.927916839+01:00 stdout F 2024-07-31T08:22:59.927Z	warn	helpers/db_connections.go:30	apiserver/cmd/server/helpers.(*HelperContainer).CreateGoCqlClient	failed to get list of tservers for gocql client setup: failed to get masters from master and tserver at yb-master-0.yb-masters.default.svc.cluster.local: all requests to list of urls failed: [http://yb-master-0.yb-masters.default.svc.cluster.local:7000/api/v1/masters]
2024-07-31T09:22:59.927993227+01:00 stdout F 2024-07-31T08:22:59.927Z	info	helpers/db_connections.go:48	apiserver/cmd/server/helpers.(*HelperContainer).CreateGoCqlClient	initializing gocql client with initial addresses: [yb-master-0.yb-masters.default.svc.cluster.local:9042]
2024-07-31T09:22:59.927804934+01:00 stderr F 2024-07-31T08:22:59.927Z	error	helpers/utils.go:208	apiserver/cmd/server/helpers.(*HelperContainer).AttemptGetRequests	all requests to list of urls failed: [http://yb-master-0.yb-masters.default.svc.cluster.local:7000/api/v1/masters]
2024-07-31T09:22:59.928177352+01:00 stderr F 2024-07-31T08:22:59.927Z	error	helpers/utils.go:280	apiserver/cmd/server/helpers.(*HelperContainer).GetMasterAddressesFuture	failed to get masters from master and tserver at yb-master-0.yb-masters.default.svc.cluster.local: all requests to list of urls failed: [http://yb-master-0.yb-masters.default.svc.cluster.local:7000/api/v1/masters]
2024-07-31T09:22:59.928247332+01:00 stderr F 2024-07-31T08:22:59.927Z	error	helpers/utils.go:239	apiserver/cmd/server/helpers.(*HelperContainer).BuildMasterURLs	failed to get master addresses
2024-07-31T09:22:59.928360756+01:00 stderr F 2024-07-31T08:22:59.927Z	error	helpers/utils.go:343	apiserver/cmd/server/helpers.(*HelperContainer).BuildMasterURLsAndAttemptGetRequests	failed to build master urls
2024-07-31T09:22:59.93508356+01:00 stderr F 2024-07-31T08:22:59.934Z	error	server/main.go:142	main.main	Error initializing the gocql session.
2024-07-31T09:22:59.953710283+01:00 stderr F 2024-07-31T08:22:59.934Z	error	server/main.go:143	main.main	gocql: unable to create session: unable to discover protocol version: dial tcp 10.42.1.189:9042: connect: connection refused
2024-07-31T09:22:59.953791485+01:00 stderr F using embed mode
2024-07-31T09:22:59.935144188+01:00 stdout F 2024-07-31T08:22:59.934Z	error	server/main.go:142	main.main	Error initializing the gocql session.
2024-07-31T09:22:59.953814984+01:00 stdout F 2024-07-31T08:22:59.934Z	error	server/main.go:143	main.main	gocql: unable to create session: unable to discover protocol version: dial tcp 10.42.1.189:9042: connect: connection refused
2024-07-31T09:22:59.953839706+01:00 stdout F 
2024-07-31T09:22:59.953848484+01:00 stdout F    ____    __
2024-07-31T09:22:59.953856354+01:00 stdout F   / __/___/ /  ___
2024-07-31T09:22:59.953863817+01:00 stdout F  / _// __/ _ \/ _ \
2024-07-31T09:22:59.953871372+01:00 stdout F /___/\__/_//_/\___/ v4.10.2
2024-07-31T09:22:59.953879409+01:00 stdout F High performance, minimalist Go web framework
2024-07-31T09:22:59.953887149+01:00 stdout F https://echo.labstack.com
2024-07-31T09:22:59.953894649+01:00 stdout F ____________________________________O/_______
2024-07-31T09:22:59.953902186+01:00 stdout F                                     O\
2024-07-31T09:22:59.95390976+01:00 stdout F ⇨ http server started on [::]:15433

Any ideas on where to even begin looking for the problem?

Hi @scm7mae

I don’t think we’ve ever tested on a Raspberry Pi. Note that the hardware is below our recommended specs; see the "Deployment checklist for YugabyteDB clusters" page in the YugabyteDB Docs.

Please fix the requirements first.

There should be some logs from yb-master. Or core dump files if it crashed.
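For anyone following along, here is a rough sketch of how those logs and core files might be located with kubectl. The pod name and the /mnt/disk0/cores path come from the output earlier in this thread; the default namespace matches the helm command, and the container name passed in (for example yb-master) is an assumption you should check first with kubectl get pod yb-master-0 -o jsonpath='{.spec.containers[*].name}'.

```shell
# Sketch only: helper functions around standard kubectl subcommands.
# NAMESPACE defaults to "default", as in the helm command in this thread.
NAMESPACE="${NAMESPACE:-default}"

# Print each container's name with the reason and exit code of its
# last termination (e.g. OOMKilled or Error), tab-separated.
last_exit() {
  kubectl get pod "$1" -n "$NAMESPACE" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}'
}

# Show the logs of the previous (crashed) instance of one container.
# $1 = pod name, $2 = container name (assumed; verify it first).
previous_logs() {
  kubectl logs "$1" -n "$NAMESPACE" -c "$2" --previous
}

# List core files in the directory the startup log reports.
# This only works while the chosen container is running and mounts that path.
list_cores() {
  kubectl exec "$1" -n "$NAMESPACE" -c "$2" -- ls -l /mnt/disk0/cores
}
```

Typical use would be last_exit yb-master-0, then previous_logs yb-master-0 yb-master. A pod showing 2/3 ready has three containers, so passing -c explicitly avoids reading the logs of the wrong one.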

Hi @dorian_yugabyte,

Thanks for your speedy response. Sorry, I just wanted to check: I saw that 2GB and 2 cores were listed as the minimum. I’m using Raspberry Pi 4 models with 4GB of RAM and a quad-core ARM CPU. Does that not meet the minimum hardware spec?

Also, if it’s OK to ask a stupid question: where do I find the core dump files on the node? Sorry, I’m fairly new to Linux.

The requirements are per node, with each node running a single yb-tserver or yb-master process. Your containers, for example, are requesting 0.5Gi of memory in your helm command, when they should have 2GB or more.
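As a sketch of what that change could look like, the release from the original command can be upgraded in place with larger memory requests. The release name, chart, version, namespace and --set keys are copied from the command earlier in the thread; the 2Gi value is just the "2G+" figure above, not a tested setting for this hardware.

```shell
# Sketch only: raise the memory requests on the existing yb-db release.
# MEM is a hypothetical variable introduced for this example.
MEM="${MEM:-2Gi}"

resize_yb() {
  # --reuse-values keeps the other settings from the original install.
  helm upgrade yb-db yugabytedb/yugabyte --version 2.21.1 \
    --namespace default --reuse-values \
    --set "resource.master.requests.memory=${MEM},resource.tserver.requests.memory=${MEM}"
}
```

Calling resize_yb applies the change. Kubernetes requires a request to be no larger than the matching limit, so if the chart sets memory limits they may need raising too. A master and a tserver requesting 2Gi each are also unlikely to fit together on one 4GB node, since part of the RAM is reserved for the OS and k3s itself.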

Also, CPUs aren’t all the same. A Raspberry Pi core might be 1/5 to 1/3 the speed of a normal CPU core (going by BCM2711 benchmarks).
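To put rough numbers on that, here is a back-of-the-envelope calculation. The 1/5 and 1/3 ratios are only the estimate quoted above, not measurements, and pi_core_equiv is a made-up helper name.

```shell
# Rough arithmetic only: convert a number of Pi cores into an
# approximate range of "normal" cores using the 1/5 to 1/3 estimate.
pi_core_equiv() {
  awk -v cores="$1" 'BEGIN { printf "%.2f to %.2f\n", cores / 5, cores / 3 }'
}

pi_core_equiv 4     # whole quad-core Pi 4  -> 0.80 to 1.33
pi_core_equiv 0.5   # the 0.5 CPU request   -> 0.10 to 0.17
```

By that estimate a whole quad-core Pi 4 still falls short of the 2-core minimum, and a 0.5 CPU request is a small fraction of one normal core.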

You’ve started on the highest difficulty possible (new to Linux + Kubernetes + hardware/OS that has no chance of running in production).

So I would fix those first (all of them). If you just want to test, a local cluster on a laptop would be better.

Hi @dorian_yugabyte.

So I’ve had success. I bought a new Raspberry Pi 5, did a basic install of an OS, then got YugabyteDB running using Docker. Next, I added that node to my k3s cluster and ran the following command:

helm install yb-db yugabytedb/yugabyte --version 2.21.1 --set nodeSelector.databases=yugadb,resource.master.requests.cpu=1.5,resource.tserver.requests.cpu=1.5,storage.tserver.storageClass=longhorn,storage.master.storageClass=longhorn,storage.tserver.size=2Gi,storage.master.size=2Gi,replicas.tserver=1,replicas.master=1,partition.master=1,partition.tserver=1 --namespace default

This seems to be stable and working at the moment. I will report back, but hopefully this will be helpful to others.
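One note for anyone reproducing this: nodeSelector.databases=yugadb in the command above only schedules pods onto nodes that carry the label databases=yugadb, so the Pi 5 node presumably needs that label before the install. A minimal sketch follows; the node name you pass in is a placeholder for whatever kubectl get nodes shows for your Pi 5.

```shell
# Sketch only: label the target node so the nodeSelector can match it.
# $1 = node name (placeholder; take it from `kubectl get nodes`).
label_db_node() {
  kubectl label node "$1" databases=yugadb
}

# Show which node each pod in the default namespace landed on.
show_placement() {
  kubectl get pods -n default -o wide
}
```

After running label_db_node with your node name, show_placement should list yb-master-0 and yb-tserver-0 on that node once they are scheduled.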

@scm7mae thanks for sharing. Good to know that it works on Raspberry Pi.
Maybe one day we will run Yugabyte on a cluster like this one: "A Temporal History of The World’s Largest Raspberry Pi Cluster (that we know of)" by Chris Bensen (Oracle Developers, Medium) :nerd_face: