For a High Availability HomeLab, quorum requirements dictate a minimum of three nodes. But if the HomeLab is on a budget, a low-power, low-resource node is often used as the cluster Arbiter. This Arbiter is used purely for quorum management and nothing else.
TL;DR
I have only 2 regular cluster nodes running Proxmox, and Proxmox uses Corosync for cluster node management. Corosync has a well-known option to define one node as an Arbiter only, via the corosync-qnetd service.
But Corosync is only one piece of the HA cluster puzzle. The next one is a replicated filesystem. For a small, budget-sensitive HomeLab, GlusterFS is a nice choice, and according to the GlusterFS documentation a Thin Arbiter is available here as well. But when I started looking for a way to get a GlusterFS Thin Arbiter up and running, I found only a handful of unclear resources. But "a person grows with their tasks", so I started some research and can finally share this guide with you.
Thin Arbiter Node
Important Notice
I run the Thin Arbiter as a VM on my HomeLab NAS and use the AlmaLinux distribution for it, so this guide is focused on AlmaLinux (or other CentOS-based) distributions. But I believe it can be used on other Linux flavors without any complications too.
Enable GlusterFS Repository on AlmaLinux
The GlusterFS binaries are part of the CentOS Storage SIG (Special Interest Group) repository, so we have to enable this repository first. Additionally, Proxmox 8 (which I'm currently using) ships GlusterFS version 10, so I'm using the same version on my Thin Arbiter node.
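Something along these lines should enable it (the centos-release-gluster10 release package name is my assumption here; adjust it to your distribution release and the GlusterFS version you need):

```bash
# Enable the CentOS Storage SIG repository that provides GlusterFS 10
dnf install -y centos-release-gluster10
# Confirm that the new repository is visible
dnf repolist | grep -i gluster
```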
Now we are ready for the Thin Arbiter "magic". If you take a look into the GlusterFS Thin Arbiter documentation, you will discover a cryptic command.
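From what the upstream Thin Arbiter guide shows, the invocation looks roughly like this (quoted approximately; the exact options may differ between GlusterFS versions):

```bash
glusterfsd -N --volfile-id ta-vol \
    -f /var/lib/glusterd/vols/thin-arbiter.vol \
    --brick-port 24007 \
    --xlator-option ta-vol-server.transport.socket.listen-port=24007
```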
This command doesn't work out of the box. For example, it expects a volume file, but that file doesn't exist, and if you try to find a recipe for how to create it or where to get one, you won't find a clear answer. I found a clue in a GlusterFS issue on GitHub where a setup-thin-arbiter.sh script was mentioned. I then found this script in the /extras/ sub-folder of the glusterfs repository, where the requested thin-arbiter.vol file resides as well.
Finally, I found a FOSDEM presentation related to the Thin Arbiter, and this pushed me in the right direction. So, here is the recipe:
Install and Configure the Thin Arbiter
The Thin Arbiter has its own package, so we can install it like any other Linux package.
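On AlmaLinux the package should come from the Storage SIG repository enabled above (I'm assuming the glusterfs-thin-arbiter package name; check your repository if it differs):

```bash
# Install the Thin Arbiter sub-package of GlusterFS
dnf install -y glusterfs-thin-arbiter
```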
Now we can use the setup-thin-arbiter.sh script. You can run it without parameters to get some additional info, but we can go directly to the setup process.
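The package drops the script under /usr/share/glusterfs/scripts/; both the path and the -s (setup) option below are assumptions based on the upstream script, so run it without arguments first if your installation differs:

```bash
# Start the interactive Thin Arbiter setup
/usr/share/glusterfs/scripts/setup-thin-arbiter.sh -s
```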
The script asks for the Thin Arbiter brick path. You can specify any path you want here, but based on the GlusterFS Brick Naming Conventions I'm using this path: /data/glusterfs/brick_ta. When the script asks for confirmation, type Y and let it finish the setup.
Voilà! The Thin Arbiter is up and running as a Linux service. You can verify this.
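On my install the service unit is called gluster-ta-volume (the name is an assumption taken from the upstream packaging; adjust if yours differs):

```bash
# The Thin Arbiter should be reported as active (running)
systemctl status gluster-ta-volume.service
```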
Now the Thin Arbiter node is ready to accept connections from regular GlusterFS nodes for your volume(s). Yes, you can serve more than one GlusterFS volume with this Thin Arbiter and this single brick, as you can see from the setup script messages. We just need to allow the Thin Arbiter communication through the firewall. The Thin Arbiter doesn't need the same open ports as a regular GlusterFS node; it is enough to allow TCP port 24007 only. You can also restrict this rule to the IP addresses of the regular nodes, but that's not necessary in a protected cluster environment.
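Assuming the default firewalld setup on AlmaLinux, opening the port looks like this:

```bash
# Allow GlusterFS management / Thin Arbiter traffic on TCP 24007
firewall-cmd --permanent --add-port=24007/tcp
firewall-cmd --reload
```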
Let's continue with setting up the regular GlusterFS data nodes, in my case the Proxmox instances.
Regular Nodes
First, the HDD or SSD drives on the regular nodes must be configured and mounted so that GlusterFS volume bricks can be created on them. This is a well-documented and straightforward process, but I provide quick how-to steps here:
Install the GlusterFS server on each Proxmox node if not already installed.
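Proxmox 8 is based on Debian 12, so the stock Debian package does the job:

```bash
# Install the GlusterFS server from the Debian repositories
apt update
apt install -y glusterfs-server
```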
List the hard drives available on the system to identify the disk for the GlusterFS brick(s).
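A quick overview with lsblk is enough here:

```bash
# Show block devices with their size, type and current mount points
lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT
```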
In my case, the disk designated for GlusterFS is /dev/sdb.
I can use the whole disk for GlusterFS, so I overwrite the disk's partition table and create one new partition spanning 100% of the disk. The new partition table must be of GPT type. I'm a fan of getting things done in the most optimal way possible, so here is a command that does all of this in one step.
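One way to do it in a single non-interactive command is parted in script mode (the partition name glusterfs_brick1 is just a label of my choice):

```bash
# DANGEROUS: overwrites the partition table of /dev/sdb without asking!
# Create a new GPT label and one partition spanning the whole disk.
parted --script /dev/sdb mklabel gpt mkpart glusterfs_brick1 xfs 0% 100%
```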
WARNING!! The command doesn't prompt for anything, so be sure you specify the correct disk! If you are unsure, or want to use a different partition scheme, I recommend the classical fdisk way. You can find many tutorials on how to do that on the Internet.
Format the new partition with the XFS filesystem.
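A minimal sketch; the -i size=512 inode size follows the common GlusterFS setup recommendation:

```bash
# Create an XFS filesystem on the new partition
mkfs.xfs -i size=512 /dev/sdb1
```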
You can change the partition name "glusterfs_brick1" used above to whatever fits your disk management scheme.
For the next step we must identify the partition's UUID so we can mount it correctly.
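lsblk can show it directly:

```bash
# Show the filesystem UUID of the new partition
lsblk -o NAME,FSTYPE,UUID /dev/sdb
```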
Because GlusterFS is a user-space filesystem, we need to mount the partition now. The GlusterFS documentation recommends creating mount points under /data sub-folders; see the GlusterFS Brick Naming Conventions. Based on this, I have created this mount point for my GlusterFS pve volume.
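For the pve volume that is simply:

```bash
# Create the mount point for the future pve volume brick
mkdir -p /data/glusterfs/pve
```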
Then add a line to /etc/fstab to mount the previously created partition on this mount point. The UUID must be the one discovered by the lsblk command above, in my case 0fc63daf….
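The entry looks like this (replace the placeholder with your actual UUID):

```
UUID=<your-partition-uuid>  /data/glusterfs/pve  xfs  defaults  0  0
```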
Now, verify the result of the previous operations.
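Mount everything from fstab and check that the partition shows up where expected:

```bash
# Mount all fstab entries and verify the new mount
mount -a
lsblk -o NAME,FSTYPE,SIZE,MOUNTPOINT /dev/sdb
```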
We can see that the sdb1 XFS partition is mounted at /data/glusterfs/pve.
Finally, make sure the GlusterFS volume locations have the correct attributes set; if you don't, GlusterFS will complain during the brick creation process.
!!! Do not create the brick subfolders for your volume here. This will be done later, during the volume bring-up process !!!
Now we must allow GlusterFS communication through the Proxmox firewall. Because Proxmox 8 uses GlusterFS version 10, there is one very important change in networking behavior:
From Gluster-10 onwards, the brick ports will be randomized. A port is randomly selected within the range of base-port to max-port as defined in the glusterd.vol file and then assigned to the brick. For example: if you have five bricks, you need to have at least 5 ports open within the given range of base-port and max-port. To reduce the number of open ports (for best security practices), one can lower the max-port value in the glusterd.vol file and restart glusterd to get it into effect.
Because I don't like too much randomness in my network environment, I have restricted this new random port range a little bit by modifying the /etc/glusterfs/glusterd.vol file. GlusterFS uses one port per brick, so the size of this port range must be equal to or greater than the brick count.
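The relevant options in /etc/glusterfs/glusterd.vol are base-port and max-port; the values below match the 49152-49155 range used in the firewall rules later (restart glusterd afterwards so the change takes effect):

```
volume management
    ...
    option base-port 49152
    option max-port  49155
end-volume
```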
Finally, create the appropriate firewall rules to allow the GlusterFS servers to communicate with each other's bricks; an example rule sketch follows the port list below. Adjust the brick port range to fit your glusterd.vol modifications, or use the default range 49152-60999 if you did not modify the glusterd.vol file.
Required ports:
- 24007 - GlusterD
- 24008 - GlusterD RDMA port management
- 49152~49155 - Brick ports
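For illustration, this is roughly how the equivalent entries could look if you edit the host firewall file by hand; the same rules can be added through the Proxmox GUI, and the file path and syntax below are assumptions based on the Proxmox firewall documentation:

```
# /etc/pve/nodes/<nodename>/host.fw
[RULES]
# GlusterD
IN ACCEPT -p tcp -dport 24007
# GlusterD RDMA port management
IN ACCEPT -p tcp -dport 24008
# Brick ports (match your glusterd.vol base-port/max-port range)
IN ACCEPT -p tcp -dport 49152:49155
```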
Start the GlusterFS service.
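On Debian-based Proxmox the management daemon unit is glusterd:

```bash
# Enable the GlusterFS management daemon at boot and start it now
systemctl enable --now glusterd
```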
And verify that the service is running correctly.
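The unit should be reported as active (running):

```bash
systemctl status glusterd
```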
Now we have all the bricks ready for our GlusterFS wall. Let's join them together.
Getting Up the Thin Arbiter GlusterFS Volume
For the purpose of this task, assume these nodes and their appropriate DNS or /etc/hosts records:
- pve1-storage
- pve2-storage
- thin-arbiter
Important notice:
These steps must be performed on one of the real GlusterFS nodes, not on the Thin Arbiter node!! I'll perform the full setup on the pve1-storage node.
First, inform the nodes about each other and establish trust between them.
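From pve1-storage, probe the second data node; the Thin Arbiter is not probed because it doesn't run glusterd:

```bash
# Add the second data node to the trusted storage pool
gluster peer probe pve2-storage
```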
Now we can create the GlusterFS volume. !!The Thin Arbiter brick must be specified as the last brick on the command line!! You can see that I'm not using the force parameter, which is very common in other GlusterFS recipes. That's because we did not pre-create the brick sub-folders and we set the correct attributes on the volume folders in the previous steps. I think that letting things go through without warnings or errors is the best way of getting things done.
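The volume name pve and the brick1 sub-directory names are my choices for this example; the Thin Arbiter brick path is the one configured by setup-thin-arbiter.sh earlier:

```bash
# Two-way replicated volume with a Thin Arbiter (its brick must be last)
gluster volume create pve replica 2 thin-arbiter 1 \
    pve1-storage:/data/glusterfs/pve/brick1 \
    pve2-storage:/data/glusterfs/pve/brick1 \
    thin-arbiter:/data/glusterfs/brick_ta
```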
Finally, start the volume and verify its status.
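Again assuming the volume name pve from the create step:

```bash
gluster volume start pve
gluster volume status pve
```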
You can see that the Thin Arbiter is not mentioned here as a node, and the same applies when you show the peer status.
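The trusted pool lists only the data nodes:

```bash
# Only pve2-storage shows up here; the Thin Arbiter does not
gluster peer status
```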
How can we verify that the Thin Arbiter is working correctly, then? Of course, you can verify the quorum by shutting down nodes and writing to the volume, but the easiest way is to check that a special Thin Arbiter volume file is present at its brick location. So log in to the thin-arbiter node and list the brick directory.
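For example:

```bash
# The special replica ID file lives directly in the Thin Arbiter brick
ls -l /data/glusterfs/brick_ta/
```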
If you can see a special file whose name begins with trusted.afr.<volume_name>-ta-2 followed by a GUID, then the Thin Arbiter is functional, because this file is used to maintain the volume's arbiter status.
If you are curious, you can verify that everything works by mounting the volume with a GlusterFS client and trying to write to it while shutting down one of the nodes. With the Thin Arbiter, writes are still possible even when the second node is down, and when you bring that node back up it will be healed, so both volume bricks end up holding exactly the same data.
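A quick client-side test could look like this (the mount point is arbitrary and the client needs the GlusterFS client packages installed):

```bash
# Mount the volume with the native GlusterFS client
mkdir -p /mnt/pve-test
mount -t glusterfs pve1-storage:/pve /mnt/pve-test
```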
So, that's all about GlusterFS and the Thin Arbiter. I believe you will find this article useful.