You can use a small device like a Raspberry Pi as a QDevice to provide the third vote for quorum. It doesn’t have to be a full Proxmox server.
I would argue that the node shouldn’t be in the cluster if its availability doesn’t match the others. If you remove the part-time node, your pvecm concerns go away.
Now, if you have a failure such that the other 2 nodes get restarted, you can manage the VM startups with delays. If one node completes booting 5 minutes before the other, then have the VMs wait 5 minutes or longer before auto-starting. That way, you’ll have your quorum when the VM starts.
I have 2 nodes and a Raspberry Pi as a QDevice.
I can still power off 1 node (so I have 1 node and an rpi) if I want to.
To avoid split brain, if a node can see the qdevice then it is part of the cluster. If it can't, then the node is in a degraded state.
Qdevices are only recommended in some scenarios, which I can't remember off the top of my head.
With 2 nodes, you can't set up a Ceph cluster (well, I don't think you can).
But you can set up High Availability and use ZFS snapshot replication on a 5-minute interval (so if your VM's host goes down, the other host can start it from a potentially outdated snapshot).
This worked for my project, as I could have a few stateless services that could bounce between nodes, and I had a Postgres VM with streaming replication (Postgres, not ZFS) and failover, which led to a decently fault-tolerant setup.
I will have to look into the QDevice. I do have an old Pi 3 set up as a software-defined radio. I might be able to also set it up as a QDevice.
https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support
Looking at the documentation, it isn't recommended to use a QDevice in a cluster with an odd number of nodes, which I guess I technically have.
If the QNet daemon itself fails, no other node may fail or the cluster immediately loses quorum. For example, in a cluster with 15 nodes, 7 could fail before the cluster becomes inquorate. But, if a QDevice is configured here and it itself fails, no single node of the 15 may fail. The QDevice acts almost as a single point of failure in this case.
But it seems to be more of an issue in clusters with many nodes. In my situation I don't think this is a big deal, because if the QDevice fails and my third server is offline, I am in the same situation I am in now.
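To make the vote arithmetic in the quoted passage concrete, here is a rough Python sketch. It assumes each node has one vote, that quorum is a strict majority of all configured votes, and that the QDevice gets 1 vote in an even-sized cluster and N-1 votes in an odd-sized one (my reading of the linked wiki page, so double-check it there). The function names are made up for illustration and are not part of any Proxmox tool.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def qdevice_votes(node_count: int) -> int:
    """Votes given to the QDevice.

    Assumption (from the Proxmox wiki as I read it): 1 vote for an even
    number of nodes, N-1 votes for an odd number of nodes.
    """
    return 1 if node_count % 2 == 0 else node_count - 1


def max_node_failures(node_count: int, qdevice_configured: bool,
                      qdevice_alive: bool = True) -> int:
    """How many nodes can fail while the remaining votes still reach quorum."""
    qvotes = qdevice_votes(node_count) if qdevice_configured else 0
    needed = quorum_threshold(node_count + qvotes)
    live_qvotes = qvotes if qdevice_alive else 0
    # Each surviving node contributes one vote on top of the QDevice's votes.
    return node_count + live_qvotes - needed


if __name__ == "__main__":
    print(max_node_failures(15, qdevice_configured=False))
    print(max_node_failures(15, qdevice_configured=True, qdevice_alive=False))
    print(max_node_failures(2, qdevice_configured=True))
```

Under those assumptions, the 15-node example tolerates 7 node failures without a QDevice and none once a configured QDevice has died, which matches the quote. The same arithmetic gives a 3-node cluster zero tolerated node failures while its QDevice is down, so it may be worth checking that case before relying on it.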
Just out of curiosity, do you back up your Pi at all? I'm not sure what the recovery process is if the QDevice fails, or how easy it is to replace and set up again.
Please do add a tag to your post as stated on the sublemmy sidebar! Thank you. :)
ok
You'll need a QDevice to keep consensus. That wiki article will cover how to set it up and some drawbacks to QDevices. You should be able to run it on a low-power device like a Pi to keep the cluster going.
AFAIK forcing it to a lower number is fine if you're not doing HA. I remember reading something along those lines on a forum, but I could be remembering wrong.
If you're not using Ceph or HA, then I don't think there would be any negative effects from not having all the servers in the cluster ready.
Oh good, I am not using any of those, at least not at the moment.
If you are not using any HA features and only put the servers into the same cluster for ease of management, you could use the same command but with a value of 1.
Quorum exists to prevent a server from arbitrarily failing over VMs when it believes the other node(s) are down, which would create a split-brain situation.
But if that risk does not exist to begin with, neither does the need for quorum.
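A tiny Python sketch of why lowering the expected votes works, and what it gives up. It assumes a node counts as quorate when the votes it can see form a strict majority of the expected votes; `is_quorate` is a made-up helper for illustration, not a Proxmox or corosync API.

```python
def is_quorate(visible_votes: int, expected_votes: int) -> bool:
    """True if the visible votes are a strict majority of the expected votes."""
    return visible_votes >= expected_votes // 2 + 1


if __name__ == "__main__":
    # 3-node cluster, only one node powered on, default expectation of 3 votes.
    print(is_quorate(visible_votes=1, expected_votes=3))
    # Same node after the expected votes are lowered to 1.
    print(is_quorate(visible_votes=1, expected_votes=1))
```

The flip side is that with an expectation of 1, two nodes that cannot see each other would each pass this check on their own. That is the split-brain case quorum is meant to prevent, and it only becomes a problem when something like HA acts on it.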
I haven't tested this at all, it just popped into my head, but could you create a VM on one of the nodes and join that to the cluster?
If it does work, I wouldn't recommend it. But I'd be curious to see if that would work.
That leads to a chicken-and-egg situation. The Proxmox cluster can't turn on the VM because the VM isn't running to provide the third vote in the cluster :)