this post was submitted on 15 Jun 2023
10 points (100.0% liked)

How do you handle Proxmox clusters when you only have 1 or 2 servers?

I technically have 3 servers, but I keep one offline because I don't need it 24/7. There's no point wasting power on a server I don't need.

I believe I read somewhere that you can force Proxmox to a lower quorum number, but it isn't recommended. Has anyone done this, and if so, have you run into any issues with it?

My main issue is that I want my VMs to start no matter what. For example, I had a power outage. When the servers came back online, instead of starting the VMs they waited for the quorum number to reach 3 (it will never reach 3 because the third server wasn't turned on), so they just waited forever until I got home and ran

pvecm expected 2

[–] 4am 2 points 1 year ago

You can use a small device like a Raspberry Pi as a Qdevice to be the third vote in quorum. It doesn’t have to be a full Proxmox server.

[–] HybridSarcasm 2 points 1 year ago

I would argue that the node shouldn’t be in the cluster if its availability doesn’t match the others. If you remove the part-time node, your pvecm concerns go away.

Now, if you have a failure such that the other 2 nodes get restarted, you can manage the VM startups with delays. If one node completes booting 5 minutes before the other, then have the VMs wait 5 minutes or longer before auto-starting. That way, you’ll have your quorum when the VM starts.
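A minimal sketch of how such a delay could be configured from the CLI. It assumes a reasonably recent Proxmox VE where `pvenode config set` accepts `--startall-onboot-delay` (to my knowledge capped at 300 seconds), and the VM ID 100 is a hypothetical placeholder. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 and run as root on the Proxmox node to apply for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical VM ID; replace with one of your own.
VMID="${VMID:-100}"

# Node-wide: wait 300 seconds after boot before starting any guest
# that has start-on-boot enabled (assumed option, see lead-in).
run pvenode config set --startall-onboot-delay 300

# Per VM: enable start on boot and put it first in the start order.
# 'up=60' is the pause in seconds before the NEXT guest in the order starts.
run qm set "$VMID" --onboot 1 --startup order=1,up=60
```

Note that the per-VM `up` value spaces out guests relative to each other, while the node-wide delay is what gives the second node time to finish booting.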

[–] [email protected] 1 points 1 year ago (1 children)

I have 2 nodes and a raspberry pi as a qdevice.
I can still power off 1 node (so I have 1 node and an rpi) if I want to.
To avoid split brain, if a node can see the qdevice then it is part of the cluster. If it can't, then the node is in a degraded state.
Qdevices are only recommended in some scenarios, which I can't remember off the top of my head.

With 2 nodes, you can't set up a Ceph cluster (well, I don't think you can).
But you can set up High Availability and use ZFS snapshot replication on a 5-minute interval (so, if your VM's host goes down, the other host can start it from a potentially outdated snapshot).
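A rough sketch of that replication plus HA setup from the CLI, using `pvesr` and `ha-manager`. The VM ID 100 and the target node name `pve2` are hypothetical placeholders, and the VM's disks are assumed to live on ZFS storage that exists on both nodes. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 and run as root on the node that hosts the VM to apply.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical names; replace with your own VM ID and second node.
VMID="${VMID:-100}"
TARGET="${TARGET:-pve2}"

# Replicate the VM's ZFS disks to the other node every 5 minutes.
# The job ID has the form <vmid>-<job number>.
run pvesr create-local-job "${VMID}-0" "$TARGET" --schedule '*/5'

# Let the HA manager restart the VM on the surviving node.
run ha-manager add "vm:${VMID}" --state started
```

HA still depends on quorum, so on two nodes this only helps together with a third vote such as the QDevice.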

This worked for my project, as I could have a few stateless services that could bounce between nodes, and I had a postgres VM with streaming replication (postgres, not ZFS) and failover, which led to a decently fault-tolerant setup.

[–] [email protected] 1 points 1 year ago

I will have to look into the qdevice. I do have an old Pi 3 set up as a software-defined radio. I might be able to also set it up as a qdevice.

https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support

Looking at the documentation, it isn't recommended to use a qdevice in a cluster with an odd number of nodes, which I guess I technically have.

If the QNet daemon itself fails, no other node may fail or the cluster immediately loses quorum. For example, in a cluster with 15 nodes, 7 could fail before the cluster becomes inquorate. But, if a QDevice is configured here and it itself fails, no single node of the 15 may fail. The QDevice acts almost as a single point of failure in this case.

But it seems to be more of an issue in large node clusters. In my situation I don't think this is a big deal because if the qdevice fails and my third server is offline I am in the same situation I am now.
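The arithmetic behind the quoted 15-node example can be sketched in shell. This assumes, as I understand the Proxmox docs, that quorum needs a strict majority of all expected votes, that each node has one vote, and that a QDevice in an odd-sized cluster carries N-1 votes. The function names are made up for illustration.

```shell
#!/bin/sh
# Votes needed for quorum: a strict majority of all expected votes.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Assumption: with an odd node count the QDevice contributes N-1 votes,
# with an even node count it contributes 1.
qdevice_votes() {
    if [ $(( $1 % 2 )) -eq 1 ]; then
        echo $(( $1 - 1 ))
    else
        echo 1
    fi
}

# How many one-vote nodes may fail while the cluster stays quorate.
# $1 = node count, $2 = QDevice votes currently available,
# $3 = total expected votes (nodes plus configured QDevice votes).
tolerable_node_failures() {
    need=$(quorum_needed "$3")
    echo $(( $1 + $2 - need ))
}

nodes=15
qv=$(qdevice_votes "$nodes")
total=$(( nodes + qv ))

echo "no QDevice:      $(tolerable_node_failures "$nodes" 0 "$nodes")"     # 7
echo "QDevice healthy: $(tolerable_node_failures "$nodes" "$qv" "$total")" # 14
echo "QDevice failed:  $(tolerable_node_failures "$nodes" 0 "$total")"     # 0
```

With the QDevice down, the 15 node votes exactly equal the 15 needed out of 29, which is why no single node may fail in that case.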

Just out of curiosity, do you back up your Pi at all? I'm not sure what the recovery process is if the qdevice fails, or how easy it is to replace and set up again.

[–] [email protected] 1 points 1 year ago (1 children)

Please do add a tag to your post as stated on the sublemmy sidebar! Thank you. :)

[–] [email protected] 1 points 1 year ago

You'll need a QDevice to keep consensus. That wiki article will cover how to set it up and some drawbacks to QDevices. You should be able to run it on a low-power device like a Pi to keep the cluster going.
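As a rough outline of what that wiki section describes (written from memory, so check the current docs before running anything), the setup comes down to a package on the Pi, a package on every node, and one `pvecm` call. The address below is a hypothetical placeholder. The script only prints a checklist, because each step belongs on a different machine.

```shell
#!/bin/sh
# Prints the QDevice setup steps as a checklist; nothing is executed.
# QDEVICE_IP is a hypothetical placeholder (documentation address range).
QDEVICE_IP="${QDEVICE_IP:-192.0.2.10}"

qdevice_plan() {
    # On the Pi: the daemon that provides the external vote.
    echo "pi:        apt install corosync-qnetd"
    # On every Proxmox node: the client side.
    echo "each node: apt install corosync-qdevice"
    # On one Proxmox node: register the Pi with the cluster.
    echo "one node:  pvecm qdevice setup $QDEVICE_IP"
    # On any node: confirm the extra vote shows up.
    echo "any node:  pvecm status"
}

qdevice_plan
```

As far as I recall, the setup step copies the cluster's SSH key to the Pi, so root SSH login on the Pi may need to be allowed at least temporarily.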

[–] [email protected] 1 points 1 year ago (2 children)

AFAIK forcing it to a lower number is fine if you're not doing HA. I remember reading something along those lines on a forum, but I could be remembering wrong.

If you're not using Ceph or HA, then I don't think there would be any negative effects from not having all the servers in the cluster ready.

[–] [email protected] 1 points 1 year ago

Oh good, I am not using any of those, at least not at the moment.

[–] [email protected] 1 points 1 year ago

If you are not using any HA features and only put the servers into the same cluster for ease of management, you could use the same command but with a value of 1.

The reason quorum exists is to prevent any server from arbitrarily failing over VMs when it believes the other node(s) are down, which would create a split-brain situation.

But if that risk does not exist to begin with, neither does the need for quorum.

[–] [email protected] 0 points 1 year ago (1 children)

I haven't tested this at all, it just popped into my head, but could you create a VM on one of the nodes and join that to the cluster?

If it does work, I wouldn't recommend it. But I'd be curious to see if that would work.

[–] [email protected] 1 points 1 year ago

That leads to a chicken-and-egg situation. The Proxmox cluster can't turn on the VM because the VM isn't on to be the third node in the cluster :)