Nice video man... thanks for sharing!!!🎉🎉🎉🎉
In 14 minutes you have explained and demonstrated better than many 1-hour videos, precise and without unnecessary detours, thank you very much
You're very welcome, I'm glad you found it useful! Thank you for watching :)
He certainly did
Agree with this comment, thanks for the video
Awesome video. Just need a video on cluster recovery now. :)
Yes, I will work on that, thanks for the suggestion!
Finally a video like this. Not too much blah blah but not too fast. In 15 minutes, all the info you need
I'm glad it was helpful! Thank you for watching
Instant subscription. Concise, pertinent, and well-paced information.
Thank you - now I’m off to hunt down some more of your content 😁
Welcome aboard! I'm glad you found it useful, thank you for watching
...a big thumbs up from me :) Your video helped me a lot. Others are right, a pill of specific knowledge in 15 minutes. And I don't speak English well, and I turned 60 a month ago :)
Best regards, Sedece :)
Glad it was helpful! thank you for watching :)
Sorry, I have to add something :)
What I missed in your video was the appropriate network configuration that enables the best operation of the HA cluster. I had to dig into the documentation. It's not a flaw in the video, but I had a problem with data transfer. They recommend 10Gbps or more...
Regards.
Great video. Now it is completely clear how to configure Proxmox VE HA with Ceph!
Glad to hear, thank you for watching! :)
Just got into Proxmox and was wondering how you go about clustering, etc. Excellent video to get me started
I'm glad it was helpful, thank you for watching!
LOL Great job 👍 Clear without bullsh*t. Thank you very much.
Glad you liked it!
Great content. It was exactly what I was looking for for my home lab, thank you so much.
Glad it was helpful, thank you for watching!
Great video, straight to the point! Liked and Subscribed!
I'm glad it was helpful! thanks for watching :)
Just a note when you tested migration - there was a single ping of downtime. macOS pings skip a sequence number (before going to host unreachable) - they don't show timeouts like Windows does. Still a great demonstration though.
Yeah, I realized after the fact; indeed I lost 1 ping during the migration, not too shabby though. Thank you for watching! :)
Thank you for this video. My question is: why did you use 3 nodes? Why not 2?
3 nodes is usually the standard for redundancy in production environments, to more easily do maintenance on 1 node while offloading all the VMs to the other 2, but a 2-node cluster is also possible. Thank you for watching!
@@distrodomain Thank you for smart support.
Thanks for the explanation, the tutorials are appreciated. I just get confused with HA. I thought that by setting up and joining a cluster, Proxmox would have HA. Creating ZFS storage on just one of the nodes won't give HA, right? So only if I create a pool of OSD drives will it have the HA function? And if the OSD drive is a conventional hard drive, let's say a 16TB non-SSD drive, will performance drop and cause issues? Thanks
HA will only work if you have Ceph configured with more than 1 drive, or if you are using ZFS replication across multiple drives. For the 16TB drive, your VMs will still have the performance of 1 drive, so around 100MB/s read/write. Thank you for watching :)
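As a rough sketch (the node name pve2, VM ID 100, and the 15-minute schedule are just example values), ZFS replication can be set up from the shell with pvesr:
# on the node currently running VM 100, replicate its disks to pve2 every 15 minutes
pvesr create-local-job 100-0 pve2 --schedule "*/15"
# check when the last replication ran
pvesr status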
Excellent video, clear and concise. 2 questions: can you HA a VM that has PCI passthrough if the device is on several but not all nodes? And what happens if you have more storage on one node than the others in Ceph?
For PCIe passthrough I haven't fully tested it, but I believe it is not supported. Ceph will effectively cap usable capacity at your smallest drive, which is why you want almost identical drives on each node. Thank you for watching!
In my cluster... the VM shuts down and then turns back on during the migration. Why?
Do you have dedicated storage on each node for Ceph? What does your setup look like? Thank you for watching! :)
If you shut down the node, the VM has to restart. It's also shown in this video that the Rocky VM is restarting. Only when all nodes are up can you do a live migration without restarting. The reason is the content of the RAM of the VM on the node that's shutting down: it cannot be copied if one node loses power. For a planned shutdown there is a setting, in the last HA part of the video, that you can set to 'migrate', if I remember correctly.
13:35, request state: I think you can set it to 'migrate' there. Not 100% sure, but if I remember correctly, the VMs are auto-migrated before the machine shuts down.
Yes, if the node loses all power all of a sudden, the VM needs to restart; the contents of RAM would not have time to migrate. But the VM hard drive should be replicated on the other nodes with Ceph, so the VM should be able to boot back up. I will do some testing to confirm. Thank you for watching! :)
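For the planned-shutdown case, a minimal sketch is to set the HA shutdown policy to 'migrate' in /etc/pve/datacenter.cfg (also exposed in the GUI under Datacenter -> Options), so VMs are moved off a node before it powers down:
# /etc/pve/datacenter.cfg
ha: shutdown_policy=migrate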
I have a question: when a node fails, is it possible to transfer the VMs to just a specific node? And where can I change that? Thanks for the support in your videos
Yes, it's possible. You can play around with HA groups; they let you specify a host. Thanks for watching!
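A minimal sketch with the ha-manager CLI, assuming hypothetical node names pve2/pve3 and VM ID 100; the priorities decide which node the VM prefers on failover:
# group where pve2 is preferred (higher priority) and pve3 is the fallback
ha-manager groupadd prefer-pve2 --nodes "pve2:2,pve3:1"
# put the VM under HA and pin it to that group
ha-manager add vm:100 --group prefer-pve2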
One of the best demonstrations I've ever seen. Even though I've never worked on Proxmox before, I understood everything you just did. The only drawback: if you use the Community Edition on a disconnected network (no internet connection), how are you going to install Ceph? Can you shed some light on that?
You would need internet to update and install packages first. If that is not an option, then create a local mirror of the Proxmox repository and configure that new repo on the nodes.
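A hedged sketch of the offline approach, assuming a hypothetical internal mirror at http://mirror.lan populated (e.g. with apt-mirror or debmirror) from the public Proxmox no-subscription repositories:
# /etc/apt/sources.list.d/pve-local.list on each node (Proxmox 8 on Debian bookworm assumed)
deb http://mirror.lan/proxmox/debian/pve bookworm pve-no-subscription
deb http://mirror.lan/proxmox/debian/ceph-quincy bookworm no-subscription
# then on each node: apt update && pveceph install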
Hi, what configuration should I apply to enable the minimum number of available copies to be 1 for a cluster with 2 nodes and a qdevice? Cheers.
great tutorial clear and concise 👍
Thank you for watching! :)
Nice work, noob question coming... In this example you used 3x 480GB disks for your 3 OSDs. Does that give you a total usable pool size of roughly 1.4TB (stripe)? Or does it give you the usable total pool size of 1 drive (i.e. Ceph consumes 3 drives but only 1 drive's worth of usable space)? I ask before doing this on 3x 24-bay servers... want to make sure I have enough usable storage.
It gives you the total of 1 drive. Think of it as a 3-way mirror where the data and VM disks are replicated across the 3 drives. Thank you for watching :) Let me know if you have any other questions.
@@distrodomain my understanding was that Ceph gets better/faster/larger with each node. Does adding a 4th 480GB node make it a 4-way mirror? If so, how would one ever go about increasing the pool size aside from adding 1 extra HDD to every node? What happens if you only add 1 extra HDD to one node but not the others? Thanks again!
You can add additional disks to each node, like another 480GB SSD, to have 2 on each node. For each new drive you create an OSD the same way and add it to the pool. To keep things consistent I recommend using the same number of drives and the same capacity in each node.
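As a sketch, assuming the new disk shows up as /dev/sdb (the device name is just an example), turning it into an OSD from the shell looks like this, and Ceph rebalances the pool on its own afterwards:
pveceph osd create /dev/sdb   # create an OSD on the new, empty disk
ceph osd df tree              # verify the new OSD and its usage
ceph -s                       # watch the rebalance finish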
@Distro Domain: You didn't show the creation of the VM. You showed that the migration is immediate but didn't mention why. I suppose you've placed its disk(s) on pve-storage.
When you create the VM you want to use the pve-storage pool that we create in the video; this is the Ceph pool containing the 3 OSD devices that correspond to the 3 drives, 1 in each node. Thank you for watching :)
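For example, creating a VM from the CLI with its disk directly on that pool could look like this (the VM ID, name, memory, and 32G disk size are arbitrary example values):
# allocate a 32G disk on the Ceph-backed pve-storage pool
qm create 101 --name test-vm --memory 2048 --net0 virtio,bridge=vmbr0 --scsi0 pve-storage:32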
Nice video! When you create a new VM, which storage do you have to select, the default one or the Ceph pool? And when you mark a VM for HA, will the node move it to the Ceph pool? Thanks for your work!!
Ideally, if you create a Ceph configuration like this, you should put all your VMs in the Ceph pool. If you have a VM whose drive is not in a Ceph pool, you need to move the VM drive to the Ceph pool yourself before the VM can be HA. Thanks for watching!
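A minimal sketch for an existing VM whose disk still lives on local storage (VM ID 100 and disk scsi0 are assumptions; on older releases the command is qm move_disk):
# move the disk onto the Ceph pool and drop the old copy
qm disk move 100 scsi0 pve-storage --delete 1
# now the VM can be managed by HA
ha-manager add vm:100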
I have a 3-node cluster with Proxmox 8.2.2. The three servers have public IPs assigned and are in three different locations.
My problem is that I have installed Ceph, but the public IP on each of the servers is different and on a different network. How do I resolve this and create the Ceph cluster? And if that's not possible, how do I create HA storage across these three nodes? I have tried NFS but migration takes a lot of time.
For Ceph you need high bandwidth, like 10G, and low latency for the replication, depending on how taxed the cluster will be, and you want a network separate from the main network, so you would need a second public IP for each node dedicated just to Ceph. You can also try ZFS replication. For multiple locations I would recommend having a cluster of 2 nodes in each location and then a migration plan for each location if you need to move VMs around. Thank you for watching!
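For the dedicated Ceph network, the relevant lines end up in the [global] section of /etc/pve/ceph.conf; the subnets below are just placeholders for whatever you assign:
public_network  = 192.0.2.0/24     # client and monitor traffic
cluster_network = 198.51.100.0/24  # OSD replication traffic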
I have 3 Dell servers and configured RAID on my Dell PERC. I have a problem creating a pool in Ceph.
What are the issues you are encountering?
Are you installing the cluster on 3 PCs, or 2 PCs and a Raspberry Pi? Could they be different devices, so in case 1 node goes down the low-power node is up and running? Ceph looks great for using the same storage across the 3 nodes, nice!!
It is recommended to have the same machines with the same storage for the nodes, but it can work with different ones, as long as the rest of the nodes can take the load to spin the VMs that went down back up again.
Before the cluster config I created VMs on both nodes. Is it possible to create the cluster?
You should be able to create the cluster with VMs already running; let me know if you get errors. Thank you for watching!
@@distrodomain Thanks for the reply. It detected the following error(s):
* this host already contains virtual guests
TASK ERROR: Check if node may join a cluster failed! In this case I had created VMs on both nodes; even after I stopped the VMs, the cluster join still gets this error.
@@distrodomain Yes, I got a different error after configuring the cluster. Both nodes have the same VMID (VM 100), and in this case it doesn't allow both machines to run: only one with that ID works and the other won't run. But my understanding is I need to change the VMID on both the machine and its disk.
How would this compare to using another machine for the shared storage, running TrueNAS or something? Would it be more optimized or less? For example, you have a 3-node HA Proxmox cluster and then another machine for the shared storage (or several, if that's HA as well).
For a hyperconverged setup you could create 3 iSCSI targets with a dedicated vdev for each node; this way Linux will see them as block devices and you can run the Ceph setup on those block devices. I have not tried this setup but I'm curious about the outcome. If you only have a 1G network you might encounter performance issues; you'll be limited to about 125MB/s read/write on the VM disks. Thank you for watching :)
@@distrodomain Interesting, I will have to try that out. Thanks for the reply
I have TrueNAS as well as Ceph, with dedicated 10Gb links for it too. Nevertheless, the problem with TrueNAS is that it then becomes another single point of failure; I'd rather use a slower but more resilient Ceph than TrueNAS.
Thanks, great video.
Glad you liked it! Thank you for watching :)
This reminds me of the Gluster nightmare.
This is a gem.
I'm glad you enjoyed it! thank you for watching
Hi, how do I change the IP on a node and have the cluster keep working? 😂
Check these files: /etc/network/interfaces, /etc/hosts, /etc/pve/corosync.conf. Thank you for watching!
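Roughly, the change touches these spots (the addresses are placeholders, and corosync.conf only needs to be edited once since it is synced cluster-wide):
# /etc/network/interfaces  -> update the 'address' line of the bridge/NIC
# /etc/hosts               -> update the entry for this node's hostname
# /etc/pve/corosync.conf   -> update this node's ring0_addr and bump config_version
systemctl restart networking corosync pve-cluster   # or simply reboot the node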
How did you replicate the 500GB SSD to each of the three nodes? Thanks for the video.
In the part of the video where I am setting up the OSDs and the Ceph pool, that's where we are setting up the replication: the three 500GB drives are part of the pool and the Ceph monitors keep track of the replication. Thank you for watching!
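You can see the replication level the monitors enforce from any node, for example (the pool name matches the one created in the video; 3/2 are the typical defaults):
ceph osd pool get pve-storage size      # -> size: 3 (number of copies)
ceph osd pool get pve-storage min_size  # -> min_size: 2
ceph -s                                 # overall health and recovery status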
@@distrodomain Got it. You have a 500GB SSD drive on each node. Thanks.
@@crc1219 Yes, exactly. Each node has the same storage: 1 256GB NVMe SSD for the OS, and 1 500GB SSD for Ceph replication, used for storing the VMs.
It doesn't work for me,
can't set up the cluster
At what step are you getting stuck?
great video....thanks 🤗👍
nice video... thnx man
Can this be done with just two nodes?
Yes, this can be done with 2 nodes, and you can later add nodes and grow your Ceph pool. Thank you for watching :)
What happens if the Ceph manager goes down?
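When you grow it later, the rough flow on a brand-new node would be (the IP and device name are placeholders):
pvecm add 192.0.2.10          # run on the NEW node, pointing at an existing cluster member
pveceph install               # install the Ceph packages on the new node
pveceph osd create /dev/sdb   # turn its spare disk into an OSD; the pool rebalances itself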
The manager provides the dashboard; it collects and distributes statistics.
Managers are needed for a well-running Ceph system, but they are not essential. So if all your managers went down, there would be no immediate problem. You just start another manager on a node that is still up.
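In Proxmox that is basically a one-liner on any surviving node, something like:
pveceph mgr create   # start another manager daemon on this node
ceph -s              # confirm which mgr is now active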
Thanks for explaining
genius!
Thank You! :)
It's working 👍 thanks
Nice, thank you for watching!
I do not have Ceph enabled, but I'm using ZFS replication with HA enabled. For VM migration, no pings were lost. For a down node, it takes about 3 minutes before pings start responding again.
Each of my nodes is an Intel NUC8. How much overhead does Ceph add to PVE vs ZFS replication?
If it's for a homelab there is not much difference, though Ceph is made more for enterprise deployments. The biggest difference is that Ceph does data redundancy at the block or object level, whereas ZFS does redundancy with whole disks. Need more space on Ceph? Just add more disks and it will rebalance itself. Need to retire old disks? Just pull them out and the cluster will rebalance itself. Same for adding and removing nodes.
Also, with Ceph it is recommended to have a high-throughput network for rebuilds and rebalancing. I run it on a 1G network and I have had no issues, but many users recommend a dedicated 10G switch and 10G NICs. Thank you for watching :)
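As a sketch, retiring an old disk looks like this (OSD ID 3 is just an example); wait for the rebalance to finish before destroying it:
ceph osd out osd.3     # stop placing data on it, rebalancing starts
ceph -s                # wait until health is back to HEALTH_OK
pveceph osd destroy 3  # then remove the OSD for good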
plain and simple 👍
I'm glad it was helpful! :)