Hello, guys
We are using a 3-node cluster with a recent Proxmox VE (2.2): Node01, Node02, and Node-test for quorum, with manual fencing (for the time being).
Last night something strange happened and the cluster reported "Node01 is down" (in fact the node was alive and fully operational :( )
We ran fence_ack_manual Node01 (on Node02) in order to use HA, and all the VMs with HA enabled successfully moved to Node02.
But when we restarted Node01, both Node01 and Node02 started rebooting continuously (one after the other).
Did we miss something? Do we need to run additional commands to tell the cluster that Node01 is back? Should the node leave the cluster/fence domain before it starts again?
Could you please provide an accurate guide to what one should do when a node fails and later recovers? (in terms of cluster/fencing domain management)
Thanks in advance!