Hello all,
Please bear with me, I'm a bit desperate here; let me know if you need any additional information.
I've been working with Proxmox VE for almost two years now and decided to go a little further with HA, so we acquired two Dell PowerEdge servers and prepared them as a two-node cluster.
Everything went great until I applied the fencing rules. I followed the how-to at http://pve.proxmox.com/wiki/Two-Node_High_Availability_Cluster even though it seems to be written for the beta version. Fencing is configured against the iDRAC6 network cards with the 'reboot' option, and the problem comes when I try to test the environment:
1. Both nodes are running. I power off the node that has no machines on it; so far so good, the second node detects that the node is down.
2. When the failed node boots back up, it automatically reboots the node that is still working, leaving all running machines off... good thing it is not in production yet.
3. The freshly booted node then keeps rebooting the node with the machines forever, so it is impossible to reach it.
I would like the surviving node to wait until the other node comes back up. Am I supposed to delete and re-add the node after a failure? I'm sure I am missing something here.
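From what I have read, fenced has post_join_delay and post_fail_delay settings that control how long it waits before fencing after a membership change, and I suspect that is the kind of knob I am missing. Just as a sketch of what I mean (not applied yet, values are placeholders), it would be an extra line in cluster.conf:
Code:
<!-- hypothetical: give the peer 60 s to join the fence domain before it gets fenced -->
<fence_daemon post_join_delay="60" post_fail_delay="0"/>
Is that the right direction, or is the reboot loop caused by something else entirely?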
This is my configuration:
Storage
RAID 1, 80 GB hard disk on which the OS is installed.
RAID 5, 1024 GB hard disk configured with DRBD (http://pve.proxmox.com/wiki/DRBD).
I have no problems with DRBD synchronization, and every split-brain so far has recovered cleanly. I've set the sync rate parameter to 110M so it synchronizes faster (the nodes sync over a dedicated, directly connected GbE NIC).
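To be precise, by "sync parameter" I mean the DRBD syncer rate, i.e. something along these lines in the resource definition (DRBD 8.3 syntax as on the wiki page; r0 is just the example resource name, my real file may differ in the details):
Code:
resource r0 {
    syncer {
        rate 110M;   # cap resync traffic at roughly 110 MB/s over the dedicated GbE link
    }
    # protocol, disk and net sections as described on the DRBD wiki page
}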
(All RAID is hardware RAID; all disks are local.)
NFS: 5 TB shared storage for backups.
NFS: 1024 GB shared storage for ISOs.
I have also set up a quorum disk, but it is not included in cluster.conf yet; I mention it just in case it has something to do with the problem.
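For reference, this is roughly how I understand the quorum disk would eventually be declared in cluster.conf (the label and timings are placeholders, and I assume expected_votes/two_node would need adjusting at the same time):
Code:
<!-- hypothetical qdiskd entry, not in my config yet -->
<quorumd interval="1" tko="10" votes="1" label="dellHA_qdisk"/>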
Network
Each server has six GbE NICs plus the dedicated iDRAC NIC.
The static routes below are there so I can SSH into the nodes through a VPN.
Code:
root@hypvdell02:~# cat /etc/network/interfaces
# network interface settings
auto lo
iface lo inet loopback

iface eth0 inet manual

iface eth1 inet manual

iface eth2 inet manual

iface eth3 inet manual

iface eth4 inet static
        address 192.168.0.27
        netmask 255.255.255.0

auto eth5
iface eth5 inet static
        address 10.0.0.23
        netmask 255.255.255.0

auto bond0
iface bond0 inet static
        address 192.168.0.25
        netmask 255.255.255.0
        slaves eth0 eth1 eth2
        bond_miimon 100
        bond_mode 802.3ad

auto vmbr0
iface vmbr0 inet static
        address 192.168.0.23
        netmask 255.255.255.0
        gateway 192.168.0.1
        bridge_ports bond0
        bridge_stp off
        bridge_fd 0
        up route add -net 10.12.0.0 netmask 255.255.255.0 gw 192.168.0.111
        down route del -net 10.12.0.0 netmask 255.255.255.0 gw 192.168.0.111
Cluster config
Code:
root@hypvdell02:~# cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster config_version="24" name="dellHA">
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1->" ipaddr="192.168.0.20" login="root" name="fencenode1" passwd="5SVbXsVi58S0w7YEbWOJ" secure="1"/>
    <fencedevice agent="fence_drac5" cmd_prompt="admin1->" ipaddr="192.168.0.21" login="root" name="fencenode2" passwd="BPITqVvrZLmK8c1=-gT8" secure="1"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="hypvdell1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device action="reboot" name="fencenode1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="hypvdell02" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device action="reboot" name="fencenode2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <rm>
    <pvevm autostart="1" vmid="100"/>
    <pvevm autostart="1" vmid="101"/>
    <pvevm autostart="1" vmid="501"/>
  </rm>
</cluster>
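One thing I wondered about while pasting this is whether I should delay the fencing of one node so the two nodes cannot kill each other at the same moment. Assuming fence_drac5 accepts the standard delay option (15 is an arbitrary placeholder), node 1's device line would become something like:
Code:
<!-- hypothetical: fencing of hypvdell1 is delayed 15 s, so it survives a simultaneous fence race -->
<device action="reboot" name="fencenode1" delay="15"/>
Would that help here, or is it unrelated to the boot-time loop I am seeing?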
Additional Info
Code:
root@hypvdell02:~# tail /var/log/cluster/fenced.log
Mar 20 14:40:00 fenced fenced 1352871249 started
Mar 20 14:40:52 fenced fencing node hypvdell1
Mar 20 14:41:02 fenced fence hypvdell1 dev 0.0 agent fence_drac5 result: error from agent
Mar 20 14:41:02 fenced fence hypvdell1 failed
Mar 20 14:41:05 fenced fencing node hypvdell1
Mar 20 14:41:15 fenced fence hypvdell1 dev 0.0 agent fence_drac5 result: error from agent
Mar 20 14:41:15 fenced fence hypvdell1 failed
Mar 20 14:41:18 fenced fencing node hypvdell1
Mar 20 14:41:26 fenced fence hypvdell1 dev 0.0 agent fence_drac5 result: error from agent
Mar 20 14:41:26 fenced fence hypvdell1 failed
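Given the "error from agent" lines, I guess the next step is to run the fence agent by hand against node 1's iDRAC with the same values as in the fencedevice entry. I am assuming the standard fence_drac5 options here (-x for SSH since secure="1", -c for the command prompt, -o for the action):
Code:
# hypothetical manual check of the agent, password as in the fencedevice entry
fence_drac5 -a 192.168.0.20 -l root -p '5SVbXsVi58S0w7YEbWOJ' -x -c 'admin1->' -o status
Is that the right way to verify the agent outside of fenced?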
Code:
root@hypvdell02:~# tail /var/log/cluster/corosync.log
Mar 20 14:39:56 corosync [CLM ] Members Left:
Mar 20 14:39:56 corosync [CLM ] Members Joined:
Mar 20 14:39:56 corosync [CLM ] r(0) ip(192.168.0.23)
Mar 20 14:39:56 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Mar 20 14:39:56 corosync [CMAN ] quorum regained, resuming activity
Mar 20 14:39:56 corosync [QUORUM] This node is within the primary component and will provide service.
Mar 20 14:39:56 corosync [QUORUM] Members[1]: 2
Mar 20 14:39:56 corosync [QUORUM] Members[1]: 2
Mar 20 14:39:56 corosync [CPG ] chosen downlist: sender r(0) ip(192.168.0.23) ; members(old:0 left:0)
Mar 20 14:39:56 corosync [MAIN ] Completed service synchronization, ready to provide service.
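If any more state would help, I can post the output of these from the surviving node (just listing the commands for now):
Code:
pvecm status       # Proxmox view of the cluster and quorum
cman_tool nodes    # cman membership and votes
fence_tool ls      # fence domain state
clustat            # rgmanager view of nodes and HA services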