
DRBD Diskless after 48 hours

Hi all,

I have a problem with my Proxmox cluster and the DRBD replication between the nodes. I set everything up and it worked fine: the cluster ran perfectly, including migration and backups of all VMs. The storage is LVM on top of DRBD, and each DRBD volume group is 1 TB. But after some time DRBD gets into this state:

Code:

 
0:r0  Connected Primary/Primary UpToDate/Diskless C r----- lvm-pv: drbdvg0 931.29g 861.00g
1:r1  Connected Primary/Primary UpToDate/UpToDate C r----- lvm-pv: drbdvg1 931.29g 0g
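
So resource r0 has lost its backing disk on one node (Diskless) while the other side is still UpToDate; r1 looks healthy. For completeness, these are the commands I use to check the state on each node (a minimal sketch; the resource names match my config below):

Code:

# Overall DRBD status on this node
cat /proc/drbd

# Per-resource states: local/peer disk state and connection state
drbdadm dstate r0    # here it reports Diskless on one side
drbdadm cstate r0    # reports Connected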

In dmesg I see:

Code:

block drbd0: Starting worker thread (from cqueue [2626])
block drbd0: open("/dev/sdb1") failed with -16
block drbd0: drbd_bm_resize called with capacity == 0
block drbd0: worker terminated
block drbd0: Terminating worker thread
block drbd1: Starting worker thread (from cqueue [2626])
block drbd1: disk( Diskless -> Attaching )
block drbd1: Found 4 transactions (70 active extents) in activity log.
block drbd1: Method to ensure write ordering: barrier
block drbd1: max BIO size = 131072
block drbd1: drbd_bm_resize called with capacity == 1953064672
block drbd1: resync bitmap: bits=244133084 words=3814580 pages=7451
block drbd1: size = 931 GB (976532336 KB)
block drbd1: bitmap READ of 7451 pages took 37 jiffies
block drbd1: recounting of set bits took additional 36 jiffies
block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
block drbd1: disk( Attaching -> UpToDate )
block drbd1: attached to UUIDs 70A8363B4F73C19E:0000000000000000:43AC9F762F8AF4F7:43AB9F762F8AF4F7
block drbd0: Starting worker thread (from cqueue [2626])
block drbd0: conn( StandAlone -> Unconnected )
block drbd0: Starting receiver thread (from drbd0_worker [2661])
block drbd0: receiver (re)started
block drbd0: conn( Unconnected -> WFConnection )
block drbd1: conn( StandAlone -> Unconnected )
block drbd1: Starting receiver thread (from drbd1_worker [2649])
block drbd1: receiver (re)started
block drbd1: conn( Unconnected -> WFConnection )
block drbd0: Handshake successful: Agreed network protocol version 96
block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
block drbd0: conn( WFConnection -> WFReportParams )
block drbd0: Starting asender thread (from drbd0_receiver [2670])
block drbd0: data-integrity-alg: <not-used>
block drbd0: max BIO size = 4096
block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate )
block drbd1: Handshake successful: Agreed network protocol version 96
block drbd1: Peer authenticated using 20 bytes of 'sha1' HMAC
block drbd1: conn( WFConnection -> WFReportParams )
block drbd1: Starting asender thread (from drbd1_receiver [2674])
block drbd1: data-integrity-alg: <not-used>
block drbd1: drbd_sync_handshake:
block drbd1: self 70A8363B4F73C19E:0000000000000000:43AC9F762F8AF4F7:43AB9F762F8AF4F7 bits:0 flags:0
block drbd1: peer 7D727C5A8840067D:70A8363B4F73C19F:43AC9F762F8AF4F7:43AB9F762F8AF4F7 bits:0 flags:0
block drbd1: uuid_compare()=-1 by rule 50
block drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
block drbd0: role( Secondary -> Primary )
block drbd1: role( Secondary -> Primary )
DLM (built Oct 14 2013 08:10:28) installed
block drbd1: conn( WFBitMapT -> WFSyncUUID )
block drbd1: updated sync uuid 70A9363B4F73C19F:0000000000000000:43AC9F762F8AF4F7:43AB9F762F8AF4F7
block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1
block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
block drbd1: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
block drbd1: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
block drbd1: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
block drbd1: updated UUIDs 7D727C5A8840067D:0000000000000000:70A9363B4F73C19F:70A8363B4F73C19F
block drbd1: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1
block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1 exit code 0 (0x0)
block drbd1: bitmap WRITE of 7451 pages took 20 jiffies
block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
ip_tables: (C) 2000-2006 Netfilter Core Team
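
The line that stands out to me is open("/dev/sdb1") failed with -16: errno 16 is EBUSY, so something else already had /dev/sdb1 open when DRBD tried to attach it at boot. That would also explain the drbd_bm_resize called with capacity == 0 message and drbd0 coming up Diskless. These are the checks for what is holding the device (a sketch; assuming lsof and the standard LVM tools are installed):

Code:

# Which processes have the backing partition open?
lsof /dev/sdb1

# Did LVM scan the raw partition instead of the DRBD device?
# /dev/sdb1 should NOT be listed as a PV here; only /dev/drbd0 should.
pvs -o pv_name,vg_name

If LVM did grab /dev/sdb1 directly, I understand the usual safeguard is a filter in /etc/lvm/lvm.conf that rejects the raw backing partitions (something like filter = [ "r|/dev/sdb1|", "a|.*|" ]) so they cannot be claimed at boot, but I am not sure that is what is happening here.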

My DRBD configuration looks like this:

- global_common.conf
Code:

global {
  usage-count yes;
  # minor-count dialog-refresh disable-ip-verification
}

common {
  protocol C;

  handlers {
    # The following 3 handlers were disabled due to #576511.
    # Please check the DRBD manual and enable them, if they make sense in your setup.
    # pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    # pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    # local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";

    # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
    # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
    # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
  }

  startup {
    # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
    wfc-timeout 15;
    degr-wfc-timeout 15;
    become-primary-on both;
  }

  disk {
    # on-io-error fencing use-bmbv no-disk-barrier no-disk-flushes
    # no-disk-drain no-md-flushes max-bio-bvecs
  }

  net {
    # sndbuf-size rcvbuf-size timeout connect-int ping-int ping-timeout max-buffers
    # max-epoch-size ko-count allow-two-primaries cram-hmac-alg shared-secret
    # after-sb-0pri after-sb-1pri after-sb-2pri data-integrity-alg no-tcp-cork
    cram-hmac-alg sha1;
    shared-secret "my-secret";
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }

  syncer {
    # rate after al-extents use-rle cpu-mask verify-alg csums-alg
    rate 1000M;
  }
}
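
To rule out a configuration problem, the parsed config can be checked without touching the running resources (a quick sanity check; -d is drbdadm's dry-run flag):

Code:

# Print the configuration exactly as drbdadm parses it;
# syntax errors would show up here
drbdadm dump all

# Dry-run: show what drbdadm would change on the running resource
drbdadm -d adjust r0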

And the resource looks like this: r0.res
Code:

# This is the resource used for the shared LVM storage.
resource r0 {
  # This is the block device path.
  device    /dev/drbd0;

  # We'll use the normal internal metadisk (takes about 32MB/TB)
  meta-disk internal;

  # This is the `uname -n` of the first node
  on node1 {
    # The 'address' has to be the IP, not a hostname. This is the
    # node's SN (bond1) IP. The port number must be unique among
    # resources.
    address  10.0.0.12:7788;

    # This is the block device backing this resource on this node.
    disk    /dev/sdb1;
  }
  # Now the same information again for the second node.
  on node2 {
    address  10.0.0.13:7788;
    disk    /dev/sdb1;
  }
}
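
Among the things I tried was re-attaching the backing device on the node where drbd0 is Diskless (a sketch of the relevant commands; careful with the down/up variant, since it pulls /dev/drbd0 out from under LVM and any running VMs):

Code:

# On the Diskless node: try to re-attach the local backing disk.
# This fails with the same EBUSY if something still holds /dev/sdb1.
drbdadm attach r0

# Last resort: tear the resource down and bring it back up
# (only with the VMs on this storage migrated away or stopped)
drbdadm down r0
drbdadm up r0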

I have tried a lot of things, but I am out of ideas now. What happened, and why? Do you have any suggestions? Could the disks in the servers be failing?
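
To test the broken-disk theory, checking SMART data on the backing disks seems like the obvious first step (assuming smartmontools is installed; /dev/sdb matches my setup):

Code:

# Overall SMART health verdict for the backing disk
smartctl -H /dev/sdb

# SMART error log; recent read/write errors would show up here
smartctl -l error /dev/sdb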

I would be very grateful for any help or answers.

Best,
Rafal
