Proxmox Support Forum

best way to backup whole node "system" partitions (boot and lvm)?

Hi,

I would like to be able to restore a PVE node exactly as it is now, by taking a full backup (boot and LVM partitions) to an external NAS share.
I would prefer a simple method, like booting from SystemRescueCd or another live CD.
I know a few tools (partimage, fsarchiver, dd...) that are easy to use, but I am unsure about the LVM partition, as I have never used those tools on that kind of volume.

[edit]
I just noticed that, after booting SystemRescueCd, fsarchiver probe finds dm-0, dm-1 and dm-2, which are the PVE root/data/swap volumes.

I could probably back up those dm-x "partitions", but if I need to rebuild the node, how should I proceed?
Is there anything documented?
[/edit]
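
For example, from the live CD I was thinking of something along these lines (just my rough idea; the NAS mount point /mnt/nas, the boot partition /dev/sda1 and the dm-0/dm-1 root/data mapping are assumptions on my side):

Code:

# boot partition and LVM metadata
fsarchiver savefs /mnt/nas/pve-boot.fsa /dev/sda1
vgcfgbackup -f /mnt/nas/pve-vg.cfg pve
# the logical volumes as seen through device-mapper (dm-0 = root, dm-1 = data here)
fsarchiver savefs /mnt/nas/pve-root.fsa /dev/dm-0
dd if=/dev/dm-1 bs=1M | gzip > /mnt/nas/pve-data.img.gz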

Any suggestions?
Thanks

Marco

Cannot start VM with vRAM >128G

The VM starts fine with 32, 64 or 96 GB of vRAM, but not with 128 GB or more. Here is the output I'm getting:

Code:

TASK ERROR: start failed: command '/usr/bin/kvm -id 100 -chardev 'socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -vnc unix:/var/run/qemu-server/100.vnc,x509,password -pidfile /var/run/qemu-server/100.pid -daemonize -smbios 'type=1,uuid=68782ec6-2866-4bfb-88e5-2e4fca313166' -name geostorage -smp '12,sockets=1,cores=12,maxcpus=12' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000' -vga cirrus -cpu kvm64,+lahf_lm,+x2apic,+sep -m 131072 -k en-us -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'vfio-pci,host=03:00.0,id=hostpci0,bus=pci.0,addr=0x10' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:f83ee4efa04' -drive 'if=none,id=drive-ide2,media=cdrom,aio=native' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -drive 'file=/var/lib/vz/images/100/vm-100-disk-1.qcow2,if=none,id=drive-virtio0,format=qcow2,aio=native,cache=none,detect-zeroes=on' -device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0xa,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=76:FE:6D:A4:B0:92,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300'' failed: got timeout
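
The first thing I'm checking on the host before digging deeper (just a sanity check on my side) is whether there is actually enough free memory for a 128 GB guest and whether huge pages are involved:

Code:

free -g
grep -i huge /proc/meminfo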

VMBR1 Dropping Internet Connection

I added another NIC so I can give a Windows VM a "floating" subnet: I change the IP address inside Windows to reach physical machines on different subnets, instead of reconfiguring Proxmox and rebooting each time.

While downloading updates for the new VM, the Internet connection dropped and wouldn't come back until I rebooted. I tried to download the updates again, and the same thing happened. I then just browsed the Internet for a while, and about 10 minutes later the connection dropped again.

My /etc/network/interfaces looks like this:

Code:

# network interface settings
auto lo
iface lo inet loopback

iface eth0 inet manual

iface eth1 inet manual

iface eth2 inet manual

auto eth3
iface eth3 inet manual

auto bond0
iface bond0 inet static
        address  192.168.3.2
        netmask  255.255.255.0
        slaves eth1 eth2
        bond_miimon 100
        bond_mode 802.3ad

auto vmbr0
iface vmbr0 inet static
        address  192.168.2.2
        netmask  255.255.255.0
        gateway  192.168.1.1
        bridge_ports eth0
        bridge_stp off
        bridge_fd 0

auto vmbr1
iface vmbr1 inet manual
        bridge_ports eth3
        bridge_stp off
        bridge_fd 0
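
The next time the connection drops, this is roughly what I plan to capture before rebooting (just my own checklist; interface names taken from the config above):

Code:

# state of the uplink NICs and bridges at the moment of the drop
ip -s link show eth0
ip -s link show eth3
brctl show
# recent kernel messages (NIC resets, bridge/bond events)
dmesg | tail -n 50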

proxmox asus eeebox j1900 installation aborted

I apologize if this is not the proper place to post this, but I need some help installing Proxmox v3.4.

Scenario:
1) PC: EeeBox EB1036-B0534, Intel J1900
2) Installing from a USB flash drive

Problem:
The installation starts normally, and after detecting the network (which is marked as done after a couple of seconds), the next line is:

\nInstallation aborted - unable to continue ....

There is no log under /tmp or any other information. The installer also shows an error earlier in the process: ERROR could not insert 'video': unknown symbol in module...

However, if I install in debug mode, pass the vga=normal parameter and then hit Ctrl-D, it seems to get past the video problem (I'm not really sure; I assume so because it continues with "starting hotplug events" and two more lines with green OKs), but after detecting the network, as mentioned above, it gives me the same "Installation aborted" message quoted above.

any help?

[HOW-TO] Separate migration network - dirty fix

Hi all,

On the forum I have seen several topics asking how to make a Proxmox cluster use a different network for migration traffic.
For all my clusters I currently make a code change in QemuServer.pm to change the listening IP of the migration task (unfortunately hard-coded).

Currently I have 5 interfaces per hypervisor:
  • NIC1 and NIC2 (2 x Gigabit Ethernet) are configured as a bond for public network traffic only - no IP address configured - connected to two different gigabit switches.
  • NIC3 (1 x Gigabit Ethernet) is used for management traffic, internal only - IP address: 10.0.10.XX - connected to a gigabit switch.
  • NIC4 and NIC5 (2 x 10 Gigabit Ethernet) are configured as a bond for storage traffic only (DRBD, NFS, etc.) - IP address: 10.0.7.XX - connected to a 10G switch.


I know this is not the neatest way. Please note that if you update Proxmox to a newer version, the change will be lost.

HOW-TO:
  • First check the IP on the network you would like to use as the migration network. In my case I also use the storage network for live migration (IP 10.0.7.48 in this example).
  • Open the following file in your favourite file editor: /usr/share/perl5/PVE/QemuServer.pm
  • Search in the file for the variable: $migrate_uri
    The result will show the following: $migrate_uri = "tcp:${localip}:${migrate_port}";
  • By replacing "${localip}" with the IP you would like to use, Proxmox will be forced to listen on that IP address every time it receives a migration request (see the sed sketch after these steps).
    The result after the replacement (in my example, using IP 10.0.7.48) will be: $migrate_uri = "tcp:10.0.7.48:${migrate_port}";
  • Save the file and make sure it is saved by running the command:
    root@hypervisor48:~# cat /usr/share/perl5/PVE/QemuServer.pm | grep migrate_uri
    my $migrate_uri;
    $migrate_uri = "tcp:10.0.7.48:${migrate_port}";
    push @$cmd, '-incoming', $migrate_uri;
    print "migration listens on $migrate_uri\n" if $migrate_uri;

  • Restart the pvedaemon on the server by typing the following:
    root@hypervisor48:~# service pvedaemon restart
    Restarting PVE Daemon: pvedaemon.

  • Repeat the above steps on every node that should listen on that migration network.
  • Test the change by live-migrating a VM. You can see whether it worked from the line: starting online/live migration on 10.0.7.48:PORT
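
To apply the replacement without opening an editor, something like this should work on each node (a rough sketch of how I would script it; back up the file, double-check the result, and adjust the IP of course):

Code:

cp /usr/share/perl5/PVE/QemuServer.pm /usr/share/perl5/PVE/QemuServer.pm.bak
sed -i 's/tcp:\${localip}:/tcp:10.0.7.48:/' /usr/share/perl5/PVE/QemuServer.pm
grep migrate_uri /usr/share/perl5/PVE/QemuServer.pm
service pvedaemon restart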


If you have any questions, please do not hesitate to contact me.

[SOLVED] HELP! Lost root password!!!

Hello. I was smart enough to lose my root password for Proxmox. I've already tried to boot a Debian live CD and mount one of the drives, but that didn't work, probably because I'm using a ZFS volume in RAID 1. I've also tried to modify GRUB by adding init=/bin/bash so it boots into single-user mode. That doesn't work either, because I can't type the '=' sign: the GRUB prompt assumes an American keyboard layout and I have a Danish one. So is there anything else I can do other than reinstalling?
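
For reference, this is what I think I would have to do from a live environment that has ZFS support (just my guess; I'm assuming the default Proxmox pool name rpool and the root dataset rpool/ROOT/pve-1):

Code:

zpool import -f -R /mnt rpool
# if the root dataset is not mounted automatically:
zfs mount rpool/ROOT/pve-1
chroot /mnt /bin/bash
passwd root
exit
zpool export rpool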

Thank you in advance :D

IP of the VM or VZ in Datacenter config

Can I configure the IP addresses of the KVM and OpenVZ guests directly in Proxmox?
On my server (Proxmox) I have this /etc/network/interfaces:

Code:

auto lo
iface lo inet loopback


# device: eth0
auto  eth0
iface eth0 inet static
  address  176.9.18.135
  broadcast 176.9.18.159
  netmask  255.255.255.224
  gateway  176.9.18.129
  # default route to access subnet
  up route add -net 176.9.18.128 netmask 255.255.255.224 gw 176.9.18.129 eth0


iface eth0 inet6 static
  address 2a01:4f8:150:128d::2
  netmask 64
  gateway fe80::1


auto vmbr0
iface vmbr0 inet static
        address  176.9.204.54
        netmask  255.255.255.248
        gateway  176.9.18.135
        bridge_ports none
        bridge_stp off
        bridge_fd 0
        up ip route add 176.9.204.48/29 dev vmbr0

The virtual machines use IPs from the subnet 176.9.204.48/29.
But if I move to another server with different IP addresses, do I have to reconfigure each virtual machine?
I want to manage the VM IPs directly from Proxmox; is that possible?

Today I am using the no-subscription repository.
I want to activate a subscription if I find the right product for me.

Is there a tutorial for configuring the firewall from the web interface?

Thanks.

Very high load of the node

Hello,
Yesterday I upgraded from PVE 3.2 to 3.4, and today I have big problems. The load of the node suddenly grew at 12:00 and I can't find the cause. Some guests hang with the messages shown in the attached images (gw.png, load1.png, load2.png).

Code:

# iotop -d 10 -P
Total DISK READ:      21.17 K/s | Total DISK WRITE:      2.82 M/s
  PID  PRIO  USER    DISK READ  DISK WRITE  SWAPIN    IO>    COMMAND
10287 be/4 root        0.00 B/s    0.00 B/s  0.00 % 13.75 % kvm -id 104
 8691 be/4 root        6.33 K/s    8.61 K/s  0.00 % 13.54 % kvm -id 140
 9633 be/4 root        0.00 B/s    0.00 B/s  0.00 % 12.79 % kvm -id 111
10059 be/4 root        0.00 B/s    0.00 B/s  0.00 % 10.82 % kvm -id 156
 8895 be/4 root        0.00 B/s    0.00 B/s  0.00 %  7.17 % kvm -id 117
 9178 be/4 root        0.00 B/s    0.00 B/s  0.00 %  5.01 % kvm -id 119
 9277 be/4 root      405.13 B/s  44.31 K/s  0.00 %  2.08 % kvm -id 108
 7534 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.27 % [txg_sync]
10858 be/4 root      229.47 K/s  752.69 K/s  0.00 %  0.02 % kvm -id 113
12155 be/4 root      30.46 K/s  329.96 K/s  0.00 %  0.01 % kvm -id 116
 8423 be/4 root      810.26 B/s  25.72 K/s  0.00 %  0.01 % kvm -id 112
10481 be/4 root        0.00 B/s    4.35 K/s  0.00 %  0.01 % kvm -id 106
 1083 be/3 root        0.00 B/s 1215.38 B/s  0.00 %  0.00 % [jbd2/dm-0-8]
 2554 be/3 root        0.00 B/s 1620.51 B/s  0.00 %  0.00 % [jbd2/sda4-8]
10076 be/4 root        0.00 B/s    9.50 K/s  0.00 %  0.00 % kvm -id 110
30156 be/4 root        0.00 B/s  11.87 K/s  0.00 %  0.00 % kvm -id 109
29999 be/4 root      405.13 B/s  130.16 K/s  0.00 %  0.00 % kvm -id 153
 9801 be/4 root        0.00 B/s    4.75 K/s  0.00 %  0.00 % kvm -id 131
65437 be/4 root        0.00 B/s    2.77 K/s  0.00 %  0.00 % kvm -id 124
64509 be/4 root        0.00 B/s    3.17 K/s  0.00 %  0.00 % kvm -id 102
11121 be/4 root      810.26 B/s    5.54 K/s  0.00 %  0.00 % kvm -id 130
 9751 be/4 root        0.00 B/s    2.77 K/s  0.00 %  0.00 % kvm -id 129
11169 be/4 root      405.13 B/s  405.13 B/s  0.00 %  0.00 % kvm -id 127

Code:

# zpool iostat -v 10
...
              capacity    operations    bandwidth
pool        alloc  free  read  write  read  write
----------  -----  -----  -----  -----  -----  -----
pool2      2.20T  799G      0    66  32.7K  3.09M
  pve-csv2  2.20T  799G      0    66  32.7K  3.09M
cache          -      -      -      -      -      -
  sdb      55.9G  7.62M      0      1  21.0K  256K
----------  -----  -----  -----  -----  -----  -----

Code:

# iostat -d -x 10
...
Device:        rrqm/s  wrqm/s    r/s    w/s    rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
sda              0.00    1.30    1.10  237.30    5.40  3124.20    26.26    0.02    0.07    6.36    0.04  0.06  1.33
sdc              0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  0.00  0.00
sdb              0.00    0.00    2.10    0.00    75.75    0.00    72.14    0.00    0.67    0.67    0.00  0.67  0.14
dm-0              0.00    0.00    0.20    1.70    1.20    13.60    15.58    0.00    1.05  10.00    0.00  1.05  0.20
dm-1              0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  0.00  0.00
dm-2              0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  0.00  0.00
dm-3              0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  0.00  0.00
dm-4              0.00    0.00    0.10  98.30    0.60  2755.00    56.01    0.00    0.05    7.00    0.04  0.05  0.45

Code:

# top
top - 16:11:27 up 1 day,  2:07,  2 users,  load average: 4.94, 5.70, 6.35
Tasks: 1087 total,  1 running, 1086 sleeping,  0 stopped,  0 zombie
%Cpu(s):  2.5 us,  1.4 sy,  0.0 ni, 95.7 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem:    128840 total,    76732 used,    52107 free,      95 buffers
MiB Swap:    65535 total,        0 used,    65535 free,    4892 cached
 
  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 12155 root      20  0 5157m 3.5g 4116 S    16  2.8  12:09.68 kvm
 29999 root      20  0 9231m 7.8g 3900 S    15  6.2  49:53.35 kvm
  9801 root      20  0 4852m 4.1g 3960 S    15  3.3 465:40.06 kvm
 64509 root      20  0 10.8g  10g 3972 S    10  8.0 245:27.25 kvm
 11169 root      20  0 1406m 1.0g 3772 S    8  0.8 114:50.26 kvm
  8423 root      20  0 3676m 3.1g 3808 S    6  2.5 113:30.77 kvm
 10858 root      20  0 9313m 5.2g 3788 S    5  4.2  89:14.78 kvm

Code:

# pveversion --verbose
proxmox-ve-2.6.32: 3.3-147 (running kernel: 3.10.0-1-pve)
pve-manager: 3.4-1 (running version: 3.4-1/3f2d890e)
pve-kernel-3.10.0-1-pve: 3.10.0-5
pve-kernel-2.6.32-28-pve: 2.6.32-124
pve-kernel-2.6.32-37-pve: 2.6.32-147
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-2
pve-cluster: 3.0-16
qemu-server: 3.3-20
pve-firmware: 1.1-3
libpve-common-perl: 3.0-24
libpve-access-control: 3.0-16
libpve-storage-perl: 3.0-31
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.1-12
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1

Code:

# pveperf
CPU BOGOMIPS:      110201.04
REGEX/SECOND:      921999
HD SIZE:          62.87 GB (/dev/mapper/pve-root)
BUFFERED READS:    499.85 MB/sec
AVERAGE SEEK TIME: 9.07 ms
FSYNCS/SECOND:    4271.21

Code:

# pveperf /pool2/VMs/images/
CPU BOGOMIPS:      110201.04
REGEX/SECOND:      942748
HD SIZE:          2970.82 GB (pool2/VMs)
FSYNCS/SECOND:    4683.57

Code:

# pveperf /mnt/sda4/images/
CPU BOGOMIPS:      110201.04
REGEX/SECOND:      923666
HD SIZE:          3023.67 GB (/dev/sda4)
BUFFERED READS:    349.94 MB/sec
AVERAGE SEEK TIME: 9.98 ms
FSYNCS/SECOND:    2474.56

Code:

# qm list | grep -v stopped
      VMID NAME                STATUS    MEM(MB)    BOOTDISK(GB) PID
      102 server102            running    10240            50.00 64509
      104 server104            running    512                2.00 10287
      106 server106            running    2048            300.00 10481
      108 server108            running    2048              4.00 9277
      109 server109            running    4096              50.00 30156
      110 server110            running    1024            150.00 100765
      111 server111            running    2048            100.00 9633
      112 server112            running    3072              32.00 8423
      113 server113            running    8192              48.00 10858
      115 server115            running    1024              50.00 10631
      116 server116            running    4096              32.00 12155
      117 server117            running    2048              4.00 8895
      119 server119            running    1024              16.00 9178
      122 server122            running    1024              50.00 10779
      124 server124            running    3072              40.00 65437
      127 server127            running    1024              30.00 11169
      128 server128            running    2048              8.00 11283
      129 server129            running    4096              50.00 9751
      130 server130            running    1024              50.00 11121
      131 server131            running    4096              40.00 9801
      140 server140            running    1024            150.00 8691
      153 server153            running    8192            150.00 29999
      156 server156            running    2048            150.00 10059
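
Since the VM images are on ZFS (pool2) and [txg_sync] shows up in iotop, one more thing I'm checking (just my own idea, not sure it is related) is the ARC size versus its configured limit:

Code:

grep -E "^(size|c|c_max) " /proc/spl/kstat/zfs/arcstats
cat /sys/module/zfs/parameters/zfs_arc_max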

Thanks.

VNC console error

Hi.
I'm experiencing a lot of problems with the VNC console.
If I try to open the VNC console of a KVM virtual machine, I get this error:

Failed to connect to server (code: 1006).

This happens on every PVE node, from Chrome, Safari and Firefox, and on all of my virtual machines.
The virtual machine is running, of course.

From the tasks log I see the following:

Code:

TASK ERROR: command '/bin/nc -l -p 5900 -w 10 -c '/usr/bin/ssh -T -o BatchMode=yes 192.168.60.1 /usr/sbin/qm vncproxy 101 2>/dev/null'' failed: exit code 255


If I execute this command from the console and then telnet to port 5900 of the node, the connection works:

Code:

root@node1:~# /bin/nc -l -p 5900 -w 10 -c '/usr/bin/ssh -T -o BatchMode=yes 192.168.60.1 /usr/sbin/qm vncproxy 101'
MyClient:~ mattia$ telnet 192.168.60.1 5900
Trying 192.168.60.1...
Connected to 192.168.60.1.
Escape character is '^]'.
RFB 003.008
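
What I plan to try next (my own idea) is running just the SSH part of that command, to see whether it prints an error that nc swallows:

Code:

/usr/bin/ssh -T -o BatchMode=yes 192.168.60.1 /usr/sbin/qm vncproxy 101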

My PVE cluster is updated:

Code:

root@node1:~# pveversion -v
proxmox-ve-2.6.32: 3.4-150 (running kernel: 2.6.32-37-pve)
pve-manager: 3.4-3 (running version: 3.4-3/2fc72fee)
pve-kernel-2.6.32-37-pve: 2.6.32-150
pve-kernel-2.6.32-26-pve: 2.6.32-114
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-2
pve-cluster: 3.0-16
qemu-server: 3.4-3
pve-firmware: 1.1-4
libpve-common-perl: 3.0-24
libpve-access-control: 3.0-16
libpve-storage-perl: 3.0-32
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.2-8
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1

Could you help me please?

No Quorum after Update

Hello,

After updating my two Proxmox nodes, I restarted the first one. Now I want to move the VMs to the first node and restart the second one.

Unfortunately, quorum is lost and I cannot move any VMs.

I have tried restarting the services pvedaemon, pvestatd, pveproxy and pve-cluster.

Is it OK if I just set the expected votes to 1 and then migrate the VMs? Will that work?
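
If that is acceptable, I assume the command would be something like this (please correct me if I'm wrong):

Code:

pvecm expected 1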

Here are some outputs from the two nodes that might help:

Node1
Code:

root@groemer01 ~ $ pvecm node
Node Sts Inc Joined Name
1 M 1464 2015-03-23 08:12:29 groemer01
2 X 1468 groemer02
root@groemer01 ~ $ pvecm status
Version: 6.2.0
Config Version: 22
Cluster Name: tsccluster
Cluster Id: 49332
Cluster Member: Yes
Cluster Generation: 1472
Membership state: Cluster-Member
Nodes: 1
Expected votes: 2
Total votes: 1
Node votes: 1
Quorum: 2 Activity blocked
Active subsystems: 6
Flags:
Ports Bound: 0 177
Node name: groemer01
Node ID: 1
Multicast addresses: x.x.x.x
Node addresses: x.x.x.x
root@groemer01 ~ $ cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster config_version="22" name="tsccluster">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
<fencedevices>
<fencedevice agent="fence_ipmilan" ipaddr="x.x.x.x" lanplus="1" login="ADMIN" name="ipmi1" passwd="xxx" power_wait="5"/>
<fencedevice agent="fence_ipmilan" ipaddr="x.x.x.x" lanplus="1" login="ADMIN" name="ipmi2" passwd="xxx" p ower_wait="5"/>
</fencedevices>
<clusternodes>
<clusternode name="groemer01" nodeid="1" votes="1">
<fence>
<method name="1">
<device name="ipmi1"/>
</method>
</fence>
</clusternode>
<clusternode name="groemer02" nodeid="2" votes="1">
<fence>
<method name="1">
<device name="ipmi2"/>
</method>
</fence>
</clusternode>
</clusternodes>
<rm>
<pvevm autostart="1" vmid="106"/>
</rm>
</cluster>
root@groemer01 ~ $ cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="22" name="tsccluster">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
<fencedevices>
<fencedevice agent="fence_ipmilan" ipaddr="x.x.x.x" lanplus="1" login="ADMIN" name="ipmi1" passwd="xxx" power_wait="5"/>
<fencedevice agent="fence_ipmilan" ipaddr="x.x.x.x" lanplus="1" login="ADMIN" name="ipmi2" passwd="xxx" p ower_wait="5"/>
</fencedevices>
<clusternodes>
<clusternode name="groemer01" nodeid="1" votes="1">
<fence>
<method name="1">
<device name="ipmi1"/>
</method>
</fence>
</clusternode>
<clusternode name="groemer02" nodeid="2" votes="1">
<fence>
<method name="1">
<device name="ipmi2"/>
</method>
</fence>
</clusternode>
</clusternodes>
<rm>
<pvevm autostart="1" vmid="106"/>
</rm>
</cluster>

Node2
Code:

root@groemer02:~# pvecm status
Version: 6.2.0
Config Version: 22
Cluster Name: tsccluster
Cluster Id: 49332
Cluster Member: Yes
Cluster Generation: 1472
Membership state: Cluster-Member
Nodes: 1
Expected votes: 2
Total votes: 1
Node votes: 1
Quorum: 2 Activity blocked
Active subsystems: 6
Flags:
Ports Bound: 0
Node name: groemer02
Node ID: 2
Multicast addresses: x.x.x.x
Node addresses: x.x.x.x
root@groemer02:~# cat /etc/pve/cluster.conf
<?xml version="1.0"?>
<cluster config_version="22" name="tsccluster">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
<fencedevices>
<fencedevice agent="fence_ipmilan" ipaddr="x.x.x.x" lanplus="1" login="ADMIN" name="ipmi1" passwd="xxx" power_wait="5"/>
<fencedevice agent="fence_ipmilan" ipaddr="x.x.x.x" lanplus="1" login="ADMIN" name="ipmi2" passwd="xxx" power_wait="5"/>
</fencedevices>
<clusternodes>
<clusternode name="groemer01" nodeid="1" votes="1">
<fence>
<method name="1">
<device name="ipmi1"/>
</method>
</fence>
</clusternode>
<clusternode name="groemer02" nodeid="2" votes="1">
<fence>
<method name="1">
<device name="ipmi2"/>
</method>
</fence>
</clusternode>
</clusternodes>
<rm>
<pvevm autostart="1" vmid="106"/>
</rm>
</cluster>
root@groemer02:~# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="22" name="tsccluster">
<cman keyfile="/var/lib/pve-cluster/corosync.authkey"/>
<fencedevices>
<fencedevice agent="fence_ipmilan" ipaddr="x.x.x.x" lanplus="1" login="ADMIN" name="ipmi1" passwd="xxx" power_wait="5"/>
<fencedevice agent="fence_ipmilan" ipaddr="x.x.x.x" lanplus="1" login="ADMIN" name="ipmi2" passwd="xxx" power_wait="5"/>
</fencedevices>
<clusternodes>
<clusternode name="groemer01" nodeid="1" votes="1">
<fence>
<method name="1">
<device name="ipmi1"/>
</method>
</fence>
</clusternode>
<clusternode name="groemer02" nodeid="2" votes="1">
<fence>
<method name="1">
<device name="ipmi2"/>
</method>
</fence>
</clusternode>
</clusternodes>
<rm>
<pvevm autostart="1" vmid="106"/>
</rm>
</cluster>

omping works as well.

Ceph - Bad performance with small IO

Hello everyone,

first of all I want to say thank you to each and every one in this community!
I've been a long-time reader (and user of PVE) and have gotten a lot of valuable information from this forum!

Right now the deployment of the Ceph cluster is giving me some trouble.
We were using DRBD, but since we are expanding and there are more nodes in the PVE cluster, we decided to switch to Ceph.

The 3 Ceph server nodes are connected via a 6x GbE LACP bond with jumbo frames over two stacked switches, and the Ceph traffic is on a separate VLAN.
Currently there are 9 OSDs (3x 15K SAS with BBWC per host).
The journal is 10 GB per OSD, on LVM volumes of an SSD RAID 1.
pg_num and pgp_num are set to 512 for the pool.
Replication is 3 and the CRUSH-Map is configured to distribute the requests over the 3 hosts.

The performance of the rados benchmarks is good:
rados -p test bench 60 write -t 8 --no-cleanup
Code:

Total time run:        60.187142
Total writes made:      1689
Write size:            4194304
Bandwidth (MB/sec):    112.250

Stddev Bandwidth:      48.3496
Max bandwidth (MB/sec): 176
Min bandwidth (MB/sec): 0
Average Latency:        0.28505
Stddev Latency:        0.236462
Max latency:            1.91126
Min latency:            0.053685

rados -p test bench 60 seq -t 8
Code:

Total time run:        30.164931
Total reads made:      1689
Read size:            4194304
Bandwidth (MB/sec):    223.969

Average Latency:      0.142613
Max latency:          2.78286
Min latency:          0.003772

rados -p test bench 60 rand -t 8
Code:

Total time run:        60.287489
Total reads made:      4524
Read size:            4194304
Bandwidth (MB/sec):    300.162

Average Latency:      0.106474
Max latency:          0.768564
Min latency:          0.003791

What makes me wonder are the "Min bandwidth (MB/sec): 0" and "Max latency: 1.91126" values in the write benchmark.

I've modified the Linux autotuning TCP buffer limits and the rx/tx ring parameters of the network cards (all Intel), which increased the bandwidth, but didn't help with the latency of small IO.
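
For reference, these are roughly the kinds of changes I made (my own values; adjust to your hardware, and the interface name eth0 is just a placeholder):

Code:

# larger rx/tx rings on the Intel NICs
ethtool -G eth0 rx 4096 tx 4096
# wider TCP autotuning buffer limits
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"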

For example in a wheezy-kvm-guest:
Code:

dd if=/dev/zero of=/tmp/test bs=512 count=1000 oflag=direct,dsync
512000 Bytes (512 kB) kopiert, 9,99445 s, 51,2 kB/s

dd if=/dev/zero of=/tmp/test bs=4k count=1000 oflag=direct,dsync
4096000 Bytes (4,1 MB) kopiert, 10,0949 s, 406 kB/s

I also put flashcache in front of the OSDs, but that didn't help much, and since there is 1 GB of cache on the RAID controller in front of the OSDs, I wonder why this is so slow in the guests.
Compared to the raw performance of the SSDs and the OSDs this is really bad...
Code:

dd if=/dev/zero of=/var/lib/ceph/osd/ceph-2/test bs=512 count=1000 oflag=direct,dsync
512000 Bytes (512 kB) kopiert, 0,120224 s, 4,3 MB/s

dd if=/dev/zero of=/var/lib/ceph/osd/ceph-2/test bs=4k count=1000 oflag=direct,dsync
4096000 Bytes (4,1 MB) kopiert, 0,137924 s, 29,7 MB/s


dd if=/dev/zero of=/mnt/ssd-test/test bs=512 count=1000 oflag=direct,dsync
512000 Bytes (512 kB) kopiert, 0,147097 s, 3,5 MB/s

dd if=/dev/zero of=/mnt/ssd-test/test bs=4k count=1000 oflag=direct,dsync
4096000 Bytes (4,1 MB) kopiert, 0,235434 s, 17,4 MB/s

Running fio from a node directly via rbd gives expected results, but also with some serious deviations:
Code:

rbd_iodepth32: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=32
fio-2.2.3-1-gaad9
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/13271KB/0KB /s] [0/3317/0 iops] [eta 00m:00s]
rbd_iodepth32: (groupid=0, jobs=1): err= 0: pid=849098: Mon Mar 23 20:08:25 2015
  write: io=2048.0MB, bw=12955KB/s, iops=3238, runt=161874msec
    slat (usec): min=37, max=27268, avg=222.48, stdev=326.17
    clat (usec): min=13, max=544666, avg=7937.85, stdev=11891.77
    lat (msec): min=1, max=544, avg= 8.16, stdev=11.88

Thanks for reading so far :-)
I know this is my first post, but I have really run out of options here and would really appreciate your help.

My questions are:
Why is the performance in the guests so much worse?
What can we do to enhance this for Linux as well as Windows guests?

Thanks for reading this big post. I hope we can have a nice discussion with a good outcome for everyone, since, from my point of view, this is a common issue for quite a few users.

ProxMox 3.4 on Blade HP C7000 with fiber Storage

Hi,
I have an HP BladeSystem C7000 with 16 bays (every bay has 8 GB RAM, 2x quad-core CPUs and 1x 72 GB 10k HDD).
In the first two bays I have a RAID controller; those bays are connected via optical fiber to a storage array (14x 10k HDDs, 2x fiber outputs A and B).

In bays 3-15 I run Proxmox 3.4 (ZFS RAID 0).

I have 2 questions:

1) What OS should I use on the first 2 bays so that the other bays (3-16, with Proxmox) would recognise them as storage?

2) Should I set up a cluster, HA or Ceph? Do you have a good setup to recommend?

Proxmox 3.4, OVS and Open VZ - 2 nics, 2 subnets

Hello,

I tried a while ago to get this working with Proxmox 3.3 but could never get things working right. I've just installed 3.4 and was hoping someone had a good example of using OVS and OpenVZ with two NICs.

I have two NICs, each with its own subnet and gateway. I would like to be able to connect to the GUI/SSH on the host, and I don't want the two subnets to have access to each other, as that will be handled by the upstream firewall.

Would it be best to have an additional NIC for the GUI/SSH? How do I properly set up OVS for this type of configuration? Any help would be appreciated, as I'd like to migrate to Proxmox soon.
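
This is roughly the style of /etc/network/interfaces I had in mind (just a sketch; interface names and addresses are made up, and I'm not sure this is the right way to combine the two uplinks):

Code:

allow-vmbr0 eth0
iface eth0 inet manual
        ovs_bridge vmbr0
        ovs_type OVSPort

auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        gateway 192.168.1.1
        ovs_type OVSBridge
        ovs_ports eth0

allow-vmbr1 eth1
iface eth1 inet manual
        ovs_bridge vmbr1
        ovs_type OVSPort

auto vmbr1
iface vmbr1 inet manual
        ovs_type OVSBridge
        ovs_ports eth1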

Thanks,
Mike

Turnkey OpenVPN in a container, can't make it work.

I hope someone can give me a hand with this.
I installed the TurnKey OpenVPN appliance in a container, configured as a server with bridged networking. I can access the web manager and SSH, but whenever I start the container, port 1194 appears closed to my requests (before, it was open but not responding). Nothing I try makes it available.

Can someone please give me a hand, or point me to a good guide on installing OpenVPN in Proxmox?
Thanks in advance.

ZFS over iSCSI doesn't work

Hi,

I'm trying to set up a ZFS over iSCSI configuration on Proxmox 3.4.
I created the ZFS pool on the server (Ubuntu 12.04) and the iSCSI target (IET).
Config files:
IET:
Code:

Target iqn.2014-10.proxmoxhoz.server:proxmoxzfs01
    Lun 0 Path=/dev/zvol/zfspool01,Type=blockio

I'm sure this is not 100% correct, because:
Code:

iscsi_trgt: blockio_open_path(167) Can't open device /dev/zvol/zfspool01, error -15
kernel: [61709.931807] iscsi_trgt: blockio_attach(294) Error attaching Lun 0 to Target iqn.2014-10.proxmoxhoz.server:proxmoxzfs01
ietd: unable to create logical unit 0 in target 3: 15

Anyway, I only noticed this error later, while trying to find out what the problem could be.

Before that I tried to create a KVM machine.
storage.cfg
Code:

zfs: ZFS01
        blocksize 4k
        target iqn.2014-10.proxmoxhoz.server:proxmoxzfs01
        pool zfspool01
        iscsiprovider iet
        portal 172.18.99.199
        content images
        nowritecache

So, when I want to create a new KVM machine, it dies with this error message:
Code:

TASK ERROR: create failed - 137: Parse error [    Lun 0 Path=/zfspool01] at /usr/share/perl5/PVE/Storage/LunCmd/Iet.pm line 175.
or (mostly)
TASK ERROR: create failed - 138: Parse error [  ] at /usr/share/perl5/PVE/Storage/LunCmd/Iet.pm line 189.

When I check my ZFS pool, the disk (zvol) is created.

Do you have any idea what could cause the problem?

Thanks, Robert

The difference between Quorum and fencing

Hello,

Can you please explain the difference between quorum and fencing?
Can I configure a Proxmox V3.4 cluster with 2 nodes (HA) and a quorum VM, without fencing?
To explain my configuration:
  • I have configured a Proxmox cluster with 2 nodes + one quorum VM.
  • When I try to simulate a network failure with "service networking stop", just to verify that my VM migrates to the other node, I get this error: fence xxxxxx dev 0.0 agent none result: error no method
  • I have only enabled fencing in /etc/default/redhat-cluster-pve.


If fencing must be configured to ensure HA, can I use the same network interface for fencing and for accessing the node?

Thanks for your help ;)

Conversion of vmdk to qcow2. Is this correct?

Hello,

I am trying to convert a VMDK disk file from VMware to Proxmox.
I used the details from this link (https://pve.proxmox.com/wiki/Migrati...x_VE_.28KVM.29), however I am not sure whether the conversion was correct.
I used the command: qemu-img convert -f vmdk original.vmdk -O qcow2 vm-108-disk-1.qcow2

After the conversion was done, I ran qemu-img info and it shows:

image: vm-108-disk-1.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 3.3G
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false


Also, the file command gives:
qcow2 vm-108-disk-1.qcow2: QEMU QCOW Image (unknown version)
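
One extra check I thought of running (just my own idea) is the built-in consistency check:

Code:

qemu-img check vm-108-disk-1.qcow2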


Do those seem to be OK or not?

Any help is welcome! :rolleyes:

Ceph - OSD "problem" - the same OSD ID on two nodes

Hi to all!

I have a three-node cluster with Ceph storage, and it works perfectly now.
There are three servers, and every server has 8 hard disks: 1 for Proxmox, 7 for Ceph storage.
Now my problem: normally an OSD ID is unique across the whole three-node cluster. In the past I replaced some hard drives and reformatted them, and now there are two OSDs with the same OSD ID, but only the one on pve2 is shown in the Proxmox GUI.
On pve1 and pve2 there is the same OSD ID 9:

Code:

root@pve1:~# df -h
Filesystem                    Size  Used Avail Use% Mounted on
udev                            10M    0  10M  0% /dev
tmpfs                          1.4G  492K  1.4G  1% /run
/dev/mapper/pve-root            34G  1.5G  31G  5% /
tmpfs                          5.0M    0  5.0M  0% /run/lock
tmpfs                          2.8G  59M  2.7G  3% /run/shm
/dev/mapper/pve-data            73G  180M  73G  1% /var/lib/vz
/dev/fuse                      30M  24K  30M  1% /etc/pve
/dev/cciss/c0d6p1              132G  48G  85G  36% /var/lib/ceph/osd/ceph-9
/dev/cciss/c0d2p1              132G  30G  103G  23% /var/lib/ceph/osd/ceph-1
/dev/cciss/c0d4p1              132G  25G  108G  19% /var/lib/ceph/osd/ceph-7
/dev/cciss/c0d5p1              132G  21G  112G  16% /var/lib/ceph/osd/ceph-8
/dev/cciss/c0d7p1              132G  29G  104G  22% /var/lib/ceph/osd/ceph-10
/dev/cciss/c0d3p1              132G  24G  108G  19% /var/lib/ceph/osd/ceph-2
/dev/cciss/c0d1p1              132G  23G  109G  18% /var/lib/ceph/osd/ceph-0


root@pve2:~# df -h
Filesystem                    Size  Used Avail Use% Mounted on
udev                            10M    0  10M  0% /dev
tmpfs                        1000M  492K  999M  1% /run
/dev/mapper/pve-root            34G  2.0G  30G  7% /
tmpfs                          5.0M    0  5.0M  0% /run/lock
tmpfs                          2.0G  59M  1.9G  3% /run/shm
/dev/mapper/pve-data            77G  180M  77G  1% /var/lib/vz
/dev/fuse                      30M  24K  30M  1% /etc/pve
/dev/cciss/c0d5p1              132G  30G  102G  23% /var/lib/ceph/osd/ceph-12
/dev/cciss/c0d3p1              132G  22G  111G  17% /var/lib/ceph/osd/ceph-6
/dev/cciss/c0d6p1              132G  18G  115G  13% /var/lib/ceph/osd/ceph-9
/dev/cciss/c0d7p1              132G  20G  112G  16% /var/lib/ceph/osd/ceph-13
/dev/cciss/c0d2p1              132G  20G  112G  15% /var/lib/ceph/osd/ceph-4
/dev/cciss/c0d4p1              132G  23G  109G  18% /var/lib/ceph/osd/ceph-11
/dev/cciss/c0d1p1              132G  19G  114G  14% /var/lib/ceph/osd/ceph-3


How can I fix this problem so I can use the hard disk on pve1 (OSD ID 9)?

The command (on pve1) "pveceph destroyosd 9" doesn't work:

Code:

root@pve1:~# pveceph destroyosd 9
osd is in use (in == 1)
root@pve1:~#
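
My guess (please correct me if this is wrong) is that I first have to mark the OSD as out and stop it before destroying it, something like:

Code:

ceph osd out 9
service ceph stop osd.9
pveceph destroyosd 9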

Did anyone have the same "problem" in the past?

Thanks in advance,

roman

1501-byte packets - problem with MTU on virtual pfSense (Proxmox)

Disclaimer: I posted this on the pfSense boards too, as I don't know whether this is more of a Proxmox or a pfSense issue.

Hello all,

I'm running into some strange problems with too large packets on our WAN interface.

Setup:

- pfSense 2.2 64Bit on Proxmox 3.4 host, 2 cores, 4GB RAM, CPU max 5%
- HW NIC eth1 => WAN, MTU 1500
- HW NIC eth4 = > LAN, MTU 9000
- HW NIC eth2 => LAN, connected to same switch, but not active
- vmbr0, OVS Bridge => eth4 => LAN
- vmbr1, OVS Bridge => eth1 => WAN
- Jumbo Frames on switches enabled
- pfSense MTU WAN If.: 1500
- Clear invalid DF bits instead of dropping the packets: Enabled
- Disable hardware checksum offload: Enabled
- Disable hardware TCP segmentation offload: Enabled
- Disable hardware large receive offload: Enabled
- All other local if's on 9000 MTU
- Storage cluster (Synology): 9000 MTU
- VMs on all proxmox hosts: Default MTU 1500

The log on the Proxmox host tells me:

Code:

...
Mar 24 18:40:46 vmhost1 kernel: __ratelimit: 6 callbacks suppressed
Mar 24 18:40:46 vmhost1 kernel: openvswitch: tap108i7: dropped over-mtu packet: 1501 > 1500
Mar 24 18:40:46 vmhost1 kernel: openvswitch: tap108i7: dropped over-mtu packet: 1501 > 1500
Mar 24 18:40:46 vmhost1 kernel: openvswitch: tap108i7: dropped over-mtu packet: 1501 > 1500
Mar 24 18:40:46 vmhost1 kernel: openvswitch: tap108i7: dropped over-mtu packet: 1501 > 1500
Mar 24 18:40:46 vmhost1 kernel: openvswitch: tap108i7: dropped over-mtu packet: 1501 > 1500
Mar 24 18:40:46 vmhost1 kernel: openvswitch: tap108i7: dropped over-mtu packet: 1501 > 1500
...

tap108i7 is the tap interface on the Proxmox host's OVS bridge for the WAN interface (vtnet7 in pfSense).
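
To watch these frames live on the host, I'm using something along these lines (my own command; this is how I got the capture below):

Code:

# frames larger than a standard 1514-byte Ethernet frame on the tap interface
tcpdump -i tap108i7 -nn -e 'greater 1515'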

I did some packet capturing, which shows that the large packets on the WAN interface come from a virtual IP, i.e. from inside the network:

Code:

Id = 12
Source = 217.76.xxx.xx
Destination = 7x.x.x.xxx
Captured Length = 1506
Packet Length = 1506
Protocol = TCP
Date Received = 2015-03-24 17:28:54 +0000
Time Delta = 0.00888514518737793
Information = HTTP -> 58826 ([ACK], Seq=4188548632, Ack=3381854676, Win=243)

The source IP is a public IP from our pool, currently NATed to a VM on another Proxmox host on the same network.
The destination is some random public IP (not ours).

Any idea why these large packets are being generated? Where do they come from? How do I stop them?

The VMs "behind" the pfSense are on multiple VLANs, each with its own DHCP server. The VLANs are created on the switches and assigned to the pfSense's virtual NICs. Should I set the VMs' MTU to 9000 too, since they are on the local networks (the public IPs are NATed on the pfSense and not directly assigned to the VMs)?

Thanks
Sebastian

Some help new to proxmox nat

Hi there, I'm new to this scheme. How can we join a new Proxmox node that is behind NAT (e.g. 10.1.0.163, with a public IP 190.210.XXX.XXX) to another node that is in another datacenter with a public IP? When we run "pvecm add 190.210.XXX.XXX" on the new node, it stops at quorum. Any idea? Thanks.