
Migration to new 2.2 nodes fails

I had three Proxmox 2.x servers. I don't recall the exact version, just that they were NOT the latest.

These servers worked well; in particular, we were able to migrate virtual machines between them.

Yesterday we added two new servers to the cluster. We did this by updating the software on the three existing servers, installing the latest ISO on the two new servers, and then joining them to the cluster.
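
For reference, this is roughly the sequence we used (from memory, so the exact commands may have differed slightly; 10.47.0.x below is a placeholder for one of the existing nodes' IPs):

# On each of the three existing 2.x nodes, one at a time:
apt-get update
apt-get dist-upgrade      # pulled in the 2.2 packages
pveversion -v             # confirmed the upgrade took

# On each of the two new nodes, after installing from the 2.2 ISO:
pvecm add 10.47.0.x       # join the existing cluster via one of the old nodes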

Now when I try to migrate virtual machines anywhere, I get the following in the web console:

Jan 04 11:31:58 starting migration of VM 102 to node 'kvm4' (10.47.0.184)
Jan 04 11:31:58 copying disk images
Jan 04 11:31:58 starting VM 102 on remote node 'kvm4'
Jan 04 11:32:00 starting migration tunnel
Jan 04 11:32:00 starting online/live migration on port 60000
Jan 04 11:32:05 ERROR: online migrate failure - aborting
Jan 04 11:32:05 aborting phase 2 - cleanup resources
Jan 04 11:32:06 ERROR: migration finished with problems (duration 00:00:08)
TASK ERROR: migration problems

And I get the following in the syslog on the source server:

Jan 4 11:31:58 kvm1 pvedaemon[322615]: <root@pam> starting task UPID:kvm1:00056ED4:2F760A0D:50E72E2E:qmigrate:102:root@pam:
Jan 4 11:31:59 kvm1 pmxcfs[322552]: [status] notice: received log
Jan 4 11:32:00 kvm1 pmxcfs[322552]: [status] notice: received log
Jan 4 11:32:00 kvm1 pvedaemon[356052]: VM 102 qmp command failed - VM 102 qmp command 'migrate-set-capabilities' failed - The command migrate-set-capabilities has not been found
Jan 4 11:32:00 kvm1 pvedaemon[356052]: VM 102 qmp command failed - VM 102 qmp command 'migrate-set-cache-size' failed - The command migrate-set-cache-size has not been found
Jan 4 11:32:03 kvm1 pvedaemon[356052]: VM 102 qmp command failed - VM 102 qmp command 'migrate' failed - got timeout
Jan 4 11:32:06 kvm1 pmxcfs[322552]: [status] notice: received log
Jan 4 11:32:06 kvm1 pmxcfs[322552]: [status] notice: received log
Jan 4 11:32:06 kvm1 pvedaemon[356052]: migration problems
Jan 4 11:32:06 kvm1 pvedaemon[322615]: <root@pam> end task UPID:kvm1:00056ED4:2F760A0D:50E72E2E:qmigrate:102:root@pam: migration problems
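
The "migrate-set-capabilities ... has not been found" lines make me suspect VM 102 is still running under the old qemu-kvm binary from before yesterday's upgrade, so its monitor wouldn't know the newer QMP commands. This is roughly how I was planning to compare the running binary against the installed one (VM ID 102 from above):

# Ask the running VM 102 process what version it is
qm monitor 102
# ...then at the qm> prompt:
info version
# exit with Ctrl+D -- do NOT type 'quit', that stops the VM

# Compare against what is installed on disk after the upgrade
kvm -version
pveversion -v

If those versions differ, I assume the VM would need a clean shutdown and fresh start (so it picks up the new binary) before live migration works again?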

And I get the following in the syslog on the destination server:
Jan 4 11:31:59 kvm4 qm[40058]: <root@pam> starting task UPID:kvm4:00009C7B:0065BC75:50E72E2F:qmstart:102:root@pam:
Jan 4 11:31:59 kvm4 qm[40059]: start VM 102: UPID:kvm4:00009C7B:0065BC75:50E72E2F:qmstart:102:root@pam:
Jan 4 11:31:59 kvm4 kernel: device tap102i0 entered promiscuous mode
Jan 4 11:31:59 kvm4 kernel: vmbr0: port 2(tap102i0) entering forwarding state
Jan 4 11:31:59 kvm4 kernel: device tap102i1 entered promiscuous mode
Jan 4 11:31:59 kvm4 kernel: vmbr6: port 2(tap102i1) entering forwarding state
Jan 4 11:32:00 kvm4 qm[40059]: VM 102 qmp command failed - unable to find configuration file for VM 102 - no such machine
Jan 4 11:32:00 kvm4 qm[40059]: VM 102 qmp command failed - unable to find configuration file for VM 102 - no such machine
Jan 4 11:32:00 kvm4 qm[40058]: <root@pam> end task UPID:kvm4:00009C7B:0065BC75:50E72E2F:qmstart:102:root@pam: OK
Jan 4 11:32:00 kvm4 kernel: vmbr0: port 2(tap102i0) entering disabled state
Jan 4 11:32:00 kvm4 kernel: vmbr0: port 2(tap102i0) entering disabled state
Jan 4 11:32:00 kvm4 kernel: vmbr6: port 2(tap102i1) entering disabled state
Jan 4 11:32:00 kvm4 kernel: vmbr6: port 2(tap102i1) entering disabled state
Jan 4 11:32:06 kvm4 qm[40109]: <root@pam> starting task UPID:kvm4:00009CAE:0065BF3F:50E72E36:qmstop:102:root@pam:
Jan 4 11:32:06 kvm4 qm[40110]: stop VM 102: UPID:kvm4:00009CAE:0065BF3F:50E72E36:qmstop:102:root@pam:
Jan 4 11:32:06 kvm4 qm[40109]: <root@pam> end task UPID:kvm4:00009CAE:0065BF3F:50E72E36:qmstop:102:root@pam: OK
Jan 4 11:32:06 kvm4 pmxcfs[2010]: [status] notice: received log

I did some uneducated digging and noticed there is nothing in /etc/qemu-server, which I believe is supposed to be populated by the clustering software...?
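
I may be looking in the wrong place, though; as far as I understand, on 2.x the VM configs live on the pmxcfs mount under /etc/pve rather than in /etc/qemu-server. This is roughly what I checked (using kvm1 as an example):

# Is the cluster filesystem (pmxcfs) mounted?
mount | grep /etc/pve

# /etc/pve/qemu-server should be a symlink to this node's config dir
ls -l /etc/pve/qemu-server
ls /etc/pve/nodes/kvm1/qemu-server/

# Cluster membership and quorum
pvecm status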

Looking through /var/log/cluster/*.log I noticed the fence_na.log shows:

Node Assassin: . [].
TCP Port: ...... [238].
Node: .......... [00].
Login: ......... [].
Password: ...... [].
Action: ........ [metadata].
Version Request: [no].
Done reading args.
Connection to Node Assassin: [] failed.
Error was: [unknown remote host: ]
Username and/or password invalid. Did you use the command line switches properly?
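
We don't have any Node Assassin hardware as far as I know, so I'm not sure fence_na should be running at all. I was going to look at how fencing is defined in the cluster config (which I believe lives at /etc/pve/cluster.conf on 2.x):

# Show the fencing-related parts of the cluster config
grep -i -B1 -A3 fence /etc/pve/cluster.conf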

I also noticed that the 'rgmanager' service shows as 'stopped' on all of my servers. Hitting the start button (or running /etc/init.d/rgmanager start) appears to do nothing.
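
Since I believe rgmanager depends on cman, I was going to check the underlying cluster stack first, along the lines of:

# cman has to be up (with quorum) before rgmanager will start
/etc/init.d/cman status
pvecm status

# Then try rgmanager again and watch syslog for the reason it bails
/etc/init.d/rgmanager start
tail /var/log/syslog

# If it does come up, clustat should show cluster and service state
clustat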

Any pointers on what I should try or look for next?
