I'm experiencing some strange TCP issues between two OpenVZ containers on my Proxmox cluster.
The containers run 64-bit Ubuntu 12.04. Here is what I'm seeing:
Both containers are running MySQL master-master replication; container A runs on Proxmox node A, container B on Proxmox node B. TCP communication goes over a bridge interface, and both containers are in the same subnet (10.100.1.11 and 10.100.1.12).
When I simply start the containers, all goes well: MySQL starts and I run some benchmark tests (MySQL inserts, to verify that replication works).
Then I issue the slave stop and slave start commands to stop/start replication.
After that, the two nodes can no longer connect to each other, and I get MySQL errors saying the connection was lost.
Both containers can still ping each other, so there is no problem with IP connectivity.
I also can no longer SSH into the container: telnet to port 22 fails, even though IP connectivity is fine (ping works) and sshd is running:
ping 10.100.1.11
PING 10.100.1.11 (10.100.1.11): 56 data bytes
64 bytes from 10.100.1.11: icmp_seq=0 ttl=63 time=97.436 ms
64 bytes from 10.100.1.11: icmp_seq=1 ttl=63 time=54.724 ms
frank$ telnet 10.100.1.11 22
Trying 10.100.1.11...
frank$ ssh root@10.100.1.11
ssh: connect to host 10.100.1.11 port 22: Connection refused
When I wait long enough, the problem goes away by itself (without me doing anything: no MySQL restart, no container reboot, ...). The MySQL slaves reconnect, since they retry every minute, and my SSH sessions work again:
frank$ telnet 10.100.1.11 22
Trying 10.100.1.11...
Connected to 10.100.1.11.
Escape character is '^]'.
SSH-2.0-OpenSSH_5.9p1 Debian-5ubuntu1
frank$ ssh root@10.100.1.11
root@10.100.1.11's password:
Welcome to Ubuntu 12.04.2 LTS (GNU/Linux 2.6.32-18-pve x86_64)
I also took some Wireshark traces: I can see a TCP SYN from A to B, but B answers with a RST, so the TCP handshake is never established.
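For what it's worth, the RST in the trace matches the "Connection refused" message the ssh client prints. A minimal local sketch with plain Python sockets (run on any machine, not the cluster itself) shows that a SYN answered with a RST surfaces to the client as "Connection refused", even when IP connectivity is fine; here the RST is provoked simply by connecting to a loopback port that has no listener:

```python
import socket

# Reserve a free ephemeral port, then close the listener so nothing
# is accepting on it anymore.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(("127.0.0.1", 0))
port = probe.getsockname()[1]
probe.close()

try:
    # The SYN we send here is answered with a RST by the kernel,
    # because no socket is listening on that port.
    with socket.create_connection(("127.0.0.1", port), timeout=2):
        print("connected")  # not expected
except ConnectionRefusedError:
    # Same client-side symptom as in the ssh output above.
    print("connection refused (RST received)")
```

So "Connection refused" does not necessarily mean sshd is down; it means something between the two endpoints answered the SYN with a RST.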