Hello,
I'm using Proxmox VE 1.9 in a production environment with nearly 30 OpenVZ containers and one KVM machine. The KVM machine runs the Zimbra Collaboration server. My problem is that the Zimbra server is unreachable for nearly two hours every day. In the syslog of that KVM guest I found the following errors:
Apr 10 08:28:27 mail2 kernel: [81977.835890] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 10 08:28:27 mail2 kernel: [81977.921766] ata1.00: failed command: READ DMA EXT
Apr 10 08:28:27 mail2 kernel: [81977.921772] ata1.00: cmd 25/00:00:60:22:54/00:02:01:00:00/e0 tag 0 dma 262144 in
Apr 10 08:28:27 mail2 kernel: [81977.921774] res 40/00:00:00:00:00/00:00:00:00:00/e0 Emask 0x4 (timeout)
Apr 10 08:28:27 mail2 kernel: [81977.921777] ata1.00: status: { DRDY }
Apr 10 08:28:27 mail2 kernel: [81977.921848] ata1: soft resetting link
Apr 10 08:28:27 mail2 kernel: [81978.082958] ata1.01: NODEV after polling detection
Apr 10 08:28:27 mail2 kernel: [81978.083682] ata1.00: configured for MWDMA2
Apr 10 08:28:27 mail2 kernel: [81978.083688] ata1.00: device reported invalid CHS sector 0
Apr 10 08:28:27 mail2 kernel: [81978.083714] ata1: EH complete
Apr 10 08:29:39 mail2 kernel: [82050.508549] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Apr 10 08:29:39 mail2 kernel: [82050.593690] ata1.00: BMDMA stat 0x4
Apr 10 08:29:40 mail2 kernel: [82050.677816] ata1.00: failed command: READ DMA EXT
Apr 10 08:29:40 mail2 kernel: [82050.760141] ata1.00: cmd 25/00:00:60:26:54/00:02:01:00:00/e0 tag 0 dma 262144 in
Apr 10 08:29:40 mail2 kernel: [82050.760142] res 41/04:00:60:26:54/04:00:60:26:54/e0 Emask 0x1 (device error)
Apr 10 08:29:40 mail2 kernel: [82051.089726] ata1.00: status: { DRDY ERR }
Apr 10 08:29:40 mail2 kernel: [82051.172676] ata1.00: error: { ABRT }
Apr 10 08:29:40 mail2 kernel: [82051.254731] ata1.00: configured for MWDMA2
Apr 10 08:29:40 mail2 kernel: [82051.254731] ata1: EH complete
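For anyone reading the guest log: the `cmd` line can be decoded to see which virtual sectors the guest was trying to read. Below is a minimal sketch in Python, assuming libata's usual field order (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and 512-byte sectors. The function name `decode_taskfile` is just my own helper, not part of any tool.

```python
def decode_taskfile(cmd_field):
    """Decode the hex string after 'cmd' in a libata error line.

    Assumed layout (libata's usual print order):
    command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    groups = cmd_field.split("/")
    command = int(groups[0], 16)
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in groups[1].split(":"))
    hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in groups[2].split(":")
    )

    # 48-bit LBA: the "hob" (high order byte) registers hold the upper 24 bits.
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    # 16-bit sector count; for EXT commands a count of 0 means 65536 sectors.
    sectors = (hob_nsect << 8) | nsect
    if sectors == 0:
        sectors = 65536

    return {
        "command": command,        # 0x25 is READ DMA EXT
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * 512,    # assumes 512-byte sectors
    }


first = decode_taskfile("25/00:00:60:22:54/00:02:01:00:00/e0")
print(first)
# lba 22291040, 512 sectors, 262144 bytes (matches "dma 262144 in" in the log)
```

Note that these are sector numbers on the virtual disk inside the guest, so they do not map directly to sector numbers on the host's sda.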
This seems to tell me that there is a problem with the hard drive. But the hard drive is virtual. Is this a hint that the server's physical hard drive has a problem? The OpenVZ containers keep running while the KVM machine has this problem!
In the host's syslog I found corresponding entries:
Apr 10 08:28:28 proxmoxhost kernel: ata1: exception Emask 0x40 SAct 0x0 SErr 0x800 action 0x7
Apr 10 08:28:28 proxmoxhost kernel: ata1: SError: { HostInt }
Apr 10 08:28:28 proxmoxhost kernel: ata1: hard resetting link
Apr 10 08:28:29 proxmoxhost kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 10 08:28:29 proxmoxhost kernel: ata1.00: configured for UDMA/133
Apr 10 08:28:29 proxmoxhost kernel: sd 0:0:0:0: [sda] Unhandled error code
Apr 10 08:28:29 proxmoxhost kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
Apr 10 08:28:29 proxmoxhost kernel: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 62 56 28 36 00 01 02 00
Apr 10 08:28:29 proxmoxhost kernel: end_request: I/O error, dev sda, sector 1649813558
Apr 10 08:28:29 proxmoxhost kernel: ata1: EH complete
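The CDB line in the host log can be checked against the reported I/O error. A small Python sketch, assuming the standard READ(10) layout (opcode 0x28, big-endian LBA in bytes 2 to 5, transfer length in bytes 7 to 8); `decode_read10` is only an illustrative helper name:

```python
def decode_read10(cdb_hex):
    """Return (lba, sector_count) for a SCSI READ(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is ignored
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # starting logical block address
    length = int.from_bytes(cdb[7:9], "big")   # number of blocks to read
    return lba, length


lba, length = decode_read10("28 00 62 56 28 36 00 01 02 00")
print(lba, length)
# 1649813558 258
```

The decoded LBA is the same sector that the `end_request: I/O error, dev sda` line reports, so the timed-out read was a 258-sector request starting at that position on the physical disk.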
Should I replace the (a) hard drive? It's a rented server to which I have no physical access.
Thanks in advance
tabbi