Hostname: abcvsdrdc12v019.naeast.ad.abcdee.com
ESXi Version: ESXi 5.5.0 build-9919047
HBA on Server:
vmhba |
driver |
driver version |
VID |
DID |
SVID |
SDID |
model |
vmhba1 |
fnic |
1.6.0.34 |
1137 |
45 |
1137 |
0084 |
Cisco Systems Inc Cisco VIC FCoE HBA Driver |
vmhba2 |
fnic |
1.6.0.34 |
1137 |
45 |
1137 |
0084 |
Cisco Systems Inc Cisco VIC FCoE HBA Driver |
vmhba3 |
fnic |
1.6.0.34 |
1137 |
45 |
1137 |
004f |
Cisco Systems Inc Cisco VIC FCoE HBA Driver |
vmhba4 |
fnic |
1.6.0.34 |
1137 |
45 |
1137 |
004f |
Cisco Systems Inc Cisco VIC FCoE HBA Driver |
- From the VMWare HCL we can see that there are latest firmware version available:
Model : |
UCS VIC1240 |
|
|
Device Type : |
FCoE CNAs |
DID : |
0045 |
Brand Name : |
Cisco |
SVID : |
1137 |
Number of Ports: |
4 |
SSID : |
0084 |
VID : |
1137 |
|
|
Device Driver(s) |
Firmware Version |
Additional Firmware Version |
ESXi 5.5 U3 |
enic version 2.3.0.6 |
4.1(1a) |
ESXi 5.5 U3 |
enic version 2.1.2.69 |
4.0(3a) |
ESXi 5.5 U3 |
enic version 2.1.2.62 |
4.0(1C) |
ESXi 5.5 U3 |
enic version 2.1.2.59 |
2.2(2c) |
ESXi 5.5 U3 |
enic version 2.1.2.50 |
2.2(1c) |
ESXi 5.5 U3 |
enic version 2.1.2.42 |
2.1(3a) |
ESXi 5.5 U3 |
enic version 2.1.2.38 |
2.1(3g) |
ESXi 5.5 U3 |
fnic version 1.6.0.37 |
4.0(1j) |
ESXi 5.5 U3 |
fnic version 1.6.0.36 |
4.0(1j) |
ESXi 5.5 U3 |
fnic version 1.6.0.34 |
4.0(1j) |
Model : |
UCS VIC1280 |
||
Device Type |
FCoE CNAs |
DID : |
0045 |
Brand Name : |
Cisco |
SVID : |
1137 |
Number of |
8 |
SSID : |
004f |
VID : |
1137 |
Device Driver(s) |
Firmware Version |
Additional Firmware Version |
ESXi 5.5 U3 |
enic version 2.3.0.6 |
4.1(1a) |
ESXi 5.5 U3 |
enic version 2.1.2.69 |
4.0(3a) |
ESXi 5.5 U3 |
enic version 2.1.2.62 |
4.0(1C) |
ESXi 5.5 U3 |
enic version 2.1.2.59 |
2.2(2c) |
ESXi 5.5 U3 |
enic version 2.1.2.50 |
2.2(1c) |
ESXi 5.5 U3 |
enic version 2.1.2.42 |
2.1(3a) |
ESXi 5.5 U3 |
enic version 2.1.2.38 |
2.1(3g) |
ESXi 5.5 U3 |
fnic version 1.6.0.37 |
4.0(1j) |
ESXi 5.5 U3 |
fnic version 1.6.0.36 |
4.0(1j) |
ESXi 5.5 U3 |
fnic version 1.6.0.34 |
4.0(1j) |
Vmware Logs:
- From the logs we can see that the VM has undergone a vMotion on 2nd June.
2019-06-02T02:46:08.042Z| vmx| W110: VMX has left the building: 0.
2019-06-02T15:20:48.563Z| vmx| I120: Hostname=abcvsdrdc12v019.naeast.ad.abcdee.com
- Found the following passthrough disks:
2019-06-02T15:20:54.457Z| Worker#0| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/disk0.vmdk' (0xa): vmfs, 167772160 sectors / 80 GB.
2019-06-02T15:20:54.457Z| Worker#1| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/disk1.vmdk' (0xa): vmfs, 503316480 sectors / 240 GB.
2019-06-02T15:20:54.459Z| Worker#2| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/abcsn00009721_20.vmdk' (0x4a): vmfsPassthroughRawDeviceMap, 4026531840 sectors / 1.9 TB.
2019-06-02T15:20:54.464Z| vmx| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/abcsn00009721_22.vmdk' (0x4a): vmfsPassthroughRawDeviceMap, 754974720 sectors / 360 GB.
2019-06-02T15:20:54.465Z| Worker#3| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/abcsn00009721_21.vmdk' (0x4a): vmfsPassthroughRawDeviceMap, 754974720 sectors / 360 GB.
2019-06-02T15:20:54.475Z| Worker#2| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/abcsn00009721_23.vmdk' (0x4a): vmfsPassthroughRawDeviceMap, 12583680 sectors / 6 GB.
2019-06-02T15:20:54.477Z| Worker#0| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/abcsn00009721_24.vmdk' (0x4a): vmfsPassthroughRawDeviceMap, 754974720 sectors / 360 GB.
2019-06-02T15:20:54.477Z| vmx| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/abcsn00009721_30.vmdk' (0x4a): vmfsPassthroughRawDeviceMap, 754974720 sectors / 360 GB.
2019-06-02T15:20:54.477Z| Worker#1| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/abcsn00009721_31.vmdk' (0x4a): vmfsPassthroughRawDeviceMap, 754974720 sectors / 360 GB.
2019-06-02T15:20:54.478Z| Worker#3| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/abcsn00009721_32.vmdk' (0x4a): vmfsPassthroughRawDeviceMap, 12583680 sectors / 6 GB.
2019-06-02T15:20:54.484Z| Worker#2| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/abcsn00009721_33.vmdk' (0x4a): vmfsPassthroughRawDeviceMap, 12583680 sectors / 6 GB.
2019-06-02T15:20:54.486Z| vmx| I120: DISKLIB-LINK : Opened '/vmfs/volumes/57c847be-04a294ad-bdde-0025b5492185/abcsn00009721/abcsn00009721_34.vmdk' (0x4a): vmfsPassthroughRawDeviceMap, 754974720 sectors / 360 GB.
VMKernel:
- From the Vmkernel Logs we can see the below issue with the
2019-06-04T00:40:58.295Z cpu63:33229)<3>fnic : 4 :: hdr status = FCPIO_DATA_CNT_MISMATCH
2019-06-03T21:13:51.850Z cpu18:35443)ScsiDeviceIO: 2370: Cmd(0x412f04662b40) 0xfe, CmdSN 0x162f1 from world 32853 to dev "naa.60000970000197200803533030301234" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2019-06-03T21:13:51.923Z cpu22:32856)NMP: nmp_ThrottleLogForDevice:2457: Cmd 0xf1 (0x413301f40300, 32853) to dev "naa.60000970000197200803533030301234" on path "vmhba4:C0:T0:L3" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0xe 0x1d 0x0. Act:NONE
2019-06-03T21:13:51.923Z cpu22:32856)ScsiDeviceIO: 2331: Cmd(0x41330196eb00) 0xfe, CmdSN 0x162f2 from world 32853 to dev "naa.60000970000197200803533030301234" failed H:0x0 D:0x2 P:0x5 Possible sense data: 0x0 0x0 0x0.
2019-06-03T21:13:51.923Z cpu22:32856)HBX: 1932: ATS Miscompare detected between test and set HB images at offset 4063232 on vol '57c847be-04a294ad-bdde-0025b5492185'.
2019-06-03T21:13:51.923Z cpu62:90447)NMP: nmp_ThrottleLogForDevice:2457: Cmd 0x8a (0x413b01a8de80, 90406) to dev "naa.60000970000197200803533030301234" on path "vmhba4:C0:T0:L3" Failed: H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2019-06-03T21:13:51.923Z cpu62:90447)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60000970000197200803533030301234" state in doubt; requested fast path state update...
2019-06-04T00:29:02.456Z cpu60:90447)NMP: nmp_ThrottleLogForDevice:2457: Cmd 0x2a (0x413b01b71a80, 90406) to dev "naa.60000970000197200717533030320000" on path "vmhba4:C0:T1:L15" Failed: H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2019-06-04T00:29:02.456Z cpu60:90447)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60000970000197200717533030320000" state in doubt; requested fast path state update...
2019-06-04T00:29:02.456Z cpu60:90447)ScsiDeviceIO: 2331: Cmd(0x413b01b71a80) 0x2a, CmdSN 0x800e0044 from world 90406 to dev "naa.60000970000197200717533030320000" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2019-06-04T00:29:20.121Z cpu15:33907)NMP: nmp_ResetDeviceLogThrottling:3317: last error status from device naa.60000970000197200803533030301234 repeated 3 times
2019-06-04T00:29:26.536Z cpu78:34645)NMP: nmp_ThrottleLogForDevice:2457: Cmd 0x28 (0x413b001f8340, 90406) to dev "naa.60000970000197200717533030320000" on path "vmhba4:C0:T1:L15" Failed: H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2019-06-04T00:29:26.536Z cpu78:34645)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60000970000197200717533030320000" state in doubt; requested fast path state update...
- Based on the errors we can see that these messages appears as the HBA driver aborts the command because the command took longer than expected:
- For more information: https://kb.vmware.com/s/article/1022026
VOBD:
From the VOBD logs we can see that there is a Power-On Reset that is happening on nearly all the Datastores which are connect to the VM as a Passthrough disk:
2019-06-03T17:08:58.516Z: [scsiCorrelator] 111038980516us:
[vob.scsi.scsipath.por] Power-on Reset occurred on vmhba2:C0:T0:L20
2019-06-03T17:08:58.520Z: [scsiCorrelator]
111038984015us: [vob.scsi.scsipath.por] Power-on Reset occurred on
vmhba1:C0:T0:L20
- Power on Reset on RDM Disk:
2019-06-02T22:10:09.112Z:
[scsiCorrelator] 42711167309us: [vob.scsi.scsipath.por] Power-on Reset occurred
on naa.60000970000197200717533030314633
2019-06-02T22:10:09.116Z: [scsiCorrelator] 42711171586us:
[vob.scsi.scsipath.por] Power-on Reset occurred on
naa.60000970000197200717533030323142
2019-06-02T22:10:09.120Z: [scsiCorrelator] 42711175549us:
[vob.scsi.scsipath.por] Power-on Reset occurred on naa.60000970000197200717533030320000
2019-06-02T22:10:09.122Z: [scsiCorrelator] 42711177278us:
[vob.scsi.scsipath.por] Power-on Reset occurred on
naa.60000970000197200717533030323233
2019-06-02T22:10:09.126Z: [scsiCorrelator] 42711181591us:
[vob.scsi.scsipath.por] Power-on Reset occurred on
naa.60000970000197200717533030323138
2019-06-02T22:10:09.594Z: [scsiCorrelator] 42711649875us:
[vob.scsi.scsipath.por] Power-on Reset occurred on
naa.60000970000197200717533030323141
2019-06-02T22:10:09.597Z: [scsiCorrelator] 42711652785us: [vob.scsi.scsipath.por]
Power-on Reset occurred on naa.60000970000197200717533030323143
2019-06-02T22:10:09.602Z: [scsiCorrelator] 42711657125us:
[vob.scsi.scsipath.por] Power-on Reset occurred on
naa.60000970000197200717533030323230
2019-06-02T22:10:09.605Z: [scsiCorrelator] 42711660647us:
[vob.scsi.scsipath.por] Power-on Reset occurred on
naa.60000970000197200717533030323242
2019-06-02T22:10:09.608Z: [scsiCorrelator] 42711663779us:
[vob.scsi.scsipath.por] Power-on Reset occurred on
naa.60000970000197200717533030323139
Conclusion:
- Based on the logs we conclude that the faulty device could be the HBA or the Storage itself. However based on the below Events:
2019-06-04T00:40:58.295Z cpu63:33229)<3>fnic : 4 :: hdr status = FCPIO_DATA_CNT_MISMATCH
2019-06-04T00:40:58.295Z cpu18:428382)WARNING: NMP:
nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.60000970000197200803533030301234"
state in doubt; requested fast path state update...
2019-06-04T00:40:58.295Z cpu18:428382)ScsiDeviceIO: 2370:
Cmd(0x412f03cc3c40) 0x28, CmdSN 0x1834a from world 428516 to dev "naa.60000970000197200803533030301234"
failed H:0x7 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
- I can find a known issue which points towards an issue with CISCO drivers: https://kb.vmware.com/s/article/2149522
- Please check with the CISCO to either update the HBA driver or resolve the above issue.
- Please confirm if you are using EMC VMAX 10K
- As they have an ATS issue going on with Microcode version: 5874-5875.
- Check with the EMC Team regarding this issue and update the new Microcode.