Host Name: esx-icpmcc00305
Datastore Name | naa / t10 / eui ID | Volume UUID | LUN ID | Medium | Number Path | Vendor |
ICP_VSI_****_009_SI_125b_0d77_N_G | eui.65a29ccb3448125bfb4d0d770*****04 | 598ab822-989f5084-04ca-246e961e97f0 | LUN:4 | fc | 1 | EMC Fibre Channel |
VOBD:
- From the VOBD Logs we can see that the issue started with Power-On Reset started happening on EUI.65a29ccb3448125bfb4d0d770*****04.
2019-02-13T02:16:38.187Z: [scsiCorrelator] 11662981439480us: [vob.scsi.scsipath.por] Power-on Reset occurred on eui.65a29ccb3448125bfb4d0d770*****04
2019-02-13T02:18:24.372Z: [scsiCorrelator] 11663087623696us: [vob.scsi.scsipath.por] Power-on Reset occurred on eui.65a29ccb3448125bfb4d0d770*****04
2019-02-13T02:18:24.373Z: [scsiCorrelator] 11663127568230us: [esx.problem.storage.connectivity.devicepor] Frequent PowerOn Reset Unit Attentions are occurring on device eui.65a29ccb3448125bfb4d0d770*****04. This may indicate a storage problem. Affected datastores: “ICP_VSI_****_009_SI_125b_0d77_N_G”.
VMKernel:
- At the time Mentioned we are only able to see 1 Event
2019-02-13T12:36:00.270Z cpu21:33716)ScsiDeviceIO: 2354: Cmd(0x412eccb81340) 0x85, CmdSN 0x4c2c from world 34501 to dev “naa.61866da06e702c00210c7ebc*****c59” failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
- From the VMKernel logs we can see the issue with the IO-Error after which a power-on reset happened:
2019-02-13T02:16:38.187Z cpu36:33502)WARNING: [11662981441] IO-ERROR comb: 295f00020015. offsetInComb 352. SizeInLB 1. SDS_ID 116891ed00000002. Comb Gen 29. Head Gen 9e87.
2019-02-13T02:16:38.187Z cpu36:33502)WARNING: Vol ID 0xfb4d0d7700000004. Last fault Status IO_FAULT_NOT_PRI(12).Last error Status IO_FAULT_DEVICE_ERROR(362) Reason (old SCSI gen without retry) Retry count (0) chan (2)
2019-02-13T02:16:38.187Z cpu36:33502)scini: blkScsi_PrintIOInfo:3274: ScaleIO R2_01:hCmd 0x412ecd626440, OpCode 0x89, rc 51 scsiStat 2, senseCode 6, asc 41, ascq 2
2019-02-13T02:16:38.187Z cpu36:33502)NMP: nmp_ThrottleLogForDevice:2457: Cmd 0x89 (0x412ecd626440, 32813) to dev “eui.65a29ccb3448125bfb4d0d770*****04” on path “vmhba32:C0:T4:L4” Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x2. Act:NONE
2019-02-13T02:16:38.187Z cpu36:33502)ScsiDeviceIO: 2331: Cmd(0x412ecd626440) 0x89, CmdSN 0x8c5a3b from world 32813 to dev “eui.65a29ccb3448125bfb4d0d770*****04” failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x2.
2019-02-13T02:16:38.187Z cpu36:33502)ScsiCore: 1486: Power-on Reset occurred on eui.65a29ccb3448125bfb4d0d770*****04
- Unable to find any events on Host name : esx-icpmcc00314 and esx-icpmccd00648 at the time mentioned.
Conclusion:
- At the time mentioned in the last email which is 12:20 am GMT we were not able to find any events associated to Esxi.
- However if we see the events around 2:00 AM we can say that SAN might become heavily congested, which can cause I/O requests to take a long time to complete. The virtual machine storage performance is degraded.
Recommendation:
Because the messages are caused by activity on the array outside of the ESX host’s scope, the storage administrator should use storage-specific tools to isolate the cause of the problem.
For More information please refer: https://kb.vmware.com/kb/1020702