Hostname: abcdcc2v026.naeast.ad.abcde.com
ESXi Version: VMware ESXi 5.5.0 build-9919047
Scratch location: /tmp/scratch :: This location is not persistent, so logs
stored there are lost when the host reboots. See KB 1033696.
CIM Logs:
- The CIM logs show a problem with the host hardware itself:
[Wed Jun 5 03:25:38 UTC 2019] Dumping instances of OMC_RawIpmiSensor
Description = Disk or Disk Bay 1 HDD1_INFO: Drive Fault
Description = Disk or Disk Bay 1 HDD1_INFO: In Critical Array
Description = Disk or Disk Bay 1 HDD1_INFO: In Failed Array
Description = Disk or Disk Bay 0 HDD0_INFO: Drive Fault
Description = Disk or Disk Bay 0 HDD0_INFO: In Critical Array
Description = Disk or Disk Bay 0 HDD0_INFO: In Failed Array
- As per the article https://kb.vmware.com/s/article/2069475, please have
extensive hardware diagnostics run on this host.
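
To make the sensor dump easier to read, the states can be grouped per disk bay. The following is only a minimal sketch: it assumes each entry has the form "Description = Disk or Disk Bay <n> <sensor>: <state>" as in the excerpt above (with wrapped lines rejoined), and the names CIM_DUMP and states_by_bay are made up for this illustration.

```python
import re
from collections import defaultdict

# Sensor descriptions from the CIM dump, one entry per line.
CIM_DUMP = """\
Description = Disk or Disk Bay 1 HDD1_INFO: Drive Fault
Description = Disk or Disk Bay 1 HDD1_INFO: In Critical Array
Description = Disk or Disk Bay 1 HDD1_INFO: In Failed Array
Description = Disk or Disk Bay 0 HDD0_INFO: Drive Fault
Description = Disk or Disk Bay 0 HDD0_INFO: In Critical Array
Description = Disk or Disk Bay 0 HDD0_INFO: In Failed Array
"""

# Assumed entry layout; other sensor types would need their own pattern.
PATTERN = re.compile(r"Description = Disk or Disk Bay (\d+) (\w+): (.+)")


def states_by_bay(dump):
    """Return {bay number: [reported states]} for every matching line."""
    bays = defaultdict(list)
    for line in dump.splitlines():
        match = PATTERN.search(line)
        if match:
            bay, _sensor, state = match.groups()
            bays[int(bay)].append(state.strip())
    return dict(bays)


faults = states_by_bay(CIM_DUMP)
for bay in sorted(faults):
    print(f"Bay {bay}: {', '.join(faults[bay])}")
```

Both bays report the same three states, which is why the diagnostics should cover both disks and the array controller rather than a single drive.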
VMK Summary:
- The VMkernel logs show that the host rebooted. Because no persistent
log location is configured, the logs from before the reboot were not kept
and we cannot determine the reason for the failure.
2019-06-05T01:34:07Z bootstop: Host has booted
2019-06-05T02:00:01Z heartbeat: up 0d0h29m53s, 0 VMs; [[36076 fdm 13184kB]
[35320 vpxa-worker 21900kB] [34944 hostd-worker 44492kB]] [[36800
sfcb-vmware_raw 6%max] [36215 sfcb-vmware_bas 14%max] [36209 sfcb-pycim
17%max]]
2019-06-05T03:00:01Z heartbeat: up 0d1h29m53s, 0 VMs; [[36076 fdm 13184kB]
[35320 vpxa-worker 23092kB] [34944 hostd-worker 46104kB]] [[36800
sfcb-vmware_raw 6%max] [36215 sfcb-vmware_bas 14%max] [36209 sfcb-pycim
17%max]]
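
The heartbeat lines can also be used to estimate when the host came back up, by subtracting the reported uptime from the timestamp of each line. This is a rough sketch only: it assumes the uptime counter starts when the kernel starts, and HEARTBEATS and estimated_boot_time are names chosen for this example.

```python
import re
from datetime import datetime, timedelta

# Leading part of the two heartbeat lines from the VMkernel summary log.
HEARTBEATS = [
    "2019-06-05T02:00:01Z heartbeat: up 0d0h29m53s, 0 VMs;",
    "2019-06-05T03:00:01Z heartbeat: up 0d1h29m53s, 0 VMs;",
]

# Timestamp (UTC, without the trailing Z) followed by the d/h/m/s uptime.
PATTERN = re.compile(r"^(\S+)Z heartbeat: up (\d+)d(\d+)h(\d+)m(\d+)s")


def estimated_boot_time(line):
    """Subtract the reported uptime from the heartbeat timestamp."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError(f"not a heartbeat line: {line!r}")
    stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    days, hours, minutes, seconds = (int(g) for g in match.groups()[1:])
    uptime = timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
    return stamp - uptime


for line in HEARTBEATS:
    print(estimated_boot_time(line).isoformat() + "Z")
```

Both heartbeats point to the same start time, 2019-06-05T01:30:08Z, a few minutes before the "Host has booted" entry at 01:34:07Z.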
Hostd:
- The hostd logs show that the services started at around this time:
2019-06-05T01:32:33.285Z [FFBDB9A0 info ‘Default’] BEGIN SERVICES
- Because no scratch partition is configured, hostd logs the following
warnings:
2019-06-05T01:33:33.463Z [272C2B70 warning ‘Hostsvc.VmkVprobSource’]
Argument ‘1’ for vprob ‘esx.problem.scratch.partition.unconfigured’ not found
2019-06-05T01:33:33.463Z [27C81B70 info ‘Libs’ opID=hostd-93af] CPU[11]:
MSR 0xce = 0xc0064011600
2019-06-05T01:33:33.463Z [272C2B70 warning ‘Hostsvc.VmkVprobSource’]
Wrong argument count for vprob ‘esx.problem.scratch.partition.unconfigured’,
expected: 0, got: 1
VOBD:
- The VOBD logs show Power-on Reset events on several SCSI paths:
2019-06-05T01:31:02.044Z: [scsiCorrelator] 63403606us: [vob.scsi.scsipath.por] Power-on Reset occurred on vmhba4:C0:T1:L17
2019-06-05T01:31:02.060Z: [scsiCorrelator] 63419374us: [vob.scsi.scsipath.por] Power-on Reset occurred on vmhba3:C0:T1:L17
2019-06-05T01:31:02.077Z: [scsiCorrelator] 63436596us: [vob.scsi.scsipath.por] Power-on Reset occurred on vmhba2:C0:T0:L17
2019-06-05T01:31:02.136Z: [scsiCorrelator] 63494942us: [vob.scsi.scsipath.por] Power-on Reset occurred on vmhba1:C0:T0:L17
- As per the article https://kb.vmware.com/s/article/1020702, the SAN
might become heavily congested, which can cause I/O requests to take a
long time to complete. However, in this case the resets could simply be a
result of the ESXi host rebooting: they are logged at 01:31:02Z, before
hostd begins starting its services (01:32:33Z).
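
The timing argument can be checked by extracting the affected paths and comparing each event time with the "Host has booted" entry (2019-06-05T01:34:07Z). This is a minimal sketch under the assumption that VOBD entries follow the layout shown above; VOBD_LINES, BOOT_COMPLETE and parse_por are illustrative names, not part of any VMware tool.

```python
import re
from datetime import datetime

# Power-on Reset entries from the VOBD log, one entry per line.
VOBD_LINES = [
    "2019-06-05T01:31:02.044Z: [scsiCorrelator] 63403606us: [vob.scsi.scsipath.por] Power-on Reset occurred on vmhba4:C0:T1:L17",
    "2019-06-05T01:31:02.060Z: [scsiCorrelator] 63419374us: [vob.scsi.scsipath.por] Power-on Reset occurred on vmhba3:C0:T1:L17",
    "2019-06-05T01:31:02.077Z: [scsiCorrelator] 63436596us: [vob.scsi.scsipath.por] Power-on Reset occurred on vmhba2:C0:T0:L17",
    "2019-06-05T01:31:02.136Z: [scsiCorrelator] 63494942us: [vob.scsi.scsipath.por] Power-on Reset occurred on vmhba1:C0:T0:L17",
]

# Assumed entry layout: timestamp, correlator tag, counter, event id, path.
POR = re.compile(
    r"^(\S+)Z: \[scsiCorrelator\] \d+us: "
    r"\[vob\.scsi\.scsipath\.por\] Power-on Reset occurred on (\S+)$"
)

# Time of the "bootstop: Host has booted" entry in the VMkernel summary.
BOOT_COMPLETE = datetime(2019, 6, 5, 1, 34, 7)


def parse_por(line):
    """Return (timestamp, path) for a Power-on Reset entry, else None."""
    match = POR.match(line)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
    return stamp, match.group(2)


events = [event for event in map(parse_por, VOBD_LINES) if event]
paths = [path for _stamp, path in events]
during_boot = all(stamp < BOOT_COMPLETE for stamp, _path in events)

print("Paths:", ", ".join(paths))
print("All resets logged before boot completed:", during_boot)
```

If resets on these paths were also logged well after the boot completed, the SAN congestion explanation from the article would deserve a closer look.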
Conclusion:
- Based on the information available on the host, I recommend checking
the hardware event log for errors associated with the hardware and having
extensive hardware diagnostics run.