Issue Description:
Virtual machines on cluster GP-PROD-CLUSTER, running Microsoft Windows Server 2016 Datacenter (Version 10.0.14393, Build 14393), stop responding when Production checkpoints are created.
Issue reproduced on the 13th.
_____________________________________________________________________________
System Information: ARES
OS Name Microsoft Windows Server 2016 Datacenter
Version 10.0.14393 Build 14393
Other OS Description Not Available
OS Manufacturer Microsoft Corporation
System Name ARES
System Manufacturer Dell Inc.
System Model PowerEdge R630
System Type x64-based PC
System SKU SKU=NotProvided;ModelName=PowerEdge R630
Processor Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, 2601 Mhz, 14 Core(s), 28 Logical Processor(s)
Processor Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, 2601 Mhz, 14 Core(s), 28 Logical Processor(s)
BIOS Version/Date Dell Inc. 2.4.3, 17/01/2017
System Events:
- The following errors were logged at 12:20 AM: the virtual machine could not be registered because its configuration file could not be found at the expected path.
Date | Time | Type/Level | Computer Name | Event Code | Source | Description |
9/13/2017 | 12:20:29 AM | Error | ARES.ABC | 21502 | Microsoft-Windows-Hyper-V-High-Availability | ‘Virtual Machine Configuration GPDYNAMICS-TEST’ failed to register the virtual machine with the virtual machine management service. The Virtual Machine Management Service failed to register the configuration for the virtual machine ‘103A58D4-8E02-4FD3-B121-DC56D0551082’ at ‘C:\ClusterStorage\DS-PROD09\gpdynamics-test\GPDYNAMICS-TEST’: The system cannot find the path specified. (0x80070003). If the virtual machine is managed by a failover cluster, ensure that the file is located at a path that is accessible to other nodes of the cluster. |
9/13/2017 | 12:20:29 AM | Error | ARES.ABC | 1069 | Microsoft-Windows-FailoverClustering | Cluster resource ‘Virtual Machine Configuration GPDYNAMICS-TEST’ of type ‘Virtual Machine Configuration’ in clustered role ‘GPDYNAMICS-TEST’ failed. The error code was ‘0x3’ (‘The system cannot find the path specified.’). Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet. |
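The event rows in this report use a pipe-delimited layout, so they can be loaded into structured records to build a timeline across both nodes. The following is a minimal sketch, not part of the original analysis. It assumes M/D/YYYY dates with a 12-hour clock (as in the rows above), and the Description text is shortened here for readability. The function name parse_event_row is hypothetical.

```python
from datetime import datetime

# Two rows copied from the table above, with the long Description text shortened.
ROWS = [
    "9/13/2017 | 12:20:29 AM | Error | ARES.ABC | 21502 | Microsoft-Windows-Hyper-V-High-Availability | failed to register the virtual machine (0x80070003) |",
    "9/13/2017 | 12:20:29 AM | Error | ARES.ABC | 1069 | Microsoft-Windows-FailoverClustering | The error code was '0x3' |",
]


def parse_event_row(row):
    """Split one pipe-delimited event row into a dict.

    Assumption: dates are M/D/YYYY and times use a 12-hour clock with AM/PM.
    """
    # maxsplit=6 keeps any '|' characters inside the description intact.
    parts = [p.strip() for p in row.split("|", 6)]
    date, time, level, computer, code, source, description = parts
    return {
        "timestamp": datetime.strptime(f"{date} {time}", "%m/%d/%Y %I:%M:%S %p"),
        "level": level,
        "computer": computer,
        "event_id": int(code),
        "source": source,
        # Drop the trailing table delimiter from the last column.
        "description": description.rstrip("| ").strip(),
    }


events = sorted((parse_event_row(r) for r in ROWS), key=lambda e: e["timestamp"])
for e in events:
    print(e["timestamp"].isoformat(), e["computer"], e["event_id"], e["source"])
```

Sorting by timestamp makes it easier to line up events from ARES and MORPHEUS when checking what preceded a failure.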
Application Events:
- Reviewed the application logs; no relevant events were found.
List of outdated drivers:
Time/Date String | Product Version | File Version | Company Name | File Description |
5/25/2016 8:01 | (7.13:65.105) | (7.13:65.105) | QLogic Corporation | QLogic 10 GigE VBD |
3/4/2016 21:22 | (10.0:11105.1001) | (6.603:6.0) | Avago Technologies | MEGASAS RAID Controller Driver for Windows |
5/16/2016 2:28 | (7.13:57.103) | (7.13:57.103) | QLogic Corporation | AMD64 BXND NDIS6.0 Driver |
3/4/2016 21:46 | (6.3:9600.16384) | (12.15:22.6) | Intel Corporation | Intel(R) Gigabit Adapter NDIS 6.x driver |
4/5/2017 23:45 | (9.5:0.1015) | (9.5:0.1015) | Veeam Software AG | CTK file system minifilter |
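As a rough illustration of how the driver timestamps above can be compared against the incident date, the sketch below computes each driver's age in days. It assumes the Time/Date String column is M/D/YYYY with a 24-hour clock; this is an assumption, since some entries (for example 3/4/2016) are ambiguous and the BIOS date in this report is written day-first. The 365-day threshold is hypothetical and is not taken from the analysis.

```python
from datetime import datetime

REFERENCE_DATE = datetime(2017, 9, 13)  # date of the ARES events in this report
MAX_AGE_DAYS = 365  # hypothetical threshold, not taken from the report

# (timestamp, file description) pairs copied from the driver table above.
DRIVERS = [
    ("5/25/2016 8:01", "QLogic 10 GigE VBD"),
    ("3/4/2016 21:22", "MEGASAS RAID Controller Driver for Windows"),
    ("5/16/2016 2:28", "AMD64 BXND NDIS6.0 Driver"),
    ("3/4/2016 21:46", "Intel(R) Gigabit Adapter NDIS 6.x driver"),
    ("4/5/2017 23:45", "CTK file system minifilter"),
]


def driver_age_days(stamp, reference=REFERENCE_DATE):
    """Return whole days between the driver timestamp and the reference date.

    Assumption: the timestamp is M/D/YYYY H:MM (24-hour clock).
    """
    built = datetime.strptime(stamp, "%m/%d/%Y %H:%M")
    return (reference - built).days


for stamp, name in DRIVERS:
    age = driver_age_days(stamp)
    status = "older than threshold" if age > MAX_AGE_DAYS else "within threshold"
    print(f"{name}: {age} days ({status})")
```

Driver age alone does not prove a driver is outdated; the vendor's current release should be checked before updating.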
_______________________________________________________________________
System Information: MORPHEUS
OS Name Microsoft Windows Server 2016 Datacenter
Version 10.0.14393 Build 14393
Other OS Description Not Available
OS Manufacturer Microsoft Corporation
System Name MORPHEUS
System Manufacturer Dell Inc.
System Model PowerEdge R630
System Type x64-based PC
System SKU SKU=NotProvided;ModelName=PowerEdge R630
Processor Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, 2601 Mhz, 14 Core(s), 28 Logical Processor(s)
Processor Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, 2601 Mhz, 14 Core(s), 28 Logical Processor(s)
BIOS Version/Date Dell Inc. 2.4.3, 17/01/2017
System Events:
- Reviewed the logs and found that the CSV entered a paused state. iSCSI events followed, showing that the connection to the target was lost and that the initiator could not send an iSCSI PDU.
Date | Time | Type/Level | Computer Name | Event Code | Source | Description |
9/11/2017 | 8:50:40 AM | Warning | MORPHEUS.ABC | 5120 | Microsoft-Windows-FailoverClustering | Cluster Shared Volume ‘DS-PROD09’ (‘DS-PROD09’) has entered a paused state because of ‘STATUS_CONNECTION_DISCONNECTED(c000020c)’. All I/O will temporarily be queued until a path to the volume is reestablished. |
9/11/2017 | 8:51:12 AM | Warning | MORPHEUS.ABC | 5120 | Microsoft-Windows-FailoverClustering | Cluster Shared Volume ‘DS-PROD09’ (‘DS-PROD09’) has entered a paused state because of ‘STATUS_CONNECTION_DISCONNECTED(c000020c)’. All I/O will temporarily be queued until a path to the volume is reestablished. |
9/11/2017 | 8:51:37 AM | Error | MORPHEUS.ABC | 20 | iScsiPrt | Connection to the target was lost. The initiator will attempt to retry the connection. |
9/11/2017 | 8:51:37 AM | Error | MORPHEUS.ABC | 7 | iScsiPrt | The initiator could not send an iSCSI PDU. Error status is given in the dump data. |
- The virtual machine failed to start because it could not reserve resources. Error: ‘Insufficient system resources exist to complete the requested service.’
9/11/2017 | 11:58:27 AM | Error | MORPHEUS.ABC | 21502 | Microsoft-Windows-Hyper-V-High-Availability | ‘Virtual Machine GPDYNAMICS-TEST’ failed to start. ‘GPDYNAMICS-TEST’ failed to start. (Virtual machine ID 103A58D4-8E02-4FD3-B121-DC56D0551082) ‘GPDYNAMICS-TEST’ Synthetic Ethernet Port: Failed to finish reserving resources with Error ‘Insufficient system resources exist to complete the requested service.’ (0x800705AA). (Virtual machine ID 103A58D4-8E02-4FD3-B121-DC56D0551082) ‘GPDYNAMICS-TEST’ failed to allocate resources while connecting to a virtual network: Insufficient system resources exist to complete the requested service. (0x800705AA) (Virtual Machine ID 103A58D4-8E02-4FD3-B121-DC56D0551082). The Ethernet switch may not exist. Could not find Ethernet switch ‘OffNetwork’. |
9/11/2017 | 11:58:27 AM | Error | MORPHEUS.ABC | 1069 | Microsoft-Windows-FailoverClustering | Cluster resource ‘Virtual Machine GPDYNAMICS-TEST’ of type ‘Virtual Machine’ in clustered role ‘GPDYNAMICS-TEST’ failed. The error code was ‘0x5aa’ (‘Insufficient system resources exist to complete the requested service.’). Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet. |
9/11/2017 | 11:58:40 AM | Error | MORPHEUS.ABC | 1205 | Microsoft-Windows-FailoverClustering | The Cluster service failed to bring clustered role ‘GPDYNAMICS-TEST’ completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role. |
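The Hyper-V events in this report show HRESULT values (0x80070003, 0x800705AA), while the matching Failover Clustering events show short codes (0x3, 0x5aa). These are the same errors: an HRESULT with facility 7 (Win32) carries the Win32 error code in its low 16 bits. The sketch below shows the arithmetic; the function name decode_hresult is hypothetical.

```python
FACILITY_WIN32 = 7  # facility value for HRESULTs that wrap Win32 error codes


def decode_hresult(hresult):
    """Split a 32-bit HRESULT into (is_failure, facility, code)."""
    is_failure = bool((hresult >> 31) & 1)  # bit 31: severity (1 = failure)
    facility = (hresult >> 16) & 0x7FF      # bits 16-26: facility
    code = hresult & 0xFFFF                 # bits 0-15: error code
    return is_failure, facility, code


for value in (0x80070003, 0x800705AA):
    failed, facility, code = decode_hresult(value)
    wraps_win32 = facility == FACILITY_WIN32
    print(f"{value:#010x}: failure={failed} win32={wraps_win32} code={code:#x} ({code})")
```

The low 16 bits are 0x3 (Win32 error 3, path not found) and 0x5AA (Win32 error 1450, insufficient system resources), which matches the text of the corresponding cluster events.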
- Virtual Machine Queue (VMQ) allocation failed with the following error:
9/11/2017 | 3:56:44 PM | Error | MORPHEUS.ABC | 113 | Microsoft-Windows-Hyper-V-VmSwitch | Failed to allocate VMQ for NIC 43EABD46-7FEE-4444-AA88-9CC0D6B70A96–41B76058-E46B-48CD-A2E0-6A96370D7820 (Friendly Name: Network Adapter) on switch 67ABF766-9CAA-49D4-8F5E-09B4C8F5B3CB (Friendly Name: vSwitch1). Reason – Unknown. Status = {Operation Failed} The requested operation was unsuccessful. |
Application Events:
- Reviewed the application logs; no relevant events were found.
List of outdated drivers:
Time/Date String | Product Version | File Version | Company Name | File Description |
5/25/2016 8:01 | (7.13:65.105) | (7.13:65.105) | QLogic Corporation | QLogic 10 GigE VBD |
3/4/2016 21:22 | (10.0:11105.1001) | (6.603:6.0) | Avago Technologies | MEGASAS RAID Controller Driver for Windows |
5/16/2016 2:28 | (7.13:57.103) | (7.13:57.103) | QLogic Corporation | AMD64 BXND NDIS6.0 Driver |
3/4/2016 21:46 | (6.3:9600.16384) | (12.15:22.6) | Intel Corporation | Intel(R) Gigabit Adapter NDIS 6.x driver |
__________________________________________________________________
Conclusion:
Analysis of the logs indicates that the networking components are overutilized. This produces the error ‘Insufficient system resources exist to complete the requested service.’ (0x800705AA), and the virtual machine fails to complete the operation.
- Update the server BIOS using the following link if the installed version (2.4.3 on both hosts) is outdated: https://downloads.dell.com/FOLDER04490198M/1/BIOS_Y4Y95_WN64_2.5.5.EXE
- Update the network adapter firmware and drivers to the latest versions, and add more physical NICs if possible so that the network is not overwhelmed.
- Please contact the hardware vendor: according to the Windows Server Catalog, Compellent Storage Center 7.1 is not supported on Windows Server 2016. If a more recent update is available, please install it:
- Update the HBA