Deciphering Windows Failover Clustering : Advance

I have written this Article for those specialists group who have the basic idea on Failover Clustering and want to understand how really things work in background.

For this Article you have clear knowledge of Netmon, Procmon, Xperf and Cluster Logs

Some Basics you can get from my previous article:

  • Microsoft Failover Clustering consist of the below following Managers:

COMPONENT

FUNCTION

MESSAGING

[GEM] [MRR] [CAUSAL]

– messaging components used by cluster

– includes Good Enough Multicast [GEM], Multicast Request-Reply [MRR], and Causal Multicast [CAUSAL]

OBJECT MANAGER [OM]

– maintains database of all cluster objects in memory on each node in the cluster

HOST MANAGER [HM]

– listens for connections from cluster nodes on port 3343

– passes connections to Security Manager for encryption/signing, validation, and authorization

MEMBERSHIP MANAGER [MM] [JPM]

– runs multi-stage regroup algorithm [RGP]

– determines common view of the cluster

– maintains cluster node objects in Object Manager

– Join Policy Manager [JPM] decides the surviving cluster partition

GLOBAL UPDATE MANAGER [GUM]

– primary mechanism for propagating updates to cluster configuration to all nodes in the cluster view

– transfers cluster state to all joining nodes

– heavy user of cluster communication processes

RESOURCE CONTROL MANAGER [RCM]

– manages resources, groups of resources, resource type objects

– implements failover mechanisms and policies

– establishes and maintains resource dependency tree

TOPOLOGY MANAGER [TM] [ClNET] [IM] [NM]

– discovers and maintains the cluster network topology

– reports failures of any networks (Network Manager [NM]) or network interfaces (Interface Manager [IM])

– responsible for Microsoft Failover Cluster Virtual Adapter configuration (Cluster Network [ClNET])

DATABASE MANAGER [DM] [CLFS] [CM]

– manages the cluster database replicas (registry hive)

– places a replica of the configuration on the witness disk (if configured)

– uses the Common Log File System (CLFS) for updates

– uses Paxos tagging for versioning

– works with Checkpoint Manager [CM] via Resource Control Manager [RCM] for implementing registry and crypto checkpoints

QUORUM MANAGER [QUORUM]

– determines if the cluster view obtained from Membership Manager [MM] has achieved quorum

– tracks ownership of quorum replica located on a witness disk (if configured)

SECURITY MANAGER [SM] [SV]

– facilitates signing or encrypting of all messages exchanged among the nodes in a cluster

– works with Security Validator [SV] to determine SecurityLevel for cluster communications

– handles authentication and authorization functions for Host Manager [HM]

CORE [CORE]

– ‘backplane’ that everything plugs into…like a cluster ‘motherboard’ so to speak

CLUSTER SERVICE [CS]

– interface with Services Control Manager (SCM) for starting and stopping the cluster service

EVENT MANAGER [EM]

– propagates events from the cluster service components up the API (Application Programming Interface) layer

– events can include resource failures, network interface state changes, node state changes, etc…

RESOURCE [RES]

– log messages from resource DLLs

VERSIONING [VER]

– calculates the cluster version during form and join ensuring the nodes are compatible.

– immediately follows the Security Manager [SM] during the join process

NODE STATE MANAGER [NSM]

– tracks whether a node is ‘Paused’ or not

CLUSTER CONFIGURATION MANAGER [CCM]

– used to manage cluster components that do not have a dependency on the Database Manager [DM]. These would be components that are part of starting the Database Manager [DM] itself

REGROUP PROTOCOL [RGP]

– involves the ‘Regroup’ process when initiated by Membership Manager [MM]

RESOURCE HOST SUBSYSTEM [RHS]

– hosts all resources in the cluster

– in task manager it shows as rhs.exe

– multiple rhs.exe processes if resource is placed in its own resource monitor either by the cluster service or by a cluster administrator

NETFT EVENT MANAGER [NETFTEVM]

– processes network-related events from the NetFT driver and the network stack

– dispatches events to other cluster components in the cluster service…e.g. Membership Manager [MM]

  • As soon as we Run the New-Cluster Cmdlet we can see that the Cluster services are being Queries and Various Modules which are required for Cluster Service to run are being Loaded:

Process Name

Operation

Path

Result

Detail

services.exe

QuerySecurityFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

Information: Owner, Group, DACL, SACL, Label

services.exe

QueryNameInformationFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

Name: \Windows\Cluster\clussvc.exe

services.exe

QueryBasicInformationFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

CreationTime: 10/18/2018 10:41:19 PM, LastAccessTime: 10/18/2018 10:41:19 PM, LastWriteTime: 10/18/2018 10:41:19 PM, ChangeTime: 10/18/2018 10:41:27 PM, FileAttributes: A

services.exe

QuerySecurityFile

C:\Windows\Cluster\clussvc.exe

BUFFER OVERFLOW

Information: Owner

services.exe

QuerySecurityFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

Information: Owner

csrss.exe

QuerySecurityFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

Information: Owner, Group, DACL, SACL, Label

csrss.exe

QueryNameInformationFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

Name: \Windows\Cluster\clussvc.exe

csrss.exe

QueryBasicInformationFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

CreationTime: 10/18/2018 10:41:19 PM, LastAccessTime: 10/18/2018 10:41:19 PM, LastWriteTime: 10/18/2018 10:41:19 PM, ChangeTime: 10/18/2018 10:41:27 PM, FileAttributes: A

csrss.exe

QuerySecurityFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

Information: Owner, Group, DACL, SACL, Label

csrss.exe

QueryNameInformationFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

Name: \Windows\Cluster\clussvc.exe

  • Here we can see that no Modules are yet Loaded by this Service.

  • Once the basics Queries are completed, we can see that various DLLS started loading with the Clussvc for its Proper Functioning:

Time of Day

Process Name

PID

TID

Operation

Path

Result

Detail

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\Cluster\clussvc.exe

SUCCESS

Image Base: 0x7ff791bf0000, Image Size: 0x71b000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\ntdll.dll

SUCCESS

Image Base: 0x7fff54de0000, Image Size: 0x1a9000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\kernel32.dll

SUCCESS

Image Base: 0x7fff53000000, Image Size: 0x13a000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\KernelBase.dll

SUCCESS

Image Base: 0x7fff52410000, Image Size: 0x10f000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\msvcrt.dll

SUCCESS

Image Base: 0x7fff52bf0000, Image Size: 0xa7000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\sechost.dll

SUCCESS

Image Base: 0x7fff53140000, Image Size: 0x57000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\clusapi.dll

SUCCESS

Image Base: 0x7fff4ced0000, Image Size: 0x6b000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\crypt32.dll

SUCCESS

Image Base: 0x7fff52230000, Image Size: 0x1d7000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\dhcpcsvc.dll

SUCCESS

Image Base: 0x7fff4e660000, Image Size: 0x19000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\IPHLPAPI.DLL

SUCCESS

Image Base: 0x7fff4e790000, Image Size: 0x29000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\ntdsapi.dll

SUCCESS

Image Base: 0x7fff4fb00000, Image Size: 0x28000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\ole32.dll

SUCCESS

Image Base: 0x7fff52930000, Image Size: 0x175000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\oleaut32.dll

SUCCESS

Image Base: 0x7fff52870000, Image Size: 0xb7000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\resutils.dll

SUCCESS

Image Base: 0x7fff4cf40000, Image Size: 0x49000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\rpcrt4.dll

SUCCESS

Image Base: 0x7fff52ab0000, Image Size: 0x136000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\sspicli.dll

SUCCESS

Image Base: 0x7fff51f40000, Image Size: 0x2b000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\sscore.dll

SUCCESS

Image Base: 0x7fff4dac0000, Image Size: 0xe000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\user32.dll

SUCCESS

Image Base: 0x7fff533a0000, Image Size: 0x171000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\vssapi.dll

SUCCESS

Image Base: 0x7fff4d4e0000, Image Size: 0x180000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\wevtapi.dll

SUCCESS

Image Base: 0x7fff4fd30000, Image Size: 0x67000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\ws2_32.dll

SUCCESS

Image Base: 0x7fff53580000, Image Size: 0x58000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\nsi.dll

SUCCESS

Image Base: 0x7fff54d70000, Image Size: 0x9000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\winnsi.dll

SUCCESS

Image Base: 0x7fff4f5b0000, Image Size: 0xa000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\FirewallAPI.dll

SUCCESS

Image Base: 0x7fff51120000, Image Size: 0xb6000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\netapi32.dll

SUCCESS

Image Base: 0x7fff4fae0000, Image Size: 0x15000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\fltLib.dll

SUCCESS

Image Base: 0x7fff4bd80000, Image Size: 0xa000

47:33.1

clussvc.exe

2652

2952

Load Image

C:\Windows\System32\setupapi.dll

SUCCESS

Image Base: 0x7fff52690000, Image Size: 0x1d6000

  • This can be seen in Service Properties in Procmon as well:

Since the Cluster Service Started we can see that it started looking for its Parameters and the ClusDB, However since this is a New Cluster it was not able to find them:

Time of Day

Process Name

PID

TID

Operation

Path

Result

Detail

47:33.8

clussvc.exe

2652

2556

ReadFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

Offset: 812,032, Length: 32,768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal

47:33.8

clussvc.exe

2652

2556

ReadFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

Offset: 5,104,640, Length: 32,768, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal

47:33.8

clussvc.exe

2652

2556

RegCloseKey

HKLM\System\CurrentControlSet\Services\ClusSvc\Parameters

SUCCESS

 

47:33.8

clussvc.exe

2652

2556

CreateFile

C:\Windows\Cluster\CLUSDB.blf

NAME NOT FOUND

Desired Access: Read Attributes, Delete, Disposition: Open, Options: Non-Directory File, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a

47:33.8

clussvc.exe

2652

2556

CreateFile

C:\Windows\Cluster\CLUSDB.1.container

NAME NOT FOUND

Desired Access: Read Attributes, Delete, Disposition: Open, Options: Non-Directory File, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a

47:33.8

clussvc.exe

2652

2556

CreateFile

C:\Windows\Cluster\CLUSDB.2.container

NAME NOT FOUND

Desired Access: Read Attributes, Delete, Disposition: Open, Options: Non-Directory File, Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a

· When we have Initiated creating the Cluster and Specified the Nodes that we wanted to Add we can see the MSRPC and SMB Connection being established.

Time of Day

Process Name

Source

Destination

Protocol Name

Description

17:32.4

powershell.exe

2012R2Node2.ashu.com

dc.ashu.com

DRSR

 

17:32.4

powershell.exe

dc.ashu.com

2012R2Node2.ashu.com

DRSR

 

17:32.4

powershell.exe

2012R2Node2.ashu.com

dc.ashu.com

TCP

TCP:Flags=…A…., SrcPort=64694, DstPort=49667, PayloadLen=0, Seq=3285991931, Ack=3767158308, Win=4102 (scale factor 0x8) = 1050112

17:32.6

System

2012R2Node2.ashu.com

2012R2Node1.ashu.com

MSRPC

MSRPC:c/o Request: unknown Call=0x4 Opnum=0x11 Context=0x0 Hint=0x18

17:32.6

System

2012R2Node1.ashu.com

2012R2Node2.ashu.com

SMB2

SMB2:R WRITE (0x9), File=svcctl@#2342

17:32.6

System

2012R2Node2.ashu.com

2012R2Node1.ashu.com

SMB2

SMB2:C READ (0x8), FID=0x400000021 , 0x400 bytes from offset 0 (0x0)

17:32.6

System

2012R2Node1.ashu.com

2012R2Node2.ashu.com

MSRPC

MSRPC:c/o Response: unknown Call=0x4 Context=0x0 Hint=0x2C Cancels=0x0

· DCOM Connection Request and Response can be seen in the Netmon:

Time of Day

Process Name

Source

Destination

Protocol Name

Description

17:32.6

svchost.exe

2012R2Node2.ashu.com

2012R2Node1.ashu.com

DCOM

DCOM:RemoteCreateInstance Request

17:32.6

svchost.exe

2012R2Node1.ashu.com

2012R2Node2.ashu.com

DCOM

DCOM:RemoteCreateInstance Response

· Cluster Service Started on the Cluster Node from where we created the Cluster and then on the other Cluster Node:

INFO —————————–+ LOG BEGIN +—————————–
INFO [CS] Starting clussvc as a service
INFO [CS] cluster service logging level is 5

INFO [StartupConfig]: Initializing NetFT adapter management information.

· From the Procmon Logs we can see the Cluster Service started querying the Network Interface and gathering the Information:

Process Name

PID

TID

Operation

Path

Result

Detail

clussvc.exe

2652

2852

RegQueryValue

HKLM\System\CurrentControlSet\Control\Network\{4d36e974-e325-11ce-bfc1-08002be10318}\{B70D6460-3635-4D42-B866-B8AB1A24454C}\Ndi\FilterClass

SUCCESS

Type: REG_SZ, Length: 36, Data: ms_firewall_upper

clussvc.exe

2652

2852

RegQueryValue

HKLM\System\CurrentControlSet\Control\Network\{4d36e974-e325-11ce-bfc1-08002be10318}\{B70D6460-3635-4D42-B866-B8AB1A24454C}\DirectInstall

NAME NOT FOUND

Length: 144

clussvc.exe

2652

2852

CreateFile

C:\Windows\System32\drivers\ndiscap.sys

SUCCESS

Desired Access: Generic Read, Disposition: Open, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: n/a, ShareMode: Read, Delete, AllocationSize: n/a, OpenResult: Opened

clussvc.exe

2652

2852

QueryStandardInformationFile

C:\Windows\System32\drivers\en-US\ndiscap.sys.mui

SUCCESS

AllocationSize: 8,192, EndOfFile: 5,632, NumberOfLinks: 4, DeletePending: False, Directory: False

clussvc.exe

2652

2852

CreateFile

C:\Windows\System32\en-US\netcfgx.dll.mui

SUCCESS

Desired Access: Generic Read, Disposition: Open, Options: , Attributes: n/a, ShareMode: Read, Delete, AllocationSize: n/a, OpenResult: Opened

clussvc.exe

2652

2852

CreateFileMapping

C:\Windows\System32\en-US\netcfgx.dll.mui

SUCCESS

SyncType: SyncTypeOther

clussvc.exe

2652

2852

QueryDirectory

C:\Windows\Inf\netserv.inf

SUCCESS

Filter: netserv.inf, 1: netserv.inf

clussvc.exe

2652

2852

QueryDirectory

C:\Windows\Inf\mslbfoprovider.inf

SUCCESS

Filter: mslbfoprovider.inf, 1: mslbfoprovider.inf

clussvc.exe

2652

2852

CreateFile

C:\Windows\System32\DriverStore\en-US\MsLbfoProvider.inf_loc

SUCCESS

Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened

clussvc.exe

2652

2852

QueryBasicInformationFile

C:\Windows\System32\DriverStore\en-US\MsLbfoProvider.inf_loc

SUCCESS

CreationTime: 8/22/2013 9:44:33 PM, LastAccessTime: 8/22/2013 9:44:33 PM, LastWriteTime: 8/22/2013 9:44:33 PM, ChangeTime: 10/19/2018 10:49:08 AM, FileAttributes: A

· Netft Driver got Initialized:

INFO [NETFT] Initializing NetFT driver manager

· Core Initialization:

INFO [CORE] Initializing.

· Cluster Health Monitor Initializes: (Maybe)

INFO [CHM] Initializing

· Membership Manager Initializes:

INFO [MM] Initializing.

· Good Enough Multicast [GEM] initializes:

INFO [GEM] Initializing.

· Causal Multicast initializes:

INFO [Causal] Node 1: Initializing.

· Multicast Request-Reply [MRR]

INFO [MRR] Node 1: Initializing.

· EVENT MANAGER [EM] initializes:

INFO [EM] Initializing.

· Global Update Manager of Node 1 Initializes:

INFO [GUM] Node 1: Initializing.

· Just at the time of Initialization of the Database Manager we can see that the Cluster Service started creating the Registry Files for the Cluster Node:

Process Name

PID

TID

Operation

Path

Result

Detail

clussvc.exe

2652

2852

RegOpenKey

HKLM\CLUSTER

SUCCESS

Desired Access: All Access

clussvc.exe

2652

2852

RegSetValue

HKLM\CLUSTER\ClusterName

SUCCESS

Type: REG_SZ, Length: 18, Data: cluster1

clussvc.exe

2652

2852

ReadFile

C:\Windows\Cluster\clussvc.exe

SUCCESS

Offset: 2,675,712, Length: 4,096, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal

clussvc.exe

2652

2852

RegSetValue

HKLM\CLUSTER\ClusterInstanceID

SUCCESS

Type: REG_SZ, Length: 74, Data: db8b3e66-0dd8-43e0-b58d-c47db6033dd3

clussvc.exe

2652

2852

RegCreateKey

HKLM\CLUSTER\Nodes

SUCCESS

Desired Access: Read, Disposition: REG_CREATED_NEW_KEY

clussvc.exe

2652

2852

RegCloseKey

HKLM\CLUSTER\Nodes

SUCCESS

 

clussvc.exe

2652

2852

RegOpenKey

HKLM\CLUSTER\Nodes\2

SUCCESS

Desired Access: Read/Write

clussvc.exe

2652

2852

RegSetValue

HKLM\CLUSTER\Nodes\2\NodeName

SUCCESS

Type: REG_SZ, Length: 24, Data: 2012R2Node1

clussvc.exe

2652

2852

RegCloseKey

HKLM\CLUSTER\Nodes\2

SUCCESS

 

clussvc.exe

2652

2852

RegSetValue

HKLM\CLUSTER\Nodes\2\NodeWeight

SUCCESS

Type: REG_DWORD, Length: 4, Data: 1

clussvc.exe

2652

2852

RegSetValue

HKLM\CLUSTER\Nodes\2\TargetNodeWeight

SUCCESS

Type: REG_DWORD, Length: 4, Data: 1

clussvc.exe

2652

2852

RegSetValue

HKLM\CLUSTER\Nodes\2\NodeInstanceID

SUCCESS

Type: REG_SZ, Length: 74, Data: 00000000-0000-0000-0000-000000000002

clussvc.exe

2652

2852

RegOpenKey

HKLM\CLUSTER\Nodes\1

SUCCESS

Desired Access: Read/Write

clussvc.exe

2652

2852

RegSetValue

HKLM\CLUSTER\Nodes\1\NodeName

SUCCESS

Type: REG_SZ, Length: 24, Data: 2012R2Node2

clussvc.exe

2652

2852

RegCloseKey

HKLM\CLUSTER\Nodes\1

SUCCESS

 

clussvc.exe

2652

2852

RegSetValue

HKLM\CLUSTER\Nodes\1\NodeWeight

SUCCESS

Type: REG_DWORD, Length: 4, Data: 1

clussvc.exe

2652

2852

RegSetValue

HKLM\CLUSTER\Nodes\1\TargetNodeWeight

SUCCESS

Type: REG_DWORD, Length: 4, Data: 1

clussvc.exe

2652

2852

RegSetValue

HKLM\CLUSTER\Nodes\1\NodeInstanceID

SUCCESS

Type: REG_SZ, Length: 74, Data: 00000000-0000-0000-0000-000000000001

clussvc.exe

2652

2852

RegCreateKey

HKLM\CLUSTER\Resources

SUCCESS

Desired Access: Read, Disposition: REG_CREATED_NEW_KEY

clussvc.exe

2652

2852

RegCloseKey

HKLM\CLUSTER\Resources

SUCCESS

 

clussvc.exe

2652

2852

RegCreateKey

HKLM\CLUSTER\ResourceTypes

SUCCESS

Desired Access: Read, Disposition: REG_CREATED_NEW_KEY

clussvc.exe

2652

2852

RegCloseKey

HKLM\CLUSTER\ResourceTypes

SUCCESS

 

· Database Manager of Node 1 Initializes:

INFO [DM] Node 1: Initializing.

· Common Log File System (CLFS) initializes:

INFO [CLFS] Initializing CLFS logger.
DBG [CLFS] Largest written record has size 131072 bytes
DBG [CLFS] Deleting old marshalling area.
DBG [CLFS] Marshaling area resized

· TOPOLOGY MANAGER Initializes:

INFO [TM] Initializing.

· VERSIONING [VER] Initializes:

INFO [VER]: Initializing.

· CLUSTER CONFIGURATION MANAGER Initializes:

INFO [CCM] Initializing.

· FTI – Fault-Tolerant Interface initializes:

INFO [FTI] Initializing.

· REM Initializes:

INFO [ReM] Initializing.

· Host Manager Initializes:

INFO [HM]: Initializing.

· Interface Manager [IM] Initializing:

INFO [IM] Initializing.

· Network Manager [NM] Initializing:

INFO [NM] Initializing.

·

· NODE STATE MANAGER [NSM] Initializing:

INFO [NSM] Initializing.

· Checkpoint Manager [CM] Initializing:

INFO [CM] Initializing.

· RESOURCE CONTROL MANAGER [RCM] Initializing:

INFO [RCM] Initializing.

· QUORUM MANAGER [QUORUM] Initializing:

INFO [QUORUM] Node 1: Initializing.

· StartupConfig Component is initiated on Cluster to collect the basic information of the Cluster and save it in the Local Cluster Database.

INFO [STARTUP] ClusterNode 1: startup config
INFO <StartupConfig localNodeId=’1′ testMode=’false’ newCluster=’true’
INFO <ClusterName> 2012R2 </ClusterName>
INFO <PaxosTag> 0:0:0 </PaxosTag>
INFO <ConfiguredNodes> (1 2) </ConfiguredNodes>
INFO <NodeWeights> (1 2) </NodeWeights>
INFO <Nodes>
INFO <Node name=’2012R2Node1′ id=’1’/>
INFO <Node name=’2012R2Node2′ id=’2’/>
INFO </Nodes>
INFO <LocalHiveFile> C:\windows\Cluster\CLUSDB </LocalHiveFile>
INFO <LocalHiveKey> Cluster </LocalHiveKey>
INFO <ParametersKey> System\\CurrentControlSet\\Services\ClusSvc\Parameters </ParametersKey>
INFO <Virtual IPv6> fe80::ad5f:7a11:f49e:353d%23:~3343~ </Virtual IPv6>
INFO <Virtual IPv4> 169.254.1.2:~3343~ </Virtual IPv4>
INFO <NoVersionCheck> false </NoVersionCheck>
INFO <Cookie> 708991b2-b4d7-47bd-90b5-f003a977d1e4 </Cookie>
INFO <ClusterGuid> bc4692a3-e320-48b3-ab2f-45d722fda459 </ClusterGuid>
INFO <RetainNetFTConnSecurity> false </RetainNetFTConnSecurity>
INFO </StartupConfig>

· CNO Creation Phase is Complete

INFO [CS] ClusterNode object creation completed.

· Cluster Node got started as the Cluster service started:

INFO [CS] Starting cluster node
INFO [CS] Cluster Service started

· Database Manager finally started to create its first ClusDB File:

INFO [CS] Starting DM

· DM Started Reading Quorum Config:

INFO [DM] Node 1: Reading quorum config
INFO [DM] Node 1: new cluster. generating new hive

· It tried to look for ClusDB file but it failed with Error 2 (File Not Found) :: Because there is no Existing ClusBD Since it’s a New Setup and a New File needs to be Created:

DBG [DM] Delete hive C:\windows\Cluster\CLUSDB failed: (2)

· CLFS Logger got initialize to start the Logging Process:

INFO [CLFS] Initializing CLFS logger.

DBG [DM] Creating Hive, Key Cluster, FilePath C:\windows\Cluster\CLUSDB

INFO [CS] Starting QM
INFO [CS] Starting MM

INFO [DM] Hive::Load Key:Cluster FilePath:C:\windows\Cluster\CLUSDB ignoreDiscardErrors:false LoadExisting:false
INFO [DM] Loading Hive, Key Cluster, FilePath C:\windows\Cluster\CLUSDB

Total available size: 10485760
Current available size: 10353152
Total reservation size: 132608
Total containers: 2

· Quorum create two Unique ID for Node 1 and Node 2.

INFO [QUORUM] unique node id for new node 1 is 00000000-0000-0000-0000-000000000001
INFO [QUORUM] unique node id for new node 2 is 00000000-0000-0000-0000-000000000002

·

· Here we can see that the Database Manager started creating Entries on the Cluster Database, If you will see below all the entries are created one by one checked the information that is gathered by the cluster service.

DBG [DM] Received setvalue for AdminAccessPoint in key
DBG [DM] Received setvalue for MaxQuorumLogSize in key Quorum
DBG [DM] Received setvalue for RecentEventsResetTime in key

· StartupConfig data is populated again by the [DM]:

INFO [DM] Populating StartupConfig data from local hive.
INFO [DM] Node 1: startup configuration loaded:
INFO <StartupConfig localNodeId=’1′ testMode=’false’ newCluster=’true’
INFO <ClusterName> 2012R2 </ClusterName>
INFO <PaxosTag> 0:0:1 </PaxosTag>
INFO <ConfiguredNodes> (1 2) </ConfiguredNodes>
INFO <NodeWeights> (1 2) </NodeWeights>
INFO <Nodes>
INFO <Node name=’2012R2Node1′ id=’1’/>
INFO <Node name=’2012R2Node2′ id=’2’/>
INFO </Nodes>
INFO <LocalHiveFile> C:\windows\Cluster\CLUSDB </LocalHiveFile>
INFO <LocalHiveKey> Cluster </LocalHiveKey>
INFO <ParametersKey> System\\CurrentControlSet\\Services\ClusSvc\Parameters </ParametersKey>
INFO <Virtual IPv6> fe80::ad5f:7a11:f49e:353d%23:~3343~ </Virtual IPv6>
INFO <Virtual IPv4> 169.254.1.2:~3343~ </Virtual IPv4>
INFO <NoVersionCheck> false </NoVersionCheck>
INFO <Cookie> 708991b2-b4d7-47bd-90b5-f003a977d1e4 </Cookie>
INFO <ClusterGuid> bc4692a3-e320-48b3-ab2f-45d722fda459 </ClusterGuid>
INFO <RetainNetFTConnSecurity> false </RetainNetFTConnSecurity>
INFO </StartupConfig>

· Topology Manager, Interface Manager and Network Manager is now started to manager the Networking Aspects of the Cluster Service:

INFO [CS] Starting TM

INFO [CS] Starting IM
INFO [CS] Starting NM

· Network Manager received the Request from the Network Address Associated with Cluster Node1

DBG [NM] Received request from client address fe80::78ea:6ad:1b88:fc51.

DBG [API] Authenticated client–Client: GGN\Administrator Interface: b97db8b2-4c63-11cf-bff6-08002be23f2f Server: node1$@GGN.COM Level: RPC_C_AUTHN_LEVEL_PKT_PRIVACY Service: RPC_C_AUTHN_WINNT Protocol Sequence: ncacn_ip_tcp Client Address: fe80::78ea:6ad:1b88:fc51 Network Option: .

· Cluster Service Started Quorum and Membership Manager:

INFO [CS] Starting QM
INFO [CS] Starting MM

· Membership Manager got Initiated and Initiated a [RGP] Regroup Process:

INFO [MM] Starting.
INFO [RGP] node 1: emit view 001(1)

· StartupConfig information is now being added to the Object Manager which holds the Database of the Cluster in the Memory:

INFO [StartupConfig] Writing new StartupConfig entries to ObjectManager
DBG [StartupConfig] Writing node 2012R2Node2 (id = 2) to ObjectManager
INFO [QUORUM] unique node id for new node 2 is 00000000-0000-0000-0000-000000000002

· Object Manager Information:

INFO <ObjectManager>
INFO <ROOT id=’.root’ name=’.root’>
INFO <CHMA id=’ConnectivityMonitorAgent’ name=’ConnectivityMonitorAgent’/>
INFO <DEAD id=’.Dead’ name=’.Dead’>
INFO <NODE id=’2′ name=’2012R2Node2’/>
INFO </DEAD>
INFO <LIVE id=’.Live’ name=’.Live’/>
INFO <JOIN id=’.Joiners’ name=’.Joiners’/>
INFO <MBRA id=’MembershipAgent’ name=’MembershipAgent’/>
INFO <GEMA id=’GemAgent’ name=’GemAgent1’/>
INFO <CAUS id=’CausalAgent’ name=’CausalAgent’/>
INFO <MRRA id=’MrrAgent1′ name=’MrrAgent1’/>
INFO <EVTM id=’EventManager’ name=’EventManager’/>
INFO <GUMA id=’GumAgent1′ name=’GumAgent1’/>
INFO <JPMA id=’JpmAgent’ name=’JpmAgent’/>
INFO <IMSC id='[IM-1] ‘ name='[IM-1] ‘/>
INFO <NMAG id='[NM-1] ‘ name='[NM-1] ‘/>
INFO <STML id=’STM’ name=’STM’/>
INFO <RPCS id='[RPC-1] ‘ name='[RPC-1] ‘/>
INFO <RCMA id=’RcmAgent’ name=’RcmAgent’/>
INFO <QRMA id=’QuorumAgent1′ name=’QuorumAgent1’/>
INFO <SNAP id=’CsvSnapshotAgent’ name=’CsvSnapshotAgent’/>
INFO <WRTA id=’VssWriterAgent’ name=’VssWriterAgent’/>
INFO <DCMA id=’DcmAgent’ name=’DcmAgent’/>
INFO <VSSR id=’Cluster VSS Writer’ name=’Cluster VSS Writer’/>
INFO <CLUS id=’2012R2′ name=’2012R2’/>
INFO <NMRR id='[NetworkMRR-1] ‘ name='[NetworkMRR-1] ‘/>
INFO <TP2P id='[RouteUnicast-1] ‘ name='[RouteUnicast-1] ‘/>
INFO </ROOT>
INFO </ObjectManager>

· Membership Manager Knows That there is another Node which is part of the Cluster so a New Node Object is created.

INFO [MM] Node 1: got a network stream to 1. Creating new NodeObject
INFO [QUORUM] unique node id for new node 1 is 00000000-0000-0000-0000-000000000001

· Cluster Service Started the GEMV2.

INFO [CS] Starting GEMV2
INFO [CS] Starting HM Server

  • At this point we only have one Node which wants to join the cluster as the Proposed View is 001(1) Node ID of the Joiner is 1:

INFO [GEM] Node 1: Pause (proposedView = 001(1), currentView = 001(1), joiners = (), InSteadyState = true)
INFO [GEM] Node 1: EnterRepairStage1: Gem agent for node 1

INFO [GEM] Node 1: GEM is green
INFO [GEM] Node 1: Emitting Gem View 001(1)
INFO [CORE] Node 1: New View is <ViewChanged joiners=(1) downers=() newView=001(1) oldView=000() joiner=true form=true/> (Start Dispatch)

  • Join Policy Manager is Initiated as we now have to start the Phase to Add another Node to the Cluster.

INFO [NODE] Node 1: New join with n2: stage: ‘Listen for Connection’
INFO [CS] Starting JPM (attempting connections to other cluster nodes)
INFO [QUORUM] Node 1: I am the form/join coordinator (offline and lowest number node). Launching worker to request quorum config from other nodes

  • JPN Initiated Building Contact between the Two Nodes:

DBG [JPM] Node 1: Dead node mentioned 2012R2Node1
DBG [JPM] Node 1: Dead node mentioned 2012R2Node2
DBG [JPM] Node 1: Building contact data for node 2012R2Node2.

  • DNS Query was initiated from the Server to get the Basic Name/IP information:

INFO [JPM] Node 1: Scheduling DNS query for node name 2012R2Node2
DBG [JPM] Node 1: no valid contact endpoints yet for node 2012R2Node2.

  •  We were able to find the IP Address with IP Address as Node 2:

INFO [JPM] Node 1: Found possible contact address 192.168.2.61:~0~ for node 2012R2Node2 via DNS.

  • Watchdog Timer was Initialized and once the Timer was Initialized the Cluster Service Started Preparing for the Heartbeat Operations and IsAlive, LooksAlive Tests:

INFO [CS] Starting the Watchdog Monitor
INFO [WM] Monitoring the kernel-mode watchdog.
DBG [RCM] Clussvc heartbeat Timeout=60 seconds Action=BugCheck Node.

  • Heartbeat and the Service Alive was initialized:

DBG [NETFT] Enabling heartbeats

INFO [CORE] Updating user mode deadlock recovery values. Action: UserModeActionType::TerminateProcess, Timeout: 60.

  •  Cluster Service Reported to Service Control Manager that the Service has Started:

INFO [CS] Reporting to SCM that cluster service has started.

  •  New Routs are Established between the Cluster Node 1 and Node 2

 427 00000b50.00000bc4::2018/10/18-19:17:38.723 DBG [HM] Trying to connect to 2012R2Node2 at 10.10.10.24:~3343~
428 00000b50.00000810::2018/10/18-19:17:38.723 INFO [CONNECT] 10.10.10.24:~3343~: Established connection to remote endpoint 10.10.10.24:~3343~.

429 00000b50.00000810::2018/10/18-19:17:38.723 INFO [NODE] Node 2: New join with n1: stage: ‘Attempt Initial Connection’
430 00000b50.00000810::2018/10/18-19:17:38.723 INFO [SV] New real route: local (10.10.10.23:~56450~) to remote 2012R2Node2 (10.10.10.24:~3343~).
431 00000b50.00000810::2018/10/18-19:17:38.723 INFO [SV] Got a new outgoing stream to 2012R2Node2 at 10.10.10.24:~3343~
432 00000b50.00000810::2018/10/18-19:17:38.723 DBG [SM] Joiner: Initialized with SPN = 2012R2Node2, RequiredCtxAttrib = 0, HandShakeTimeout = 40000
433 00000b50.000002bc::2018/10/18-19:17:38.723 DBG [SM] Handling auth handshake posted by thread id 2064
434 00000b50.000002bc::2018/10/18-19:17:38.723 DBG [SM] Joiner: ISC returned status = 590610 output Blob size 1533, service principal name HOST/2012R2Node2, auth type MSG_AUTH_PACKAGE::KerberosAuth

  • These Network Queries can be seen in the Netmon Traces as well:

Time of Day

Process Name

Source

Destination

Protocol Name

Description

17:38.7

clussvc.exe

2012R2Node1.ashu.com

2012R2Node2.ashu.com

TCP

TCP:Flags=CE….S., SrcPort=56450, DstPort=3343, PayloadLen=0, Seq=813282654, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192

17:38.7

clussvc.exe

2012R2Node2.ashu.com

2012R2Node1.ashu.com

TCP

TCP:Flags=.E.A..S., SrcPort=3343, DstPort=56450, PayloadLen=0, Seq=1155349463, Ack=813282655, Win=8192 ( Negotiated scale factor 0x8 ) = 2097152

17:38.7

clussvc.exe

2012R2Node1.ashu.com

2012R2Node2.ashu.com

TCP

TCP:Flags=…A…., SrcPort=56450, DstPort=3343, PayloadLen=0, Seq=813282655, Ack=1155349464, Win=4106 (scale factor 0x8) = 1051136

17:38.8

clussvc.exe

2012R2Node1.ashu.com

2012R2Node2.ashu.com

TCP

TCP:Flags=…AP…, SrcPort=56450, DstPort=3343, PayloadLen=24, Seq=813282655 – 813282679, Ack=1155349464, Win=4106 (scale factor 0x8) = 1051136

17:38.8

clussvc.exe

2012R2Node1.ashu.com

2012R2Node2.ashu.com

TCP

TCP:Flags=…A…., SrcPort=56450, DstPort=3343, PayloadLen=1460, Seq=813282679 – 813284139, Ack=1155349464, Win=4106 (scale factor 0x8) = 1051136

  • Fault Tolerance Interface was initialize to find the Routes and connections which can be used to Route Cluster Traffic:

463 00000b50.00000810::2018/10/18-19:17:38.786 INFO [FTI] Got new raw TCP/IP connection.
464 00000b50.00000810::2018/10/18-19:17:38.786 INFO [NODE] Node 2: New join with n1: stage: ‘Exchange Node Information’
465 00000b50.00000810::2018/10/18-19:17:38.786 INFO [FTI][Follower] This node (2) is not the initiator
466 00000b50.00000810::2018/10/18-19:17:38.786 DBG [FTI] Stream already exists to node 1: false
467 00000b50.00000810::2018/10/18-19:17:38.817 INFO [FTI][Follower] Trying to select best endpoints among 169.254.2.15:~3343~, fe80::dc92:4c9d:19f3:af1d:~3343~ (first pair) and 169.254.1.32:~3343~, fe80::2105:4fc6:807c:c8c:~3343~ (second pair)
468 00000b50.00000810::2018/10/18-19:17:38.817 INFO [NODE] Node 2: New join with n1: stage: ‘Verify Network Role’

  • Below is the Route Information:

470 00000b50.00000810::2018/10/18-19:17:38.833 INFO [NETFT] Added route <struct mscs::FaultTolerantRoute>
471 00000b50.00000810::2018/10/18-19:17:38.833 INFO <realLocal>10.10.10.23:~3343~</realLocal>
472 00000b50.00000810::2018/10/18-19:17:38.833 INFO <realRemote>10.10.10.24:~3343~</realRemote>
473 00000b50.00000810::2018/10/18-19:17:38.833 INFO <virtualLocal>fe80::dc92:4c9d:19f3:af1d:~3343~</virtualLocal>
474 00000b50.00000810::2018/10/18-19:17:38.833 INFO <virtualRemote>fe80::2105:4fc6:807c:c8c:~3343~</virtualRemote>
475 00000b50.00000810::2018/10/18-19:17:38.833 INFO <Delay>1000</Delay>
476 00000b50.00000810::2018/10/18-19:17:38.833 INFO <Threshold>5</Threshold>
477 00000b50.00000810::2018/10/18-19:17:38.833 INFO <Priority>99</Priority>
478 00000b50.00000810::2018/10/18-19:17:38.833 INFO <Attributes>1</Attributes>
479 00000b50.00000810::2018/10/18-19:17:38.833 INFO </struct mscs::FaultTolerantRoute>

  • FTI is not waiting for the Route to Come online.

481 00000b50.00000810::2018/10/18-19:17:38.833 INFO [FTI][Follower] waiting for route to node 2012R2Node2 on virtual IP fe80::2105:4fc6:807c:c8c:~3343~ to come up

  • Here the First Traffic Between Node 1 and Node 2 on Port 3343 is initiated:

Time of Day

Process Name

Source

Destination

Protocol Name

Description

17:38.9

System

2012R2Node1.ashu.com

2012R2Node2.ashu.com

UDP

UDP:SrcPort = 3343, DstPort = 3343, Length = 112

  • NETFT EVENT MANAGER Notifying to other Managers that the Connection is reachable:

483 00000b50.00000860::2018/10/18-19:17:39.005 DBG [NETFTEVM] FTI NetFT event handler got event: Remote endpoint 10.10.10.24:~3343~ reachable from 10.10.10.23:~3343~
484 00000b50.00000860::2018/10/18-19:17:39.005 DBG [NETFTEVM] TM NetFT event handler got event: Remote endpoint 10.10.10.24:~3343~ reachable from 10.10.10.23:~3343~
486 00000b50.00000940::2018/10/18-19:17:39.005 DBG [NETFTEVM] FTI NetFT event dispatcher pushing event: Remote endpoint 10.10.10.24:~3343~ reachable from 10.10.10.23:~3343~
489 00000b50.000005dc::2018/10/18-19:17:39.005 DBG [NETFTEVM] TM NetFT event dispatcher pushing event: Remote endpoint 10.10.10.24:~3343~ reachable from 10.10.10.23:~3343~

  • Inbound Request for the Heartbeat is received and Accepted by the Node1:

491 00000b50.00000810::2018/10/18-19:17:39.005 INFO [NODE] Node 2: New join with n1: stage: ‘Wait for Heartbeats on Initial NetFT Route’
492 00000b50.00000810::2018/10/18-19:17:39.005 DBG [FTW] NetFT address fe80::dc92:4c9d:19f3:af1d:~3343~ is ready.
493 00000b50.00000810::2018/10/18-19:17:39.005 INFO [FTW] NetFT is ready after 0 msecs wait.
494 00000b50.00000810::2018/10/18-19:17:39.005 INFO [NODE] Node 2: New join with n1: stage: ‘Wait for NetFT Duplicate Address Detection’
495 00000b50.00000bd8::2018/10/18-19:17:39.005 INFO [ACCEPT] :::~3343~: Accepted inbound connection from remote endpoint fe80::2105:4fc6:807c:c8c%24:~64723~.

  • We were able to Establish the TCP Connection Between the Nodes and the Version Verification is initiated:

427 00000b50.00000bc4::2018/10/18-19:17:38.723 DBG [HM] Trying to connect to 2012R2Node2 at 10.10.10.24:~3343~
428 00000b50.00000810::2018/10/18-19:17:38.723 INFO [CONNECT] 10.10.10.24:~3343~: Established connection to remote endpoint 10.10.10.24:~3343~.

  • Security Validator is used to Determine the Security Level of Cluster Communication:

INFO [NODE] Node 1: New join with n2: stage: ‘Establish Connection over NetFT Adapter’

INFO [SV] Route local (fe80::dc92:4c9d:19f3:af1d%24:~3343~) to remote (fe80::2105:4fc6:807c:c8c%24:~64723~) exists. Forwarding to alternate path.
DBG [SM] Handling auth handshake posted by thread id 2064
INFO [SV] Route local (fe80::ad5f:7a11:f49e:353d%23:~49727~) to remote 2012R2Node2 (fe80::9d75:9382:2b68:61dc%23:~3343~) exists. Forwarding to alternate path.
INFO [SV] Authentication and authorization were successful

DBG [SV] Incoming (second) connection from 2012R2Node2 is secure

  • Node 2 Initiated the Incoming Connection and got the below Events:

506 00000b50.00000810::2018/10/18-19:17:39.005 DBG [SV] Incoming (second) connection from 2012R2Node2 is secure

507 00000b50.00000810::2018/10/18-19:17:39.005 INFO [ReM] Got stream info from fe80::dc92:4c9d:19f3:af1d%24:~3343~ to fe80::2105:4fc6:807c:c8c%24:~64723~.
508 00000b50.00000810::2018/10/18-19:17:39.005 DBG [ReM] Exchanging local info.
509 00000b50.00000810::2018/10/18-19:17:39.005 DBG [ReM] Sending local info.
510 00000b50.00000810::2018/10/18-19:17:39.005 DBG [ReM] Local info sent, receiving remote info.
511 00000b50.00000810::2018/10/18-19:17:39.005 DBG [ReM] Remote info received from 1:2012R2Node2.
514 00000b50.00000810::2018/10/18-19:17:39.036 DBG [ReM][Follower] I am the follower with n1.
515 00000b50.00000810::2018/10/18-19:17:39.085 INFO [ReM][Follower] Got remote data from n1, epoch: 0, sn: 0, Fault Tolerant Session Id: 00000000-0000-0000-0000-000000000000
516 00000b50.00000810::2018/10/18-19:17:39.085 DBG [NODE] Node 2: To n1 getting epoch (currently 0)
517 00000b50.00000810::2018/10/18-19:17:39.085 DBG [ReM][Follower] Current state with n1, epoch: 0, sn: 0
518 00000b50.00000810::2018/10/18-19:17:39.085 DBG [ReM][Follower] Successfully sent current state to 1.
519 00000b50.00000810::2018/10/18-19:17:39.179 INFO [ReM][Follower] Got direction from 1. Epoch is now 1, will resume from SN 0. Fault Tolerant Session ID is f1448006-e375-4f9a-aca5-5c2e1c7c8093
520 00000b50.00000810::2018/10/18-19:17:39.179 INFO [ReM] Sending connection down normal path.

  •  NetFT Route needs to be Updated:

INFO [NODE] Node 1: New join with n2: stage: ‘Update NetFT Route’

  •  Join Policy Manager Updated this Information:

INFO [JPM] Received a new stream from 2012R2Node2
INFO [NODE] Node 1: New join with n2: stage: ‘Send Current Membership Status for Join Policy’

  • Membership Manager added the Node to the Stream and will now initiated the Regroup Process:

INFO [MM] Node 1: Adding a stream to existing node 2

INFO [RGP] node 1: Node Connected 2 00000000000000000000000000000000000000000000000000000000000000110

INFO [RGP] node 1: Tick
INFO [RGP] node 1: sending out invites

INFO [CORE] Node 1: Proposed View is <ViewChanged joiners=(2) downers=() newView=201(1 2) oldView=001(1) joiner=false form=false/>

INFO [CHM] Node hasn’t received route weights from all the nodes. Best subset will be computed with partial route weights information
INFO [GEM] Node 1: Pause (proposedView = 201(1 2), currentView = 001(1), joiners = (2), InSteadyState = true)

  • Routes are added with the help of NETFTEVM:

569 00000b50.000005dc::2018/10/18-19:17:39.996 INFO [IM] got event: Local endpoint fe80::dc92:4c9d:19f3:af1d:~0~ added


570 00000b50.00000bc4::2018/10/18-19:17:39.996 DBG [NETFTAPI] Signaled NetftLocalConnect event for fe80::dc92:4c9d:19f3:af1d
571 00000b50.00000bc4::2018/10/18-19:17:39.996 DBG [NETFTEVM] FTI NetFT event handler got event: Local endpoint fe80::dc92:4c9d:19f3:af1d:~0~ connected
572 00000b50.00000bc4::2018/10/18-19:17:39.996 DBG [NETFTEVM] TM NetFT event handler got event: Local endpoint fe80::dc92:4c9d:19f3:af1d:~0~ connected
573 00000b50.00000bc4::2018/10/18-19:17:39.996 DBG [WM] Filtering event NETFT_LOCAL_CONNECT? 1
574 00000b50.00000940::2018/10/18-19:17:39.996 DBG [NETFTEVM] FTI NetFT event dispatcher pushing event: Local endpoint fe80::dc92:4c9d:19f3:af1d:~0~ connected
575 00000b50.000005dc::2018/10/18-19:17:39.996 DBG [NETFTEVM] TM NetFT event dispatcher pushing event: Local endpoint fe80::dc92:4c9d:19f3:af1d:~0~ connected

576 00000b50.000005dc::2018/10/18-19:17:39.996 INFO [IM] got event: Local endpoint fe80::dc92:4c9d:19f3:af1d:~0~ connected

  • All Known Nodes are part of the Quorum View:

DBG [JPM] All known connected nodes are in my quorum view.

  • CORE started Calling difference Managers:

INFO [CORE] Node 1: Calling form for Global Update Manager

INFO [CORE] Node 1: Calling form for Causal Multicast
INFO [CORE] Node 1: Calling form for QuorumManagerFast
INFO [CORE] Node 1: Calling form for RcmAgentFast
INFO [CORE] Node 1: Form. Preparing State to transfer
INFO [CORE] Node 1: Extracting state for Global Update Manager
INFO [CORE] Node 1: Extracting state for Causal Multicast
INFO [CORE] Node 1: Extracting state for QuorumManagerFast
INFO [CORE] Node 1: Extracting state for RcmAgentFast
INFO [CORE] Node 1: Calling form for Database Manager
INFO [DM] Node 1: Loading DM hive
INFO [CORE] Node 1: Calling form for Cluster Config Manager

  • Cluster Configuration Manager Called the JOIN, after which Node Specific Managers Initializing:

993 00000b50.00000ae8::2018/10/18-19:17:41.855 INFO [CORE] Node 2: Installing state for Cluster Config Manager
994 00000b50.00000ae8::2018/10/18-19:17:41.855 DBG [CCM] Join.
995 00000b50.00000ae8::2018/10/18-19:17:41.855 INFO [CORE] Node 2: Installing state for Checkpoint Manager
996 00000b50.00000ae8::2018/10/18-19:17:41.855 INFO [CORE] Node 2: Installing state for Topology Manager

1012 00000b50.00000ae8::2018/10/18-19:17:41.855 INFO [CORE] Node 2: Installing state for Node State Manager
1013 00000b50.00000ae8::2018/10/18-19:17:41.855 DBG [NSM] Join.
1014 00000b50.00000ae8::2018/10/18-19:17:41.855 INFO [CORE] Node 2: Installing state for DiskControlManagerBeforeRcm
1015 00000b50.00000ae8::2018/10/18-19:17:41.855 INFO [CORE] Node 2: Installing state for QuorumManagerBeforeRcm
1016 00000b50.00000ae8::2018/10/18-19:17:41.855 INFO [CORE] Node 2: Installing state for Resource Control Manager

  • Cluster Reinitialized the Cluster Operations and keep on Installing the Node Specific Managers:

1020 00000b50.00000ae8::2018/10/18-19:17:41.855 DBG [RCM] orphan group handlers: requested=0, started=0, finished=0
1021 00000b50.00000ae8::2018/10/18-19:17:41.855 DBG [RCM] rcm::RcmAgent::GetShutdownTimeout: [Shutdown] Total shutdown timeout:1200000
1022 00000b50.00000ae8::2018/10/18-19:17:41.855 DBG [RCM] rcm::RcmAgent::DrainOnShutdown(1200000)
1023 00000b50.00000ae8::2018/10/18-19:17:41.855 INFO [RCM] rcm::RcmAgent::DrainOnShutdown: Time elapsed 0
1024 00000b50.00000ae8::2018/10/18-19:17:41.855 DBG [RCM] rcm::GroupIntelligenceMgr::Close(true)
1025 00000b50.00000ae8::2018/10/18-19:17:41.871 ERR [RCM] rcm::DrainMgr::Cleanup: [DrainMgr] cleanup called without being initialized.
1026 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] rcm::RcmAgent::Unload: [Shutdown] TimeElapsed:16 before calling ShutdownAllGroups
1027 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] rcm::RcmGroup::ShutdownAllGroups()
1028 00000b50.00000ae8::2018/10/18-19:17:41.871 INFO [RCM] rcm::RcmGroup::ShutdownAllGroups: [Shutdown] Rcm shutting down due to reason:ServiceStopReason::Other. ShutdownTimeout:1199984
1029 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] rcm::RcmGroup::ShutdownAllGroups: [Shutdown] shutting down non-infrastructure groups
1030 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] rcm::RcmGroup::ShutdownAllGroups: [Shutdown] Shutting down csv groups
1031 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] rcm::RcmGroup::ShutdownAllGroups: [Shutdown] Shutting down pool and core cluster groups
1032 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] rcm::RcmGroup::ShutdownAllGroups: [Shutdown] Shutting down quorum group
1033 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] Disabling API calls
1034 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [NETFT] Disabling heartbeats
1035 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [NETFT] Disabling heartbeats

1036 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [NETFT] Enabling heartbeats

1037 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] set netft heartbeat interval to 900 seconds
1038 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] Enabled API calls
1039 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] rcm::RcmResType::InitializeFromDb()
1040 00000b50.00000ae8::2018/10/18-19:17:41.871 DBG [RCM] rcm::RcmGroup::InitializeFromDb()
1041 00000b50.00000ae8::2018/10/18-19:17:41.886 DBG [RCM] rcm::RcmDependency::InitializeFromDb()
1042 00000b50.00000ae8::2018/10/18-19:17:41.886 INFO [API] Online read only
1043 00000b50.00000ae8::2018/10/18-19:17:41.886 INFO [CORE] Node 2: Installing state for Event Manager
1044 00000b50.00000ae8::2018/10/18-19:17:41.886 INFO [CORE] Node 2: Installing state for Quorum Manager
1045 00000b50.00000ae8::2018/10/18-19:17:41.886 INFO [QUORUM] Node 2: Target Node Weights are changed from () to (1 2)
1046 00000b50.00000ae8::2018/10/18-19:17:41.902 INFO [CORE] Node 2: Installing state for Interface Manager
1047 00000b50.00000ae8::2018/10/18-19:17:41.902 INFO [CORE] Node 2: Installing state for Network Manager
1048 00000b50.00000ae8::2018/10/18-19:17:41.902 INFO [CORE] Node 2: Installing state for Disk Control Manager
1049 00000b50.00000ae8::2018/10/18-19:17:41.902 DBG [DCM] Join, GlobalSequenceNumber: 2

  •  Cluster Parameters are being reviewed by the RCM:

1096 00000b50.00000ae8::2018/10/18-19:17:42.199 DBG [RCM] Clussvc heartbeat Timeout=60 seconds Action=BugCheck Node.
1097 00000b50.00000ae8::2018/10/18-19:17:42.199 DBG [NETFT] Enabling heartbeats
1098 00000b50.00000ae8::2018/10/18-19:17:42.199 INFO [CORE] Updating user mode deadlock recovery values. Action: UserModeActionType::TerminateProcess, Timeout: 60.
1099 00000b50.00000ae8::2018/10/18-19:17:42.199 INFO [CS] cluster service logging level is 3
1100 00000b50.00000ae8::2018/10/18-19:17:42.199 INFO [CS] GEM message buffer length is 50.
1101 00000b50.00000ae8::2018/10/18-19:17:42.199 INFO [QUORUM] Node 2: Dynamic Quorum set to true
1102 00000b50.00000ae8::2018/10/18-19:17:42.199 INFO [QUORUM] Node 2: Lower priority node is set to 0
1104 00000b50.00000ae8::2018/10/18-19:17:42.199 INFO [NETFT] Route history length is now: 10
1105 00000b50.00000ae8::2018/10/18-19:17:42.199 INFO [CS] Setting route history length to 10.

  • Cluster Join was Successful:

1108 00000b50.00000810::2018/10/18-19:17:42.214 INFO [NODE] Node 2: New join with n1: stage: ‘Join Succeeded’

  • Since the Cluster Creation was successful we started by created Resource Types and here our First RHS Was initialized.

 

1388 00000c20.00000c24::2018/10/18-19:17:46.804 INFO [RHS] Initializing.

· At the same time we can see the Addition of RHS Resource Related Registries being added to the Cluster Services.

Process Name

PID

TID

Operation

Path

Result

Detail

clussvc.exe

2652

3172

RegQueryKey

HKLM\System\CurrentControlSet\Services\ClusSvc\Parameters

SUCCESS

Query: HandleTags, HandleTags: 0x0

clussvc.exe

2652

3172

RegCreateKey

HKLM\System\CurrentControlSet\Services\ClusSvc\Parameters\Rhs

SUCCESS

Desired Access: Maximum Allowed, Granted Access: None 0x0, Disposition: REG_CREATED_NEW_KEY

clussvc.exe

2652

3172

RegCreateKey

HKLM\System\CurrentControlSet\Services\ClusSvc\Parameters\Rhs\13e895cf-2c3c-4781-9445-d1980ba124dd

SUCCESS

Desired Access: Read/Write, Write DAC, Disposition: REG_CREATED_NEW_KEY

clussvc.exe

2652

3172

RegCloseKey

HKLM\System\CurrentControlSet\Services\ClusSvc\Parameters\Rhs

SUCCESS

 

clussvc.exe

2652

3172

CreateFile

C:\Windows\Cluster\rhs.exe

SUCCESS

Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened

clussvc.exe

2652

3172

QueryBasicInformationFile

C:\Windows\Cluster\rhs.exe

SUCCESS

CreationTime: 10/18/2018 10:41:12 PM, LastAccessTime: 10/18/2018 10:41:12 PM, LastWriteTime: 10/18/2018 10:41:12 PM, ChangeTime: 10/18/2018 10:41:28 PM, FileAttributes: A

clussvc.exe

2652

3172

CloseFile

C:\Windows\Cluster\rhs.exe

SUCCESS

 

clussvc.exe

2652

3172

CreateFile

C:\Windows\Cluster\rhs.exe

SUCCESS

Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened

  • Once all the Registries are created the Process RHS.exe started.

Process Name

PID

TID

Operation

Path

Result

Detail

clussvc.exe

2652

3172

Process Create

C:\Windows\Cluster\rhs.exe

SUCCESS

PID: 3252, Command line: C:\Windows\Cluster\rhs.exe -key SYSTEM\CurrentControlSet\Services\ClusSvc\Parameters\Rhs\13e895cf-2c3c-4781-9445-d1980ba124dd -parentPid 2652 -initEvent 23d84feb-96c3-42e4-ab47-34b65a6c7224 -replyEndpoint OLE4EF173EDDAA0C8380C69A5835C43

rhs.exe

3252

3172

Process Start

 

SUCCESS

Parent PID: 2652, Command line: C:\Windows\Cluster\rhs.exe -key SYSTEM\CurrentControlSet\Services\ClusSvc\Parameters\Rhs\13e895cf-2c3c-4781-9445-d1980ba124dd -parentPid 2652 -initEvent 23d84feb-96c3-42e4-ab47-34b65a6c7224 -replyEndpoint OLE4EF173EDDAA0C8380C69A5835C43, Current directory: C:\Windows\system32\, Environment:

rhs.exe

3252

3172

Thread Create

 

SUCCESS

Thread ID: 3256

  • Here we can see that the Various Modules which are loaded in the Process as RHS is dependent on the below DLLs.

 This can be seen with the Below Information in Procmon:

 

Process Name

PID

TID

Operation

Path

Result

Detail

rhs.exe

3252

3256

Load Image

C:\Windows\Cluster\rhs.exe

SUCCESS

Image Base: 0x7ff680ea0000, Image Size: 0x61000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\ntdll.dll

SUCCESS

Image Base: 0x7fff54de0000, Image Size: 0x1a9000

rhs.exe

3252

3256

ReadFile

C:\Windows\Cluster\rhs.exe

SUCCESS

Offset: 340,992, Length: 16,384, I/O Flags: Non-cached, Paging I/O, Synchronous Paging I/O, Priority: Normal

rhs.exe

3252

3256

Load Image

C:\Windows\System32\kernel32.dll

SUCCESS

Image Base: 0x7fff53000000, Image Size: 0x13a000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\KernelBase.dll

SUCCESS

Image Base: 0x7fff52410000, Image Size: 0x10f000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\advapi32.dll

SUCCESS

Image Base: 0x7fff52ca0000, Image Size: 0xa5000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\msvcrt.dll

SUCCESS

Image Base: 0x7fff52bf0000, Image Size: 0xa7000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\rpcrt4.dll

SUCCESS

Image Base: 0x7fff52ab0000, Image Size: 0x136000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\clusapi.dll

SUCCESS

Image Base: 0x7fff4ced0000, Image Size: 0x6b000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\resutils.dll

SUCCESS

Image Base: 0x7fff4cf40000, Image Size: 0x49000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\wer.dll

SUCCESS

Image Base: 0x7fff4a970000, Image Size: 0x83000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\user32.dll

SUCCESS

Image Base: 0x7fff533a0000, Image Size: 0x171000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\sscore.dll

SUCCESS

Image Base: 0x7fff4dac0000, Image Size: 0xe000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\sechost.dll

SUCCESS

Image Base: 0x7fff53140000, Image Size: 0x57000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\cryptdll.dll

SUCCESS

Image Base: 0x7fff51b30000, Image Size: 0x18000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\combase.dll

SUCCESS

Image Base: 0x7fff54b90000, Image Size: 0x1d7000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\gdi32.dll

SUCCESS

Image Base: 0x7fff531a0000, Image Size: 0x145000

rhs.exe

3252

3256

Load Image

C:\Windows\System32\bcrypt.dll

SUCCESS

Image Base: 0x7fff51c50000, Image Size: 0x26000

 

· Second RHS Process is Initialized:

 

1422 00000c44.00000c48::2018/10/18-19:17:47.385 INFO [RHS] Initializing.

 

· Same information can be seen in Procmon:

 

Process Name

PID

TID

Operation

Path

Result

Detail

rhs.exe

3296

3172

Process Start

 

SUCCESS

Parent PID: 2652, Command line: C:\Windows\Cluster\rhs.exe -key SYSTEM\CurrentControlSet\Services\ClusSvc\Parameters\Rhs\9b2b36e0-5461-4299-b5b9-af8bbc1418e2 -parentPid 2652 -initEvent f77b61d3-5014-4886-b8c1-45112b39dd67 -replyEndpoint OLE4EF173EDDAA0C8380C69A5835C43, Current directory: C:\Windows\system32\, Environment:

rhs.exe

3296

3172

Thread Create

 

SUCCESS

Thread ID: 3300

clussvc.exe

2652

3172

QuerySecurityFile

C:\Windows\Cluster\rhs.exe

SUCCESS

Information: Owner, Group, DACL, SACL, Label

clussvc.exe

2652

3172

QueryNameInformationFile

C:\Windows\Cluster\rhs.exe

SUCCESS

Name: \Windows\Cluster\rhs.exe

clussvc.exe

2652

3172

QueryBasicInformationFile

C:\Windows\Cluster\rhs.exe

SUCCESS

CreationTime: 10/18/2018 10:41:12 PM, LastAccessTime: 10/18/2018 10:41:12 PM, LastWriteTime: 10/18/2018 10:41:12 PM, ChangeTime: 10/18/2018 10:41:28 PM, FileAttributes: A

csrss.exe

332

380

QuerySecurityFile

C:\Windows\Cluster\rhs.exe

SUCCESS

Information: Owner, Group, DACL, SACL, Label

csrss.exe

332

380

QueryNameInformationFile

C:\Windows\Cluster\rhs.exe

SUCCESS

Name: \Windows\Cluster\rhs.exe

clussvc.exe

2652

3172

QuerySecurityFile

C:\Windows\Cluster\rhs.exe

SUCCESS

Information: Owner, Group, DACL, SACL, Label

clussvc.exe

2652

3172

QueryNameInformationFile

C:\Windows\Cluster\rhs.exe

SUCCESS

Name: \Windows\Cluster\rhs.exe

 

· The Cluster Creation Process is Finally Completed.

 

I hope this information will help you in understanding Clustering in a Better way J

 

Ashutosh Dixit

I am currently working as a Senior Technical Support Engineer with VMware Premier Services for Telco. Before this, I worked as a Technical Lead with Microsoft Enterprise Platform Support for Production and Premier Support. I am an expert in High-Availability, Deployments, and VMware Core technology along with Tanzu and Horizon.

Leave a Reply