Recently I worked on an issue: after one of the cluster nodes was rebooted, virtual machines could no longer migrate back to that node. The cluster event log contained errors like these:
Cluster resource 'SCVMM pxe Configuration' of type 'Virtual Machine Configuration' in clustered role 'pxe' failed. Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
The Cluster service failed to bring clustered service or application 'pxe' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.
I checked the ClusterStorage folder, and it turned out that there were three of them: ClusterStorage itself plus two copies with the suffixes .000 and .001.
That looked like a good reason to dig into the cluster log:
00000cfc.000017c8::2013/11/01-20:00:36.081 INFO [DCM] Cluster Shared Volume Root is C:\ClusterStorage
00000cfc.000017c8::2013/11/01-20:00:36.081 INFO [DCM] UpdateClusDiskMembership(enter): nodeSet (1 2 3)
00000cfc.000017c8::2013/11/01-20:00:36.081 INFO [DCM] CsvFs Listener already started...
00000cfc.000017c8::2013/11/01-20:00:36.081 INFO [DCM] CsvFlt Listener already started...
00000cfc.000017c8::2013/11/01-20:00:36.081 INFO [DCM] NFlt Listener already started...
00000cfc.000017c8::2013/11/01-20:00:36.081 INFO [DCM] DeleteCsvShare: remove csv blockstream C:\ClusterStorage:{db19d832-b034-46ed-a6c5-61e0ebe370d1}
00000cfc.000017c8::2013/11/01-20:00:36.081 WARN [DCM] Failed to delete csv share CSV$ status 2310
00000cfc.000017c8::2013/11/01-20:00:36.097 WARN [DCM] rename attempt C:\ClusterStorage => C:\ClusterStorage.000, status 183
00000cfc.000017c8::2013/11/01-20:00:36.113 WARN [DCM] Renamed existing C:\ClusterStorage to C:\ClusterStorage.001
00000cfc.000017c8::2013/11/01-20:00:36.128 INFO [DCM] CreateRootDirectory: keeping open handle HDL( bb4 ) to CSV root
00000cfc.000017c8::2013/11/01-20:00:36.128 INFO [DCM] create CSV stream file C:\ClusterStorage:{db19d832-b034-46ed-a6c5-61e0ebe370d1}
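The interesting lines in that excerpt are the WARN entries from the [DCM] component. Status 183 matches the Win32 code ERROR_ALREADY_EXISTS, which would fit C:\ClusterStorage.000 already being present when the rename was attempted. If you have a long cluster log, a small script can pull those entries out for you. What follows is only a rough sketch in Python: the function name dcm_warnings is made up for illustration, and the entry format is assumed from the excerpt above, so adjust the pattern if your log looks different.

```python
import re

# Assumption: every entry starts with "<pid>.<tid>::<date>-<time>",
# as in the excerpt above. Other log versions may differ.
ENTRY_START = re.compile(r"^[0-9a-f]{8}\.[0-9a-f]{8}::(\S+)\s+", re.MULTILINE)


def dcm_warnings(log_text):
    """Return (timestamp, message) pairs for every WARN [DCM] entry."""
    starts = list(ENTRY_START.finditer(log_text))
    results = []
    for i, match in enumerate(starts):
        # An entry runs until the next entry header (or the end of the text).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(log_text)
        # Collapse line wraps and repeated spaces inside the entry.
        body = " ".join(log_text[match.end():end].split())
        if body.startswith("WARN [DCM]"):
            results.append((match.group(1), body[len("WARN [DCM]"):].strip()))
    return results


# Three entries copied from the log above.
SAMPLE = r"""
00000cfc.000017c8::2013/11/01-20:00:36.081 INFO [DCM] Cluster Shared Volume Root is C:\ClusterStorage
00000cfc.000017c8::2013/11/01-20:00:36.097 WARN [DCM] rename attempt C:\ClusterStorage => C:\ClusterStorage.000, status 183
00000cfc.000017c8::2013/11/01-20:00:36.113 WARN [DCM] Renamed existing C:\ClusterStorage to C:\ClusterStorage.001
"""

for timestamp, message in dcm_warnings(SAMPLE):
    print(timestamp, message)
```

The script only reads text, so it is safe to run against a copy of the log on any machine.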
Then I checked EMC PowerPath, and it still listed some dead paths to our old SAN array. I deleted them, stopped the cluster service on the node, and deleted the ClusterStorage.000 and ClusterStorage.001 folders. Then I started the cluster service again. Issue resolved!
Another, quite similar issue once happened with our file cluster: again, the culprit was an old CSV record that had not been deleted correctly.
So, if you face a similar issue, all you need to do is delete the unnecessary ClusterStorage folders while the cluster service is stopped, and remove the obsolete paths to the old array from your multipath software so that the stale folders aren't accidentally recreated.
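Before deleting anything, it helps to see which leftover folders are actually there. Here is a minimal, read-only sketch in Python that lists them. The name stale_csv_roots is hypothetical, and the three-digit suffix pattern is an assumption based on the two names seen in this case (.000 and .001). It deliberately does not delete anything: removal should still happen by hand, with the cluster service on the node stopped.

```python
import re
from pathlib import Path

# Assumption: leftover copies are named ClusterStorage.NNN (three digits),
# like the .000 and .001 folders found in this case.
STALE_NAME = re.compile(r"^ClusterStorage\.\d{3}$", re.IGNORECASE)


def stale_csv_roots(drive_root):
    """Return leftover ClusterStorage.NNN directories directly under drive_root.

    Read-only: nothing is renamed or deleted. The live ClusterStorage
    folder (no suffix) is never included in the result.
    """
    return sorted(
        entry
        for entry in Path(drive_root).iterdir()
        if entry.is_dir() and STALE_NAME.match(entry.name)
    )
```

On the affected node you would call it as stale_csv_roots("C:\\") and review the list before removing the folders.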
I hope this helps.