Thursday, December 1, 2011

[OpsMgr 2007] Timeout running Remove-DisabledMonitoringObject cmdlet in System Center Operations Manager

I've read about and seen many cases like mine, where the remove-disabledmonitoringobject cmdlet fails with the error "The requested operation timed out", but it seems nobody has been lucky enough to find a fix. We opened a Microsoft case on this issue many months ago, and since the beginning of this week we have been able to run the cmdlet without any error.

The resolution Microsoft gave us for this case will not be part of CU6.
It was found too late!


First, here is our configuration: approximately 3000 agents on CU3. We have not upgraded to CU4 and will soon upgrade to CU5. The databases run on SQL 2008.

The first theory for why the cmdlet timed out was that there were too many overrides for the SDK to process before the thirty-minute WCF (Windows Communication Foundation) timeout occurs.

Within our production environment there are approximately 140000 DiscoverySources that the cmdlet needs to analyse to determine whether the associated discovered types must be removed. We also have a pre-production environment with fewer agents and only 40000 DiscoverySources, on which the cmdlet works well.

To reduce the number of DiscoverySources we analysed all the MPs we have, and it appeared that the OCS MP was responsible for almost 40% of the huge number of DiscoverySources. The OCS MP has nineteen discoveries targeted at the Windows Server Computer class, all enabled by default. On each discovery we have an override on a group to disable discovery for the group members. A DiscoverySource entry is created for each discovery-to-target-entity mapping.
So we have: 19 * ~3000 agents = 57000 DiscoverySources just for the OCS MP.
When overrides are applied to these discoveries, the enabled state must be calculated for all DiscoverySources!
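
As a quick check of the arithmetic above, here is a small PowerShell sketch using the rounded figures from this post (19 discoveries, about 3000 agents, about 140000 DiscoverySources in total). The variable names are mine and purely illustrative.

```powershell
# Rounded figures taken from the text above (approximations, not exact counts)
$ocsDiscoveries = 19       # OCS MP discoveries targeted at Windows Server Computer
$agents         = 3000     # approximate number of agents
$totalSources   = 140000   # approximate DiscoverySources in production

# One DiscoverySource per discovery-to-target-entity mapping
$ocsSources = $ocsDiscoveries * $agents

# Share of all DiscoverySources attributable to the OCS MP, in percent
$share = [math]::Round(100 * $ocsSources / $totalSources, 1)

"OCS MP DiscoverySources: $ocsSources ($share% of $totalSources)"
```

With these rounded inputs the OCS MP accounts for roughly 40% of the DiscoverySources, which matches what we observed.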

I worked to remove some overrides and to reduce the enabled-state calculation for the DiscoverySources, but the cmdlet still timed out.

The next step was to let Microsoft review the code and SQL involved once more, to see whether they could make this processing more efficient. I was also asked to run the following two queries on the SCOM database:

  -- Query 1: DiscoverySources whose discovery/rule has an 'Enabled' override set to 'false'
  SELECT COUNT (Distinct [DiscoverySource].[DiscoverySourceId])
  FROM dbo.DiscoverySource
  INNER JOIN dbo.ModuleOverride ON ModuleOverride.ParentId = DiscoverySource.DiscoveryRuleId
  AND ModuleOverride.OverrideableParameterId = dbo.fn_MPObjectId(NULL, NULL, N'Enabled')
  AND (ParentType = 'Discovery' OR ParentType = 'Rule')
  join DiscoverySourceToTypedManagedEntity dstme
  on discoverysource.DiscoverySourceId = dstme.DiscoverySourceId
  WHERE DiscoverySource.IsDeleted = 0
  AND ModuleOverride.Value = 'false'

  -- Query 2: same count for any 'Enabled' override, whatever its value
  SELECT COUNT (Distinct [DiscoverySource].[DiscoverySourceId])
  FROM dbo.DiscoverySource
  INNER JOIN dbo.ModuleOverride ON ModuleOverride.ParentId = DiscoverySource.DiscoveryRuleId
  AND ModuleOverride.OverrideableParameterId = dbo.fn_MPObjectId(NULL, NULL, N'Enabled')
  AND (ParentType = 'Discovery' OR ParentType = 'Rule')
  join DiscoverySourceToTypedManagedEntity dstme
  on discoverysource.DiscoverySourceId = dstme.DiscoverySourceId
  WHERE DiscoverySource.IsDeleted = 0

Given the numbers returned by the queries, Microsoft support suspected that the steps below would allow the cmdlet to complete, and waited for my results.
Here is what I was asked to do:
  • Run the SQL against the OperationsManager DB.

  DECLARE @querydef XML
  SET @querydef =
  N'<QueryDefinitions xmlns="urn:DataAccess" xmlns:dal="urn:DataAccess" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:DataAccess QueryDefinition.xsd">
  <QueryDefinition>
    <Name>DiscoverySourcesEligibleForDeletionDueToOverrides</Name>
    <ObjectName>DiscoverySourcesEligibleForDeletionDueToOverrides</ObjectName>
    <UsedBy Component="Sdk" />
    <Description>Selects discovery sources that *may* be invalid due to applied overrides.</Description>
    <DataObject xsi:type="SelectType">
      <Column>
        <Name>DiscoverySourceId</Name>
        <Source>DiscoverySource</Source>
        <Type>uniqueidentifier</Type>
      </Column>
      <Column>
        <Name>DiscoverySourceType</Name>
        <Source>DiscoverySource</Source>
        <Type>tinyint</Type>
        <EnumType>Microsoft.EnterpriseManagement.Mom.Modules.DataItems.Discovery.DiscoverySourceType</EnumType>
        <EnumLeastValue>Rule</EnumLeastValue>
        <EnumGreatestValue>ConfigService</EnumGreatestValue>
      </Column>
      <Column>
        <Name>DiscoveryRuleId</Name>
        <Source>DiscoverySource</Source>
        <Type>uniqueidentifier</Type>
      </Column>
      <Column>
        <Name>BoundManagedEntityId</Name>
        <Source>DiscoverySource</Source>
        <Type>uniqueidentifier</Type>
      </Column>
      <Argument>Distinct</Argument>
      <Source>
        <Table>
          <Name>DiscoverySource</Name>
          <Owner>dbo</Owner>
          <Type>Table</Type>
        </Table>
        <Join>
          <Type>Inner</Type>
          <Table>
            <Name>ModuleOverride</Name>
            <Owner>dbo</Owner>
            <Type>Table</Type>
          </Table>
          <JoinCondition>ModuleOverride.ParentId = DiscoverySource.DiscoveryRuleId AND ModuleOverride.OverrideableParameterId = dbo.fn_MPObjectId(NULL, NULL, N''Enabled'') AND (ParentType = ''Discovery'' OR ParentType = ''Rule'')</JoinCondition>
        </Join>
        <Join>
          <Type>Inner</Type>
          <Table>
            <Name>DiscoverySourceToTypedManagedEntity</Name>
            <Owner>dbo</Owner>
            <Type>Table</Type>
          </Table>
          <JoinCondition> discoverysource.DiscoverySourceId = DiscoverySourceToTypedManagedEntity.DiscoverySourceId</JoinCondition>
        </Join>
      </Source>
      <Conditional>
        <Condition>
          <Expression>DiscoverySource.IsDeleted = 0 AND ModuleOverride.Value = ''false''</Expression>
        </Condition>
      </Conditional>
    </DataObject>
  </QueryDefinition>
  </QueryDefinitions>'
  INSERT INTO dbo.[DataAccessLayerSetting]([SettingType], [SettingData]) VALUES (0, @querydef)



The impact of this change is that when the cmdlet runs, it now uses the SQL query inserted in the DataAccessLayerSetting table instead of the original SQL query, which is compiled into a DLL. This table allows us to override the built-in queries, as it is read when the OpsMgr SDK service restarts. Going back to the original SQL is very easy: just remove the entry from the DataAccessLayerSetting table and restart the SDK service.

  • Restart the OpsMgr SDK service.
  • Run the remove-disabledmonitoringobject cmdlet and report back on the success or failure of it.
I usually launch the cmdlet like this (in a short PS1 script):
  get-managementserver | select ManagementGroup -unique
  get-date
  remove-disabledmonitoringobject
  get-date
This shows, in just a few lines, the OpsMgr management group and the dates before and after the cmdlet has run.


Unfortunately, the first two times I launched the remove-disabledmonitoringobject cmdlet, it failed with a new error:

  >get-date
  Monday, November 28, 2011 8:27:16 AM
  PS Monitoring:\
  >remove-disabledmonitoringobject
  Remove-DisabledMonitoringObject : Microsoft.EnterpriseManagement.Common.DiscoveryDataFromRuleTargetedToDeletedMonitoringObjectException: Discovery data has been received from a rule targeted at a non-existent monitoring object id.
  MonitoringObjectId: c1537246-ec53-cc7c-b45f-aed87f06bc7f
  RuleId: 66b6d462-535f-cab6-eb14-b24fc79dfb75
     at Microsoft.EnterpriseManagement.DataAbstractionLayer.InstanceSpaceOperations.DeleteDisabledDiscoverySources()
     at Microsoft.EnterpriseManagement.ManagementGroup.DeleteDisabledMonitoringObjects()
     at Microsoft.EnterpriseManagement.OperationsManager.ClientShell.RemoveDisabledMonitoringObjectCmdlet.ProcessRecord()
  At line:1 char:32
  + remove-disabledmonitoringobject <<<<
      + CategoryInfo          : InvalidOperation: (Microsoft.Enter...ingObjectCmdlet:RemoveDisabledMonitoringObjectCmdlet) [Remove-DisabledMonitoringObject], DiscoveryDataFr...ObjectException
      + FullyQualifiedErrorId : ExecutionError,Microsoft.EnterpriseManagement.OperationsManager.ClientShell.RemoveDisabledMonitoringObjectCmdlet
  PS Monitoring:\
  >get-date
  Monday, November 28, 2011 8:48:55 AM
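
To see how long that failed run lasted, the two get-date values from the output above can be subtracted. This is only a small sketch: the timestamps are copied from the log and the variable names are illustrative.

```powershell
# Timestamps copied from the get-date output of the failed run
$start = [datetime]'2011-11-28 08:27:16'
$end   = [datetime]'2011-11-28 08:48:55'

# New-TimeSpan returns the interval between the two dates
$elapsed = New-TimeSpan -Start $start -End $end

$elapsed.ToString()    # 00:21:39
```

The run stopped after a little under 22 minutes, before the thirty-minute WCF timeout, which suggests this error is a different failure from the original timeout.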
I reported this to Microsoft support as well and waited for an answer. This particular error seems to be known, and in those cases the cmdlet completed without error on the second run. In my case, the second run of the cmdlet found a different object which caused the error again and, as in the first run, it set the original object's IsDeleted property to 1. It appears that the cmdlet attempts to remove some objects twice and fails on the second attempt.
I launched the cmdlet a third time and it ended in success!
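
Since the cmdlet only succeeded on the third run, the manual reruns could be wrapped in a small retry loop. The sketch below is my own assumption, not something provided by Microsoft: Invoke-WithRetry is a hypothetical helper name, and it assumes the failure can be caught as a terminating error (hence -ErrorAction Stop in the usage line).

```powershell
# Hypothetical helper (not part of OpsMgr): runs a script block and
# retries it when it throws, up to $MaxAttempts times.
function Invoke-WithRetry {
    param(
        [scriptblock]$Action,
        [int]$MaxAttempts = 3
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            Write-Host ("Attempt {0} started at {1}" -f $attempt, (Get-Date))
            # Out-Host displays any output without adding it to the return value
            & $Action | Out-Host
            Write-Host ("Attempt {0} succeeded at {1}" -f $attempt, (Get-Date))
            return $attempt    # number of the attempt that succeeded
        }
        catch {
            Write-Host ("Attempt {0} failed: {1}" -f $attempt, $_.Exception.Message)
        }
    }
    throw "All $MaxAttempts attempts failed."
}

# Usage in the OpsMgr Command Shell (commented out: the cmdlet only exists there):
# Invoke-WithRetry -Action { remove-disabledmonitoringobject -ErrorAction Stop }
```

Each attempt prints its start and end time, so the output still shows how long every run took.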





This posting is provided "AS IS" with no warranties.
