Comic Relief: RSA-2048

497

Comic Credit: Abstruse Goose

Happy Friday!

How do I: Change the default behavior of the DB Mirror Status Monitor

MP Authoring Series (For an explanation of this series read this post first.)

Real World Issue: A customer is seeing a lot of Critical alerts for mirrored databases in a Disconnected state, but only Warning alerts for mirrored databases in a Suspended state. In this customer’s environment, brief disconnects are common and not necessarily indicative of a Critical issue, whereas a mirrored database in a Suspended state is always Critical. The customer wants to swap the default states so that Disconnected becomes Warning and Suspended becomes Critical.

There are three Mirrored Database Mirror Status monitors.

When we dig into the properties, we see the three states as well as the corresponding statuses that map to those states.

When we check Overrides, we find there is nothing we can override to meet the customer’s needs.

So to accommodate this request we need to do a little custom authoring in Visual Studio + VSAE.

The process isn’t too complex; however, it is much easier to absorb via video than the many-page article that would result if I tried to document the steps by hand.

Once you have your final successful build, you will find your files in the bin\Debug folder of your project.

Once you import your custom MP into SCOM, you will have a cloned monitor with your modified behavior.

SCOM Management Pack Authoring Training: A different approach (Part I)

It would be wrong to start off any discussion of SCOM authoring without pointing out some of the great resources that do exist on this topic:

-The MSDN Operations Manager Development Kit (Excellent for those of the developer persuasion, but less friendly for those for whom PowerShell is much more comfortable than C#)

-Steve Wilson’s classic AuthorMP blog posts. (A bit dated, but they remain a source of some of the most insightful posts on various aspects of OpsMgr)

-Jonathan Almquist’s posts, both in his old and new home

-Brian Wren has put together great video content + some awesome posts over the years

-Graham Davies and the Manageability Guys in the UK have some awesome posts: 1, 2, 3, 4, 5, 6, 7, 8

-If you happen to be a Microsoft Premier customer there is a great workshop on SCOM Authoring with Visual Studio that came out last year

-And countless members of the community like Tao Yang & Raphael Burri who have written high quality MPs that can serve as a primer to those who want to dig in and start authoring. (Since I started writing this post a while back, I believe Tao has hosted some MP authoring training. I haven’t gotten a chance to look at it yet, but once I do I will add a link.)

-There are of course many other worthwhile posts throughout the SCOM community both inside and outside of Microsoft, but that is what Bing, Google, and DuckDuckGo are for.

But despite all that great content out there, Management Pack authoring can be an extremely difficult skill to acquire. At least from my own perspective, starting out in MP authoring was really hard. Even after having read the vast majority of the published info on MP authoring and watching all the videos that are out there, I can’t say that I felt particularly confident to wade into Visual Studio and start writing management packs. I understood the basic mechanics, but I lacked the ability to fill in the inevitable gaps of knowledge to be able to author custom MPs that met real enterprise-level business needs.

Unfortunately, a lot of the best how-to examples and step-by-step tutorials tend to be a little generic. I suspect this is done intentionally to minimize complexity as much as possible. The hope being that a budding MP author can learn the fundamentals and then extrapolate from the excellent guide on “how to author a custom MP that no one would ever import into their real environment”, and later apply this knowledge to some real-world problem.

My brain tends to not work that way. It is easily distracted by shiny objects and Wikipedia. To learn I need concrete real-world examples, problems, deadlines. For me the most valuable training in MP authoring came not from all the guides and links referenced above, but from a single one-hour Lync call with a senior colleague at Microsoft. I had a specific problem that I didn’t know how to solve, and step by step over the course of an hour he walked me through how to build an MP that addressed it.

I really wish there were a ton of Authoring videos like that call out there following the simple formula:

Real-world enterprise monitoring problem that is not currently addressed by a Management Pack + Screen capture video that walks through the process every step of the way.

Sadly, as fun as it would be to simply complain, I think there might be some small value in me adding what little I know about MP authoring to the general ether following the format above.

I had toyed with the idea of doing Livecoding.tv or live Twitch.tv sessions, but my home internet connection these days is DSL and the upload streaming experience is lacking, so these will be pre-recorded sessions with some light editing.

The first installment can be found here:

MP Authoring 001

I have about 10 videos planned so far and if the first few are of any use to the community I will shoot for publishing a new video every two weeks until I run out of ideas.


How Do I: Access SCOM Properties Programmatically

For some background: the question that led to this post was about accessing properties that SCOM was discovering in order to detect config drift in some networking hardware. Normally when I get a question like this, my first answer is don’t use SCOM for this; use OMS, SCCM, or some other tool designed specifically for this purpose. With that said, they had a specific use case that made sense, and SCOM was already collecting all the properties they cared about as part of a third-party Management Pack, so the primary goal became giving the customer a better picture of where this data gets stored and the easiest way to access it.

The first way of getting at discovered property data is via the OperationsManager database. (The usual caveats about directly querying the OpsDB not being recommended or supported apply.)

There are tables named dbo.MT_* which contain the various properties associated with a certain class of object.

prop01

If I look at something like SQL 2014 Databases I find the following: (There are more properties, but they get truncated off screen)

Select * from dbo.MT_Microsoft$SQLServer$2014$Database

prop02

To make this a little more meaningful we need to pick which columns we are interested in and join to dbo.BaseManagedEntity on BaseManagedEntityId to pull in FullName, so we can understand which systems these databases are associated with. For this I wrote the following query:

-- Map each discovered database row back to the managed entity it belongs to
SELECT
    BME.FullName,
    MT.DatabaseName_3AD1AB73_FD77_E630_3CDE_2CA224473213 AS 'DB Name',
    MT.DisplayName,
    MT.DatabaseAutogrow_E32D36C4_7E11_62BE_D5B4_B77C841DCCA1 AS 'DB Autogrow',
    MT.RecoveryModel_772240AD_E512_377C_8986_E4F8369BDC21 AS 'DB RecoveryModel',
    MT.LogAutogrow_75D233F6_0569_DB26_0207_8894057F498C AS 'LogAutogrow',
    MT.Collation_4BC5C384_34F3_4C3F_A398_2298DBA85BCD AS 'Collation',
    MT.BaseManagedEntityId
FROM dbo.MT_Microsoft$SQLServer$2014$Database MT
JOIN dbo.BaseManagedEntity BME ON BME.BaseManagedEntityId = MT.BaseManagedEntityId

Which gives this output:

prop03
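To see the shape of that join without touching a real OpsDB, here is a minimal sketch that runs the same pattern against an in-memory SQLite database. Everything in it is an assumption for illustration: the column names are shortened (real MT_ tables carry GUID-suffixed names), and the IDs and FullName values are made up.

```python
import sqlite3

# Hypothetical, heavily trimmed stand-ins for the two OperationsManager tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BaseManagedEntity (
    BaseManagedEntityId TEXT PRIMARY KEY,
    FullName            TEXT
);
CREATE TABLE "MT_Microsoft$SQLServer$2014$Database" (
    BaseManagedEntityId TEXT,
    DatabaseName        TEXT,
    RecoveryModel       TEXT
);
INSERT INTO BaseManagedEntity VALUES
    ('id-1', 'sql01.contoso.local;Sales'),
    ('id-2', 'sql02.contoso.local;HR');
INSERT INTO "MT_Microsoft$SQLServer$2014$Database" VALUES
    ('id-1', 'Sales', 'FULL'),
    ('id-2', 'HR', 'SIMPLE');
""")

# Same idea as the query above: the MT_ table holds the properties,
# BaseManagedEntity tells us which system each row belongs to.
rows = conn.execute("""
    SELECT BME.FullName, MT.DatabaseName, MT.RecoveryModel
    FROM "MT_Microsoft$SQLServer$2014$Database" AS MT
    JOIN BaseManagedEntity AS BME
      ON BME.BaseManagedEntityId = MT.BaseManagedEntityId
    ORDER BY MT.DatabaseName
""").fetchall()

for full_name, db_name, recovery_model in rows:
    print(db_name, recovery_model, full_name)
```

The point is only that the property table alone does not tell you where an object lives; the join on BaseManagedEntityId is what supplies that context.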

You could also get at similar data through the SDK via PowerShell (This would technically be the officially supported technique, though sometimes not as flexible as SQL). To do this you would use something like:

Import-Module OperationsManager

# Look up the SQL 2014 Database class, then pull every discovered instance of it
$DatabaseClass = Get-SCOMClass -Name Microsoft.SQLServer.2014.Database

# Keep the full name plus a handful of discovered properties
$DatabaseObjects = Get-SCOMClassInstance -Class $DatabaseClass | Select FullName, *.DatabaseName, *.RecoveryModel, *.DatabaseAutogrow, *.LogAutogrow, *.Collation

$DatabaseObjects

This will give you results that look as follows: (I just arbitrarily picked a few properties; there are more available that you can look at with either | Get-Member or | Select *)

prop04

From there we can make things a little more readable with the following:

Import-Module OperationsManager

$DatabaseClass = Get-SCOMClass -Name Microsoft.SQLServer.2014.Database

$DatabaseObjects = Get-SCOMClassInstance -Class $DatabaseClass

# Keep the selected properties in a variable, then format only for display
$DatabaseObjectsB = $DatabaseObjects | Select *.DatabaseName, *.RecoveryModel, *.DatabaseAutogrow, *.LogAutogrow, *.Updateability, *.UserAccess, *.Collation, *.Owner, *.ResourcePool

$DatabaseObjectsB | Format-Table

prop05

From there we started playing around with ways to quickly identify differences:

prop06
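One simple way to think about "identify differences" is to treat the most common value of each property as the expected one and flag everything else. The sketch below is not what the screenshot shows; it is a hypothetical illustration of that idea in Python, with made-up instance names and values standing in for exported SCOM properties.

```python
from collections import Counter

def find_drift(instances, properties):
    """Return (instance, property, actual, expected) for every value that
    differs from the most common value of that property."""
    drift = []
    for prop in properties:
        counts = Counter(inst[prop] for inst in instances)
        expected, _ = counts.most_common(1)[0]
        for inst in instances:
            if inst[prop] != expected:
                drift.append((inst["FullName"], prop, inst[prop], expected))
    return drift

# Hypothetical sample data, shaped like a few selected database properties.
databases = [
    {"FullName": "sql01;Sales", "RecoveryModel": "FULL", "Collation": "SQL_Latin1_General_CP1_CI_AS"},
    {"FullName": "sql02;HR", "RecoveryModel": "FULL", "Collation": "SQL_Latin1_General_CP1_CI_AS"},
    {"FullName": "sql03;Archive", "RecoveryModel": "SIMPLE", "Collation": "SQL_Latin1_General_CP1_CI_AS"},
]

for name, prop, actual, expected in find_drift(databases, ["RecoveryModel", "Collation"]):
    print(f"{name}: {prop} is {actual}, most instances use {expected}")
```

A majority vote is only a reasonable baseline when most objects are configured correctly; with an explicit baseline you would compare against that instead.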

This is still a work in progress, but I figured I would share in case this can be of use to anyone.


How do I: Send SMS Text Message Notifications for Heartbeat Failures

Continuing my series of interesting questions from last year and my answers, here is one on sending SMS notifications for heartbeat failures for a subset of mission-critical servers. The added wrinkle to this question was that they also needed to be certain (due to the security requirements of their environment) that no information regarding server name, IP address, or other info of that nature which might be part of a typical alert description makes it into the text alerts.

Text Alert on Heartbeat failures without Confidential information/Server names

SCOM Heartbeat Failure Chain of events:

01SMS

Above Diagram pilfered with attribution from TechNet.

First you need to set up a new E-Mail Notification Channel:

Select Administration

Channels

New

Select E-Mail (SMTP)

Enter a Channel Name

Enter an SMTP Server and a Return address (You will likely need an exception that will allow the SMTP server to send messages outside your domain)

Modify the Subject and Message as follows:

E-mail subject:

Alert: $Data[Default='Not Present']/Context/DataItem/AlertName$ Resolution state: $Data[Default='Not Present']/Context/DataItem/ResolutionStateName$

E-mail message:

Alert: $Data[Default='Not Present']/Context/DataItem/AlertName$
Last modified by: $Data[Default='Not Present']/Context/DataItem/LastModifiedBy$
Last modified time: $Data[Default='Not Present']/Context/DataItem/LastModifiedLocal$

(Ultimately you could add additional text here as well, the key is that we are pulling out the variables from the Channel that would normally populate the server name when there is a heartbeat failure)
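If you keep several channels around, it can be handy to sanity-check a template before pasting it into the console. The sketch below is a hypothetical helper, not part of SCOM: it extracts the DataItem field names from a template string and reports any that are not on an allow-list. The allow-list mirrors the four fields used in the template above; adjust it to whatever your security requirements permit.

```python
import re

# Matches $Data[...]/Context/DataItem/<Field>$ and captures <Field>.
VARIABLE = re.compile(r"\$Data(?:\[[^\]]*\])?/Context/DataItem/(\w+)\$")

# Fields this post's template uses; treat the set as an assumption to tune.
ALLOWED_FIELDS = {"AlertName", "ResolutionStateName", "LastModifiedBy", "LastModifiedLocal"}

def disallowed_fields(template):
    """Return the sorted DataItem fields in the template that are not allowed."""
    return sorted(set(VARIABLE.findall(template)) - ALLOWED_FIELDS)

safe = (
    "Alert: $Data[Default='Not Present']/Context/DataItem/AlertName$ "
    "Resolution state: $Data[Default='Not Present']/Context/DataItem/ResolutionStateName$"
)
leaky = safe + " Source: $Data[Default='Not Present']/Context/DataItem/ManagedEntityDisplayName$"

print(disallowed_fields(safe))
print(disallowed_fields(leaky))
```

An empty list means the template only references fields you have cleared for texting.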

Click Finish

Create a new Subscription

Created by specific rules or monitors: Health Service Heartbeat Failure

With a specific resolution state: New

Add subscribers. If you want it to send text messages, you can create a new unique subscriber with an address that consists of the appropriate cell number + service provider combination:

Sprint: cellnumber@messaging.sprintpcs.com
Verizon: cellnumber@vtext.com
T-Mobile: cellnumber@tmomail.net
AT&T: cellnumber@txt.att.net

For my example I am just using an internal account in my environment.
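If you have more than a handful of on-call numbers, building the subscriber addresses by hand gets error-prone. Here is a minimal sketch that assembles the address from a number and a carrier using the gateway domains listed above. The function name and the digits-only normalization are my assumptions; check with the carrier for the exact format their gateway expects.

```python
# Gateway domains as listed in this post.
SMS_GATEWAYS = {
    "Sprint": "messaging.sprintpcs.com",
    "Verizon": "vtext.com",
    "T-Mobile": "tmomail.net",
    "AT&T": "txt.att.net",
}

def sms_address(cell_number, carrier):
    """Build an e-mail-to-SMS address; assumes the gateway wants digits only."""
    digits = "".join(ch for ch in cell_number if ch.isdigit())
    if not digits:
        raise ValueError("cell number contains no digits")
    if carrier not in SMS_GATEWAYS:
        raise ValueError(f"no gateway known for carrier {carrier!r}")
    return f"{digits}@{SMS_GATEWAYS[carrier]}"

print(sms_address("(555) 010-0199", "Verizon"))
```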

Select your newly created notification channel. You may want to delay notifications by 15 minutes. That way if the server is down for less than 15 minutes you won’t get a text message at 3 AM.

Click Finish

Now if a server goes offline the console will still generate an alert as before with the server name:

15SMS

But the e-mail or text message will be generic without any confidential information:

16SMS

For alerts other than heartbeat you might have to check and craft a slightly modified channel to ensure no info you don’t want texted is sent out.

A quick example to illustrate this:

Ultimately $Data/Context/DataItem/AlertName$ will map to a different value for each type of alert. So for the alert below:

17SMS

That variable maps to:

18SMS

So Alert Name by itself will not map to anything proprietary like IP address, domain, or computer name unless you have created a custom alert which contains any of this info in the Alert Name field. With that said, it may still map to info about specific technologies, so one might be able to use the Alert Name to determine what types of applications you are running, which could in some cases be a security concern. To get a sense of the type of values that typically show up in your environment, the quick and easy method is to just look at the Name column under Monitoring Pane – Active Alerts:

19SMS

So for my environment you could learn from this info what apps I am running (SharePoint, SQL, ACS), and in the case of the Page Life Expectancy alert you are able to find out the version of SQL, etc. If this kind of info isn’t a security concern for your business, you could just pass the Alert Name field from any alerts that meet a certain Severity/Priority criteria. If this type of info is a concern, then you need to determine which alerts are OK to pass Alert Name for (like Health Service Heartbeat Failure) and which need to be withheld, and then filter your notification subscription criteria accordingly.

If you want a slightly better view of this info you could use PowerShell:

Import-Module OperationsManager

Get-SCOMAlert 

20SMS

Get-SCOMAlert | Select Name

This will give you possible values that could populate that variable. (Keep in mind this will only pull back values that are currently in the OpsDB so this will be all alerts in that DB based on your grooming/retention settings.)
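Once you have that list of names, the review described earlier boils down to sorting each distinct name into "fine to text" or "hold back". The sketch below shows that split in Python over a list of names exported from the command above. The sample names other than Health Service Heartbeat Failure are made up for illustration, and the allow-list is something you would decide for your own environment.

```python
# Names you have reviewed and cleared for inclusion in a text message.
SAFE_ALERT_NAMES = {"Health Service Heartbeat Failure"}

def split_by_allow_list(alert_names, allowed=SAFE_ALERT_NAMES):
    """Return (ok_to_text, held_back) lists of distinct alert names."""
    distinct = sorted(set(alert_names))
    ok_to_text = [name for name in distinct if name in allowed]
    held_back = [name for name in distinct if name not in allowed]
    return ok_to_text, held_back

# Hypothetical export; duplicates are expected because many alerts share a name.
exported = [
    "Health Service Heartbeat Failure",
    "Example SQL Page Life Expectancy Alert",
    "Health Service Heartbeat Failure",
    "Example SharePoint Service Alert",
]

ok_to_text, held_back = split_by_allow_list(exported)
print("OK to text:", ok_to_text)
print("Hold back:", held_back)
```

The held-back list is the one to work through when deciding which rules and monitors to include in the subscription criteria.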

 


Talks: Jeffrey Snover – The Cultural Battle To Remove Windows from Windows Server

Fascinating talk from Jeffrey Snover on the road from the Monad Manifesto to PowerShell to Nano Server.

 


How do I: Create a Wildcard SCOM Service Monitor and Recovery

I recently had a question from a customer on how to create wildcard Service Monitors + Recoveries. The Service Monitor from the Monitoring template and a simple Unit Monitor for services both require an explicit service name; no wildcards allowed. For most services this is fine, but there are some applications which do fun and interesting things like concatenate computername + service to create a unique service name. This creates a bit of a problem for monitoring. You can create individual monitors, but if you have hundreds of services, each with a unique service name that follows a particular pattern, creating hundreds of corresponding monitors could get a little time consuming.

Brian Wren has a great article from back in the SCOM 2007 R2 days that answers part of this question, but when I went through the steps I found it needed some slight tweaking and updating for SCOM 2012 R2. Once that was complete I also needed to come up with a simple low overhead wildcard service recovery for when one of the services stops and needs to be brought back online.

Below are my steps:

Launch SCOM Console

Administration

Management Packs

Tasks – Create a Management Pack

Enter Name + Description – Next – Create

Select Authoring

Right Click Windows Service

Add Monitoring Wizard

Windows Service

Enter a Name, Save to the MP you just created

Enter Service Name. Use % as a wildcard representing multiple characters. As I don’t have any unique services in my environment, I am using m% to demonstrate how this can work. For the rest of these instructions, wherever you see m% keep in mind that you need to modify this value to match your unique service name wildcard value. Be careful: too broad a wildcard could create a lot of noise and load very quickly in your environment.

Pick a Target Group. In this case I am using All Windows Computers. Generally you would want to target this as precisely as possible. Leave “Monitor only automatic service” checked.

Click Next

Click Create

Select Administration

Select Management Packs

Select the Custom Management Pack you just created

Select Export Management Pack

Select a location to save the unsealed xml file

Click OK

Open the file in your XML editor of choice (Notepad will do, but Visual Studio or Notepad++ will make it a bit easier to read)

Search the file for your wildcard; in my case this is m%

We’ll be making a few replacements in the code.

You will be modifying:

<DataSource ID="DS" TypeID="MicrosoftWindowsLibrary7585010!Microsoft.Windows.Win32ServiceInformationProviderWithClassSnapshotDataMapper">
<ComputerName>$Target/Property[Type="MicrosoftWindowsLibrary7585010!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
<ServiceName>m%</ServiceName>

To: (remember to also swap the m% with the appropriate value)

<DataSource ID="DS" TypeID="MicrosoftWindowsLibrary7585010!Microsoft.Windows.WmiProviderWithClassSnapshotDataMapper">
<NameSpace>root\cimv2</NameSpace>
<Query>select * from win32_service where name like 'm%'</Query>

_________

  • In Brian Wren’s instructions he used TypeID="Windows!Microsoft.Windows.Win32…" The alias in my console-generated MP is MicrosoftWindowsLibrary7585010!. If you run into any errors, keep in mind that whatever alias you use must match the alias declared in the manifest references. I haven’t tested to confirm, but based on the output it looks like the console-generated alias is based on MP name + version number. If you have a different version of the MP and you follow my steps exactly, you will likely hit an error, as the alias I provide for Microsoft.Windows.Library is going to be off by a few numbers from yours. If this is the case, just modify the alias in my example to match what you have in the rest of the .xml file.
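A quick way to catch an alias mismatch before the import fails is to compare every Alias! prefix used in the file against the aliases declared under the manifest references. The sketch below does that with Python's standard library against a tiny made-up MP fragment. The fragment is an assumption that only mimics the general layout (Reference elements carrying an Alias attribute), and the simple regex will flag any word followed by an exclamation mark, so treat its output as a hint rather than a validator.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, trimmed management pack; a real export contains far more.
MP_XML = """<ManagementPack>
  <Manifest>
    <References>
      <Reference Alias="MicrosoftWindowsLibrary7585010">
        <ID>Microsoft.Windows.Library</ID>
      </Reference>
    </References>
  </Manifest>
  <Monitoring>
    <Discoveries>
      <Discovery ID="Example.Discovery">
        <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.WmiProviderWithClassSnapshotDataMapper" />
      </Discovery>
    </Discoveries>
  </Monitoring>
</ManagementPack>"""

def undeclared_aliases(mp_xml):
    """Return aliases used as 'Alias!' anywhere in the text but never declared."""
    root = ET.fromstring(mp_xml)
    declared = {ref.get("Alias") for ref in root.iter("Reference")}
    used = set(re.findall(r"\b(\w+)!", mp_xml))
    return sorted(used - declared)

print(undeclared_aliases(MP_XML))
```

In this fragment the pasted TypeID uses the alias Windows while the manifest declares MicrosoftWindowsLibrary7585010, which is exactly the situation the note above warns about.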

 

And

<Name>$MPElement[Name="MicrosoftSystemCenterNTServiceLibrary!Microsoft.SystemCenter.NTService"]/ServiceProcessName$</Name>
<Value>$Data/Property[@Name='BinaryPathName']$</Value>

To:

<Name>$MPElement[Name="MicrosoftSystemCenterNTServiceLibrary!Microsoft.SystemCenter.NTService"]/ServiceProcessName$</Name>
<Value>$Data/Property[@Name='PathName']$</Value>

Save the .xml file

Go back to the SCOM console – Administration

Import Management Packs

Add from disk

Select the newly modified .xml file

Install

Close

To check and confirm that the discovery associated with the wildcard monitor is working:

Select Monitoring

Discovered Inventory

Change Target Type

Select Custom Target

A few minutes after importing the updated pack you should see services discovered.

Now we need to create a wildcard recovery. If this was a single service recovery I would create a standard SCOM recovery that calls net.exe and passes a start command with the service name. Since this is a wildcard service we have to do things a little differently, as I don’t know of a way to pass wildcards to net.exe. (We could use PowerShell, but for this I want to try to be as lightweight as possible from an overhead perspective, even if that means sacrificing some more advanced error handling that we could easily add in with PowerShell.)

Go to Authoring:

Select Windows Service

Right Click your custom Service Monitor – View Management Pack Objects – Monitors

Expand Entity Health – Availability – Right Click the Basic Service Monitor stored in your custom MP – Properties

Select the Diagnostics and Recovery Tab

Under Configure recovery tasks select Add – Recovery for critical health state

Select Run Command

Name your Recovery – Check the boxes for run recovery automatically and recalculate monitor state after recovery finishes

Enter Full path to file:

c:\windows\system32\wbem\WMIC.exe

Parameters: (Originally I used a slightly different parameter, but found that while it worked at the command line it failed when run as a recovery. This method works consistently.)

/interactive:off service where "name like 'm%'" call startservice

Click Create

You should now be all set to test out and validate your new monitor.

EndNote/Cautionary Tangent:

Just keep in mind that a wildcard discovery targeted incorrectly (too broad a wildcard, too broad a target group, or both) is the recipe for a single monitor that can cause a lot of churn, perf issues, and noise in your environment. So be cautious and test very carefully. Make sure you have a good sense of the number of objects this monitor will pick up, not just in your test environment, but once you move it into production. To be clear, I would never recommend using a wildcard as broad as m% in production. This picks up way too many services that you likely don’t care about.
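One cheap way to get that sense is to run your pattern against a list of service names before you ever import the pack. The sketch below is a rough Python approximation of LIKE matching, assuming % means any run of characters, _ means any single character, and comparison is case-insensitive. It does not handle bracket sets, and the service list is a made-up sample, so use it as a preview and not as a substitute for testing the real discovery.

```python
import re

def like_to_regex(pattern):
    """Translate a LIKE-style pattern (% and _ only) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")      # any run of characters, including none
        elif ch == "_":
            parts.append(".")       # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)

def preview_matches(pattern, service_names):
    """Return the service names the pattern would pick up, in input order."""
    rx = like_to_regex(pattern)
    return [name for name in service_names if rx.fullmatch(name)]

# Hypothetical sample; in practice export the names from a representative server.
services = ["MSSQLSERVER", "MpsSvc", "Spooler", "MSDTC", "W32Time", "WEB01AppSvc"]

print(preview_matches("m%", services))
print(preview_matches("WEB01%", services))
```

Comparing the two results makes the difference between a broad and a tight wildcard obvious at a glance.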

Also please note that the recovery is just as general as the monitor, if not more so. It also does not check whether the services it applies to are already started. In the case of my example, m% picks up a bunch of services. If a single service matching that pattern goes down, the recovery will attempt to start every single service that matches m%. So if you are building your wildcard service monitor to pick up multiple services on a single system that follow a common pattern, a failure of one will result in an attempt to recover all.

In theory this shouldn’t be a problem. The method I am using is extremely lightweight, and if a service is already started, the start call will just return an exit code of “I’m already started” and the service will remain started. With that said, this is only a sample, and it’s still worth testing in your environment to confirm the behavior and make sure you understand exactly how the recovery is working before you consider implementing.
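To make the "failure of one results in an attempt to recover all" point concrete, here is a small simulation in Python. It does not call WMI at all; it just walks a made-up dictionary of service states and reports what a start request against every name beginning with the wildcard prefix would amount to. The service names and states are hypothetical.

```python
def simulate_recovery(services, prefix):
    """services maps service name -> 'Running' or 'Stopped'.
    Returns what a blanket start request would do to each matching service."""
    report = {}
    for name, state in services.items():
        if not name.lower().startswith(prefix.lower()):
            continue  # not matched by the wildcard, so never touched
        if state == "Stopped":
            report[name] = "start attempted"
        else:
            report[name] = "already running, left as-is"
    return report

# Hypothetical state at the moment one matching service has stopped.
services = {
    "MSSQLSERVER": "Running",
    "MSDTC": "Stopped",
    "MpsSvc": "Running",
    "Spooler": "Stopped",
}

for name, outcome in simulate_recovery(services, "m").items():
    print(f"{name}: {outcome}")
```

Every matching service receives the start request, but only the stopped one changes state, and a stopped service that does not match the pattern (Spooler here) is never touched.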

An example of running the recovery for an instance of m% being stopped:

40.1

 


Reading List: CChamp/SKY BLUE!!!

Two things you need to know about Charles:

1. He is one of the best resources on Unix/Linux monitoring with SCOM outside of Kris Bash and the PG.

2. He has an awesome new MP he just released that fixes the age-old problem of monitor-based alerts being closed while the monitor is still unhealthy. I have seen customers solve this in the past with Green Machine, but the problem with Green Machine is that unless it is applied carefully it becomes a state churn monster in your environment. Sky Blue has some slick PowerShell filtering logic to go only after those pesky monitors/alerts that need it while leaving the rest alone.

Check it out:

SKY BLUE  “management asking about “missing alerts” always feels like a rainy day.”