Monthly Archives: August 2015

Reading List: CChamp/SKY BLUE!!!

Two things you need to know about Charles:

1. He is one of the best resources on Unix/Linux monitoring with SCOM outside of Kris Bash and the PG.

2. He just released an awesome new MP that fixes the age-old problem of monitor-based alerts being closed while the monitor is still unhealthy. I have seen customers solve this in the past with Green Machine, but unless Green Machine is applied carefully it becomes a state-churn monster in your environment. Sky Blue has some slick PowerShell filtering logic that goes after only those pesky monitors/alerts that need it while leaving the rest alone.

Check it out:

SKY BLUE: “Management asking about ‘missing alerts’ always feels like a rainy day.”

Reading List: ScoMurr

Scott is a colleague at MSFT and a SCOM expert with a development background, which makes him my go-to source when I have tough Visual Studio authoring questions. His blog covers a range of topics and is worth looking at:

http://sc.scomurr.com/

I learned more about SCOM authoring in an hour-long Lync call with Scott than I have from watching all the authoring videos that are available online.

Reading List: SCOMzilla/scom_atlas

Continuing this week’s series on blogs that you may not know about but should read, I present Matt T’s blog:

http://blogs.technet.com/b/scom_atlas/

Best known for his awesome scheduled maintenance mode SCORCH runbook solution, Matt T is an excellent source of SCOM/SCORCH-related knowledge.


Reading List: The OMX Blog

For whatever reason, there are a handful of amazing SCOM blogs that have never gotten the attention they deserve. I am going to call out a few over the next week, but I wanted to start with this one:

http://blogs.technet.com/b/omx/

There aren’t a ton of posts, but it is well worth subscribing to, as each post represents a truly unique contribution to SCOM monitoring knowledge. I have had the pleasure of sitting in a room and listening to Michael Repperger talk about SCOM and monitoring twice now, and each time I learned an enormous amount. Last week I watched as his session end time rolled past, and while it was clear his staff had places to go, a handful of us would have been happy to order dinner and keep him talking for as long as possible. He is without question one of the brightest minds working with SCOM.

How do I: Alert on SQL Errors that aren’t logged to the windows event log

This is one of those common questions: ask a SQL DBA and they will probably know the answer, but it is less widely known within the SCOM community.

First, if you want to get a sense of all the errors that SQL can generate to its own internal logs, run the following on your server (the language ID will of course vary):

[Screenshot 01]
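
As a rough sketch of that kind of query, assuming the sys.messages catalog view (its columns match the table below) and 1033 (US English) as the language ID:

```sql
-- Sketch only: list every message SQL Server can raise for one language.
-- 1033 (US English) is an assumption; substitute your own language ID.
SELECT message_id,
       language_id,
       severity,
       is_event_logged,
       text
FROM sys.messages
WHERE language_id = 1033
ORDER BY message_id;
```

Adding `AND is_event_logged = 0` to the WHERE clause narrows the list to messages that are not written to the Windows Application log by default, which are the ones this post is concerned with.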

On my SQL 2014 server I get back 11,548 rows of messages:

[Screenshot 02]

 

| Column name | Data type | Description |
|---|---|---|
| message_id | int | ID of the message. Is unique across server. Message IDs less than 50000 are system messages. |
| language_id | smallint | Language ID for which the text in text is used, as defined in syslanguages. This is unique for a specified message_id. |
| severity | tinyint | Severity level of the message, between 1 and 25. This is the same for all message languages within a message_id. |
| is_event_logged | bit | 1 = Message is event-logged when an error is raised. This is the same for all message languages within a message_id. |
| text | nvarchar(2048) | Text of the message used when the corresponding language_id is active. |

For the most part, the SQL MPs will give you access to any of the events you might care about in both the SQL and Windows Application event logs. In those cases where they don’t, there is a built-in stored procedure in SQL that lets you write SQL errors to the Windows Application log so that other systems like SCOM can pick them up.

sp_altermessage

If you dive into the code for the SQL replication MPs, you will find that this is how replication monitoring is implemented in SCOM: a series of sp_altermessage commands turns on logging to the app log for different replication errors, followed by corresponding alert-generating rules that target those event IDs.

[Screenshot 03]

https://msdn.microsoft.com/en-us/library/ms175094.aspx

The effect of sp_altermessage with the WITH_LOG option is similar to that of the RAISERROR WITH LOG parameter, except that sp_altermessage changes the logging behavior of an existing message. If a message has been altered to be WITH_LOG, it is always written to the Windows application log, regardless of how a user invokes the error. Even if RAISERROR is executed without the WITH_LOG option, the error is written to the Windows application log.

If a message is written to the Windows application log, it is also written to the Database Engine error log file.
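
A minimal sketch of what such a call looks like. Message 1205 (the deadlock victim error) is only an illustrative choice on my part, not something taken from the replication MPs, and the call needs elevated server permissions:

```sql
-- Sketch only: force an existing message to always be written to the
-- Windows Application log. 1205 is an illustrative message ID.
EXEC sp_altermessage
     @message_id      = 1205,
     @parameter       = 'WITH_LOG',
     @parameter_value = 'true';

-- Confirm the change: is_event_logged should now be 1.
-- 1033 (US English) is an assumed language ID.
SELECT message_id, is_event_logged
FROM sys.messages
WHERE message_id = 1205
  AND language_id = 1033;
```

Passing 'false' as the parameter value reverts the message to its previous logging behavior. Once the message is landing in the Application log, a standard event ID based alert rule in SCOM can pick it up.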


Troubleshooting: Database Status Monitor generates warning for secondary database in log shipped pair

This is a question that I worked on for one of my customers a while back, but I gave a talk on extending the SQL MP this week out in Redmond, so it seems like a good time to get this one on the blog.

The root of this problem is that the Database Status Monitor in the SQL Management Packs is not Log Shipping aware. For those who don’t live and breathe SQL, the basic definition of Log Shipping, lifted from the MSFT documentation, is:

SQL Server Log shipping allows you to automatically send transaction log backups from a primary database on a primary server instance to one or more secondary databases on separate secondary server instances.

-Provides a disaster-recovery solution for a single primary database and one or more secondary databases, each on a separate instance of SQL Server.

-Supports limited read-only access to secondary databases (during the interval between restore jobs).

-Allows a user-specified delay between when the primary server backs up the log of the primary database and when the secondary servers must restore (apply) the log backup. A longer delay can be useful, for example, if data is accidentally changed on the primary database. If the accidental change is noticed quickly, a delay can let you retrieve still unchanged data from a secondary database before the change is reflected there.

The Database Status Monitor can return the following possible states within SQL:

[Screenshot 01]

Which roll up under the following three health states:

[Screenshot 02]

So in the case of Log Shipping, the issue is that “Restoring” is normal behavior for the secondary database in a log shipped pair, but it would not be typical for a standard database. You get Warning alerts because the monitor isn’t smart enough to tell a log shipped DB from a non-log-shipped one.
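
To see which databases on an instance are currently sitting in that state, a quick sketch against sys.databases is enough. This is not the MP’s query, just a way to eyeball the same information:

```sql
-- Sketch only: list databases that are currently in the RESTORING state.
SELECT name,
       state_desc
FROM sys.databases
WHERE state_desc = 'RESTORING'
ORDER BY name;
```

On a log shipping secondary server, the log shipped databases would be expected to appear here whenever they are restored with NORECOVERY.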

To understand this better it can help to look at the history of the embedded T-SQL code in the MP for this monitor. The original code in the SQL MPs looked as follows:

[Screenshots 03, 04]

It is very simple, but it isn’t mirroring or log shipping aware.

Later revisions of the SQL MP added mirroring awareness to the query. This was first added for SQL 2008 only, but around the 6.5+ releases all versions of the SQL MP, with the exception of the SQL 2005 MPs, were made mirroring aware via the following code modifications:

[Screenshots 05, 06, 07]

This code, plus a little XPathQuery logic at the end of the MP, fixes the false Warning alerts for mirroring, but a secondary log shipped database still shows up as a Warning today in the current release of the SQL MP.

I will put up the full sample code for the custom monitor soon on TechNet Gallery, but the basic modifications are as follows:

Add a left outer join for msdb.dbo.log_shipping_secondary_databases

[Screenshot 09]

Add the additional property bag value

[Screenshot 10]

Modify the XPathQuery so that -IsLogShipping is a Healthy condition.

[Screenshots 11, 12, 13]

And then you have a nice custom Log Shipping and Mirroring aware Database Status Monitor. I tend to isolate the monitor and any dependencies into a standalone custom MP and then disable the out-of-box monitor.
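
The join from the first step can be sketched roughly as follows. This is not the MP’s actual embedded query: the column aliases are hypothetical, and only the IsLogShipping name comes from the steps above.

```sql
-- Sketch only: flag databases that are log shipping secondaries.
-- DatabaseName and DatabaseState are hypothetical aliases.
SELECT d.name       AS DatabaseName,
       d.state_desc AS DatabaseState,
       CASE WHEN lss.secondary_database IS NULL THEN 0 ELSE 1 END AS IsLogShipping
FROM sys.databases AS d
LEFT OUTER JOIN msdb.dbo.log_shipping_secondary_databases AS lss
       -- COLLATE avoids a collation conflict if this runs from a database
       -- whose collation differs from msdb's.
       ON lss.secondary_database COLLATE DATABASE_DEFAULT
        = d.name COLLATE DATABASE_DEFAULT;
```

A database that shows a state of RESTORING with IsLogShipping = 1 is the case the custom monitor should treat as Healthy, while RESTORING with IsLogShipping = 0 should still raise a Warning.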
