Best Practices: Agent Remediation Tool

This is a proof-of-concept script, a mix of PowerShell with some .NET for the GUI, that can serve as an automated playbook for agent remediation.

Typically I prefer to remediate agents via the SCOM console, but there are instances where an agent is locked down such that remote management is not possible, and the SCOM team may not have the access needed to remote into a server and fix an agent. This script empowers non-SCOM sysadmins, DBAs, and others to perform basic troubleshooting on their agents without the fear of accidentally deleting the wrong thing.

-The script must be run in PowerShell as an administrator

-This script is compatible with PowerShell 2.0, 3.0 & 4.0

*There are some dependencies on the .NET Framework. The script is designed for .NET 3.5. In testing it does work with .NET 2.0, though it will throw some fun red errors for certain noncritical display elements that fail to load.

This script is designed for SCOM 2012 R2 Agents

opsconfigtool

Functionality:

1. Restart SCOM Agent (This restarts the Microsoft Monitoring Agent)

2. Flush SCOM Agent Cache (This stops the SCOM agent, clears the Health Service State agent caches, starts the agent, and rebuilds the agent cache)

3. Uninstall SCOM Agent (This queries WMI to determine the GUID associated with the SCOM agent installation and then passes this GUID to an automated uninstall)

4. Install SCOM Agent (This is a placeholder for either manual agent install instructions, or it can be adapted to call a function to kick off a command-line based agent install assuming agent media is on an accessible UNC file share)
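
As a rough illustration of options 2 and 3 above, here is a minimal sketch of what a cache flush and a GUID lookup could look like. It is not taken from the tool itself. The service name, cache path, product name, and all function names are assumptions for a default SCOM 2012 R2 agent install, and the msiexec switches are simply one common way to run a silent uninstall.

```powershell
# Assumed defaults for a SCOM 2012 R2 agent; verify both on your own servers.
$agentService = 'HealthService'
$agentCache   = 'C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State'

# Hypothetical helper: stop the agent, empty the cache folder, start the agent.
function Clear-ScomAgentCache {
    param(
        [string]$ServiceName,
        [string]$CachePath
    )
    Stop-Service -Name $ServiceName -Force
    if (Test-Path -Path $CachePath) {
        # Remove the contents only; the agent rebuilds them on startup.
        Remove-Item -Path (Join-Path $CachePath '*') -Recurse -Force
    }
    Start-Service -Name $ServiceName
}

# Hypothetical helper: look up the MSI product code (GUID) through WMI.
# The product name is an assumption; Win32_Product queries can also be slow.
function Get-ScomAgentProductCode {
    $product = Get-WmiObject -Class Win32_Product -Filter "Name = 'Microsoft Monitoring Agent'"
    return $product.IdentifyingNumber
}

# Hypothetical helper: build the msiexec arguments for a silent uninstall.
function Get-UninstallArgument {
    param([string]$ProductCode)
    # /x uninstalls the product with this GUID, /qn suppresses the installer UI.
    return "/x $ProductCode /qn"
}

# Example usage (commented out so nothing runs by accident):
# Clear-ScomAgentCache -ServiceName $agentService -CachePath $agentCache
# $guid = Get-ScomAgentProductCode
# Start-Process -FilePath 'msiexec.exe' -ArgumentList (Get-UninstallArgument -ProductCode $guid) -Wait
```

Keeping the destructive calls commented out means the functions can be loaded and reviewed before anything touches the agent.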

*It appears I neglected to include the link to the script. It can be downloaded here: TechNet Gallery*

 


Audio: How Not to pitch a Billionaire (Episode I)

Something I have always found fascinating is how companies are formed: the path from great idea and entrepreneurial spirit to success or failure. Below is the first episode of an interesting startup podcast founded by Alex Blumberg of This American Life and Planet Money fame. Give it 20 minutes of your time; I think you will enjoy it.

And if you do, go check out the whole series here:

http://hearstartup.com/


Comic Relief: PowerShell 5.0

powershell

Happy Friday All!


The original, unaltered comic can be found here: Abstruse Goose (Please be mindful of AG’s Creative Commons license)


Best Practices: SCOM Health Check Script/Report OpsConfig ed v1.0

This weekend I came across a fantastic SCOM HealthCheck Script/Report written by Tim Culham of http://www.culham.net

I would strongly encourage you to visit the site and check out his original script as he did all the heavy lifting.

http://www.culham.net/powershell/scom-2012-scom-2012-r2-daily-check-powershell-script-html-report/

I decided to extend/tweak his script a bit by adding a number of the more in-depth SQL queries that I frequently ask customers to run when troubleshooting performance issues with the OpsDB and DW. Many of the queries are modified versions of the KH Useful SQL Queries, though there are a few that might be new to all of you. This sacrifices some of the speed and elegance of Tim’s script, but the information that you get back is invaluable.

This script should be run as administrator from a SCOM Management Server by an account that has permissions to connect and read from the Ops & DW DBs. You can just run the script without inputting any parameters. It will open the report upon script completion. (My version can take anywhere from 30 seconds to 10 minutes to run depending on the size/performance of your environment)

*At times this script runs queries directly against the OpsDB. While this is a common practice for troubleshooting and diagnosing issues, it is also technically not supported. The script is provided AS-IS without warranty of any kind.*

My version of the script can be downloaded here:

https://gallery.technet.microsoft.com/SCOM-Health-Check-fd2272ec

What this version of the script will give you (some of these are features carried over from the original; many are added):

01. Version/Service Pack/Edition of SQL for each SCOM DB Server
02. Disk Space Info for Ops DB, DW DB, and associated Temp DB’s
03. Database Backup Status for all DB’s except Temp.
04. Top 25 Largest Tables for Ops DB and DW DB
05. Number of Events Generated Per Day (Ops DB)
06. Top 10 Event Generating Computers (Ops DB)
07. Top 25 Events by Publisher (Ops DB)
08. Number of Perf Insertions Per Day (Ops DB)
09. Top 25 Perf Insertions by Object/Counter Name (Ops DB)
10. Top 25 Alerts by Alert Count
11. Alerts with a Repeat Count higher than 200
12. Stale State Change Data
13. Top 25 Monitors Changing State in the last 7 Days
14. Top 25 Monitors Changing State By Object
15. Ops DB Grooming History
16. Snapshot of DW Staging Tables
17. DW Grooming Retention
18. Management Server checks (Works well on-prem, but seems to have some issues with gateways due to remote calls. If you see some errors flash by, have no fear, though depending on firewall settings I wouldn’t necessarily trust the results the report returns for a Gateway server)
19. Daily KPI
20. MP’s Modified in the Last 24 hours
21. Overrides in Default MP Check
22. Uninitialized Agents
23. Agent Stats (Healthy, Warning, Critical, Uninitialized, Total)
24. Agent Pending Management Summary
25. Alert Summary
26. Servers in Maintenance Mode
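
To give a feel for how checks like the ones above can pull data from a database in PowerShell, here is a minimal sketch. It is not lifted from the health check script. The helper name, server name, and database name are placeholders, and the query is a generic SQL Server catalog query for the largest tables rather than the exact one the report uses.

```powershell
# Generic catalog query: the 25 largest tables by allocated space, in MB.
$largestTablesQuery = @'
SELECT TOP 25
    t.name AS TableName,
    SUM(a.total_pages) * 8 / 1024 AS TotalSpaceMB
FROM sys.tables t
JOIN sys.indexes i ON t.object_id = i.object_id
JOIN sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN sys.allocation_units a ON p.partition_id = a.container_id
GROUP BY t.name
ORDER BY TotalSpaceMB DESC
'@

# Hypothetical helper: run a query with Windows authentication and
# return the result set as a DataTable.
function Invoke-SqlQuery {
    param(
        [string]$ServerInstance,
        [string]$Database,
        [string]$Query
    )
    $connectionString = "Server=$ServerInstance;Database=$Database;Integrated Security=True"
    $connection = New-Object System.Data.SqlClient.SqlConnection $connectionString
    try {
        $command = $connection.CreateCommand()
        $command.CommandText = $Query
        $adapter = New-Object System.Data.SqlClient.SqlDataAdapter $command
        $table = New-Object System.Data.DataTable
        # Fill opens and closes the connection on its own.
        [void]$adapter.Fill($table)
        # The leading comma stops PowerShell from unrolling the table into rows.
        return ,$table
    }
    finally {
        $connection.Dispose()
    }
}

# Example usage (server and database names are assumptions):
# $result = Invoke-SqlQuery -ServerInstance 'SQLSERVER01' -Database 'OperationsManager' -Query $largestTablesQuery
# $result | Format-Table -AutoSize
```

Because the query reads straight from the database, the same support caveat mentioned earlier applies to any variation of it.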

Report Output: (Only grabbing a screenshot of the first few pages as you get the basic idea)

Report Output

report output 2

 

 


Talks: Tech Ed Europe 2014 Keynote

Also be sure to check out the other talks as they are uploaded, as well as the Live Stream:

http://channel9.msdn.com/Events/TechEd/Europe/2014

 

Troubleshooting: SCOM reports yield weird data/what the heck does 9.221E+07 mean?

Eventually when running a report in SCOM you are going to end up with a report like the one below.

01

At first glance everything looks okay. But then you start looking at the data that was returned and it can sometimes be a little confusing.

02

Usually the questions I get from customers range from “I think this report is broken” to “what the heck does 9.221E+07 mean?”

Fear not, reporting is not broken and 9.221E+07 is not nearly as confusing as it may seem. Basically, the values being returned have so many digits that, in order to display them in a meaningful way, the system presents the data in a shorthand commonly known as scientific notation. All you need to understand is that +07 indicates the number of places the decimal point needs to be moved to the right to display the full number.

So 9.221E+07 = 92210000

And if we look at the top of the chart, we will note that the particular performance counter we are reporting on is returned in Bytes, so we are dealing with:

92210000 Bytes

For those of you who, like me, are not particularly mathematically inclined and prefer to leave conversions to someone else, I recommend using the wonderful built-in functionality of PowerShell.

If you type 9.221E+07 and press Enter, it will automatically output the full value for you:

03

If the original unit (in this case, Bytes) is not your unit of choice and you want to know what the value is in MB, just enter the value in scientific notation and divide by 1MB (written without a space):

9.221E+07  / 1MB

04

The same goes for GB:

9.221E+07  / 1GB

05
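
If you find yourself doing these conversions often, they can be wrapped in a small helper. The sketch below is only one way to do it; the function name and parameters are made up for illustration. Note that PowerShell's 1KB, 1MB, and 1GB suffixes are binary multiples (1MB is 1,048,576 bytes).

```powershell
# Hypothetical helper: convert a byte count (scientific notation is fine)
# into KB, MB, or GB, rounded to a chosen number of decimal places.
function Convert-ByteValue {
    param(
        [double]$Bytes,
        [ValidateSet('KB', 'MB', 'GB')]
        [string]$Unit = 'MB',
        [int]$Decimals = 2
    )
    switch ($Unit) {
        'KB' { $divisor = 1KB }
        'MB' { $divisor = 1MB }
        'GB' { $divisor = 1GB }
    }
    [math]::Round($Bytes / $divisor, $Decimals)
}

Convert-ByteValue -Bytes 9.221E+07 -Unit MB               # 87.94
Convert-ByteValue -Bytes 9.221E+07 -Unit GB -Decimals 3   # 0.086
```

Rounding is optional; it just keeps the output easier to read than the full floating-point result.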

 
