Tag Archives: Operations Manager

How do I: Monitor Azure AD Connect Sync with SCOM?

Back in the Fall I had a question regarding monitoring Azure AD Connect Sync with SCOM. The preferred solution is generally Azure AD Connect Health, and if you have SCOM you couple that with various on-prem AD/ADFS Management Packs to monitor your hybrid environment end-to-end.

I love that our product teams who build cloud services are taking a proactive approach to monitoring and thinking about it as integral to the product development cycle. A part of me would love to see a Product Group Management Pack for Azure AD Connect Sync, but I also understand that in this new cloud-first world you have to focus your resources carefully, and sometimes that means developing solutions that can potentially benefit a broader pool of customers.

The biggest challenge that I have seen for some Hybrid Cloud customers is that the built-in notification mechanism of these monitoring solutions is e-mail only. Many of the customers I work with have fairly advanced notification/ticketing systems, and while e-mail is one avenue of alerting, it isn’t the only one. Customers with SCOM have often put in the legwork of integrating SCOM with their notification system such that certain alerts are e-mails, others are tickets, and some might kick off a page or text to wake up an engineer at 2 AM.

With some cloud services I can understand the argument that the paging at 2 AM is going to happen on the Microsoft side, so your engineers can continue to sleep peacefully. But with a hybrid solution like Azure AD Connect Sync, that isn’t really the case. You can absolutely have a problem that only your engineers can fix, and you may want to have the flexibility to leverage your existing notification system. You could certainly explore integrating directly between your ticketing/notification system and Azure AD Connect Health, and for some customers this might be the correct path. No need to add an extra hop/point of failure if you don’t need to. But for those who have already invested heavily in SCOM, it would be nice to have a management pack that could provide basic integration with minimal development effort.

I had started poking around the problem in the Fall, but I hadn’t had time to sit down and write an MP to address it. It was basically a lot of pseudo code floating around in my head that I was pretty sure would work if I ever sat down and wrote it. I have a nice week of vacation ahead of me starting today, but I had promised some colleagues I would build an MP if I had some free time, so I spent this past weekend putting together a Management Pack that I believe should address this problem.

The MP is still very much in beta form, and it falls under the usual AS-IS/test heavily/use at your own risk disclaimer that accompanies all community-based MPs. I am actively seeking feedback and will come out with additional versions as time allows, so if you have suggestions please feel free to send them my way. If you DM @OpsConfig on Twitter or leave a comment, I will respond via e-mail.

The core functionality of the MP is pretty simple. Every 15 minutes it makes an API call to your instance of Azure AD Connect Sync Health for alerts. If there is a new warning alert it will generate a corresponding warning alert in SCOM. If there is a new critical alert it will generate a corresponding critical alert. If an alert closes in Azure AD Connect Health the MP will automatically detect the resolution and close out the alert in SCOM. Nothing fancy, but it works and is pretty lightweight.
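
To make the raise/close logic concrete, here is a minimal sketch in Python of the reconciliation step that runs on each polling cycle. This is not the code inside the MP. The `HealthAlert` fields, the severity strings, and the `reconcile` function are all hypothetical names for illustration, and the actual API call to Azure AD Connect Health is left out entirely.

```python
from dataclasses import dataclass

# Hypothetical mapping from a Connect Health alert level to a SCOM severity.
SEVERITY_MAP = {"warning": "Warning", "critical": "Critical"}


@dataclass(frozen=True)
class HealthAlert:
    """Illustrative shape of one active alert returned by a poll (assumed fields)."""
    alert_id: str
    level: str  # "warning" or "critical" in this sketch
    name: str


def reconcile(health_alerts, open_scom_alert_ids):
    """Compare one poll result against the alerts already open in SCOM.

    Returns (to_raise, to_close):
      to_raise -- (alert_id, scom_severity, name) for alerts not yet in SCOM
      to_close -- ids open in SCOM that are no longer active upstream
    """
    active_ids = {a.alert_id for a in health_alerts}
    to_raise = [
        (a.alert_id, SEVERITY_MAP[a.level], a.name)
        for a in health_alerts
        if a.alert_id not in open_scom_alert_ids and a.level in SEVERITY_MAP
    ]
    to_close = sorted(i for i in open_scom_alert_ids if i not in active_ids)
    return to_raise, to_close


if __name__ == "__main__":
    polled = [
        HealthAlert("a1", "warning", "Sync stale"),
        HealthAlert("a2", "critical", "Service down"),
    ]
    # "a1" is already open in SCOM; "a3" has been resolved upstream.
    print(reconcile(polled, {"a1", "a3"}))
```

Because each cycle only compares the current poll against what is already open, an alert that resolves in Connect Health gets closed in SCOM on the next cycle, which is where the 15 minute window comes from.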

I also added in a custom class/monitor that looks for instances of the Microsoft Azure AD Sync Service:

AAD Connect Health will monitor this too, but not as close to real time as SCOM does. I would rather know within 60 seconds if Sync is down than have to wait, so it is a nice better-together story to have this working in conjunction with Azure AD Connect Sync Health.

In addition, the MP monitors the core services which feed Azure AD Connect Sync Health:

Again, if these services go down you will eventually be alerted by AAD Connect Sync Health, but why wait? Since these services are delayed start, I built a custom Unit Monitor Type that gives them a little more leeway: we check the service state every 30 seconds, but unlike the default NT Service Unit Monitor Type we wait until we have detected 6 consecutive samples of the service being stopped before we alert. Since these monitors are tied to the class based on the presence of the Azure AD Sync service, they will also alert if you have a server with the sync service which doesn’t have the Azure AD Connect Health Sync Agent/services installed. (If this is an issue, you can always shut the monitors off, but without those services installed and running you are losing 95% of the functionality provided by this pack.)
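
The consecutive-sample idea is easy to show outside of SCOM. Below is a small Python sketch of the counting logic only. The real Unit Monitor Type is defined in the MP, not in Python, and the class name, method name, and state strings here are made up for illustration.

```python
class ConsecutiveStopMonitor:
    """Go Critical only after `threshold` stopped samples in a row (sketch)."""

    def __init__(self, threshold=6):
        self.threshold = threshold
        self.consecutive_stopped = 0

    def observe(self, service_running):
        """Record one sample (taken every 30 seconds in the MP) and return the state."""
        if service_running:
            # Any running sample resets the count, so a slow delayed start
            # or a quick restart never reaches the threshold.
            self.consecutive_stopped = 0
            return "Healthy"
        self.consecutive_stopped += 1
        if self.consecutive_stopped >= self.threshold:
            return "Critical"
        return "Healthy"


if __name__ == "__main__":
    monitor = ConsecutiveStopMonitor(threshold=6)
    samples = [False] * 5 + [True] + [False] * 6
    for running in samples:
        print(running, monitor.observe(running))
```

With a 30 second interval and a threshold of 6, a service has to stay stopped for somewhere around three minutes before the monitor changes state, which is the leeway the delayed start services need.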

To get started with the pack there are some prerequisites:

  • You need Azure AD Connect Health to be installed and configured. I won’t go into the details for that, but you can find everything you need to know via the awesome guide/videos which can be found here:

https://docs.microsoft.com/en-us/azure/active-directory/connect-health/active-directory-aadconnect-health

  • For Authentication I leverage the Active Directory Authentication Library (ADAL). The key components being these two .dlls:

If you download and install the Azure PowerShell module, this should give you everything you need:

https://aka.ms/webpi-azps

You will need to install this on each of the management servers, as I leverage the All Management Servers Resource Pool as the source of API calls to allow for high-availability. (If having the ability to have a dedicated AAD Connect Health Watcher is more desirable than the AMSRP just let me know and I can make another version of the MP which can support this.)

Your management servers will need to allow communication to the following URLs through both Windows and network firewalls:

https://management.azure.com/
https://login.windows.net/
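
If you want a quick sanity check of that connectivity from a management server, something like the following Python sketch can help. It is not part of the MP, and `check_endpoints` is a hypothetical helper name. It only confirms that a TCP connection on port 443 can be opened, so it will not catch TLS inspection or proxy authentication problems.

```python
import socket
from urllib.parse import urlparse

# The two endpoints listed above.
REQUIRED_URLS = [
    "https://management.azure.com/",
    "https://login.windows.net/",
]


def check_endpoints(urls, connect=socket.create_connection, timeout=5.0):
    """Return {hostname: True/False} for whether a TCP connection could be opened.

    `connect` is injectable so the function can be exercised without a network.
    """
    results = {}
    for url in urls:
        parsed = urlparse(url)
        host = parsed.hostname
        port = parsed.port or 443  # https default when the URL has no explicit port
        try:
            conn = connect((host, port), timeout=timeout)
            conn.close()
            results[host] = True
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results[host] = False
    return results


if __name__ == "__main__":
    for host, ok in check_endpoints(REQUIRED_URLS).items():
        print(f"{host}: {'reachable' if ok else 'BLOCKED'}")
```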

  • You will need a user account with necessary access to Azure AD Connect Health Sync.
  • When logged into https://portal.azure.com, navigate to Azure Active Directory in the left-hand pane.

Then select Azure AD Connect

Select Azure AD Connect Health

Right now you can see that my environment is unhealthy as I have intentionally stopped the Azure Active Directory Connect Sync Monitoring to force an error condition:

Click Users – Add, select a role, and add the user that we will later add to a Run As Profile in SCOM:

As this is still early in the testing phase I have lazily done most of my testing with an account with Owner privs. I believe the Monitoring Reader Service Role should be sufficient (subsequent testing shows that this works; see comments for details), but I need to do some more testing to ensure that will always hold true.

There is one more prerequisite. Click Azure Active Directory Connect (Sync):

Then click the service name that you want to monitor:

Take note of the URL in your browser bar, as you will need to copy the small portion highlighted in yellow for an overridable parameter in SCOM:

Once you have all the above prerequisites in place you can download and import the MP from here:

Azure AD Connect Sync Custom MP

Once imported you will need to add your Azure AD Connect account configured above to a custom Run As Profile.

I use an account configured with Basic Auth that I then distribute to my management servers.

Once this is in place we need to modify the core rule that drives the MP:

Right-click Azure AD Connect Rule – Overrides – Override the Rule – For all objects of class: All Management Servers Resource Pool

Override AADSync URL (the portion of the URL highlighted in yellow that you copied before) – Add your AdTenant – Set the rule to enabled.

Then any time an alert gets generated in Azure AD Connect Sync Health:

A corresponding alert will be generated in SCOM:

Once the alert closes in AAD Connect Sync Health it will close out in SCOM within 15 minutes.

When I get back from vacation I will put together a post or a video walking through the underlying mechanics of exactly how the MP works, and then I will most likely post the Visual Studio project files on GitHub. But in the meantime you are welcome to download and test it out from TechNet Gallery. Now I am off to my vacation. Cheers!


On The Importance Of Building Test Environments

One of the things I didn’t quite grasp when I first started using SCOM a few years back was the importance of test environments. SCOM was this bright and shiny new tool that was going to help proactively monitor our servers, increase uptime, and as long as I only installed Microsoft approved Management Packs everything would be alright. This was admittedly extremely naive, but it was a good starting point. I was enthusiastic as well as fortunate enough to learn that this was a terrible idea long before making a critical mistake.

SCOM is an incredibly powerful tool, but it has to be used and implemented intelligently:

-Installation guides must be read.

-MPs should be evaluated in Test or Dev environments first. (If you don’t have a test environment, build one.)

-Blogs should be scoured for relevant info.

-Management Packs should be installed in production because they provide value, not just because you happen to have the associated product installed.

Anytime an engineer or admin asks to have a shiny new management pack installed in production and doesn’t want to test it first, I remember this slide from a talk I stumbled across from Microsoft’s Management Pack University entitled “Getting Manageability Right,” given by Nistha Soni, a program manager on the Ops Manager team at Microsoft:

Slide: Getting Manageability Right (Nistha Soni)

The talk was for the different Microsoft product teams to help them think about how to build better management packs that are useful to their customers. If an MP reduces total cost of ownership this is a good thing; if it increases TCO then we have a problem. This slide was referencing an iteration of a Microsoft MP (name omitted to protect the guilty) which provided feedback that, while potentially useful for a developer at Microsoft, was also inundating their customers/operators with alerts.

Building a useful MP is a delicate balancing act, and it’s important to remember that even the ones made by Microsoft are essentially a work in progress. Each successive iteration tends to get better, but if you just import into production without testing and research you are asking for trouble.

The talk itself is an interesting look at how Microsoft thinks about monitoring and building management packs and is still available here.
