Automatically Closing Old SCOM Alerts using PowerShell

If you are a SCOM administrator, you are familiar with old alerts piling up in your SCOM console. No matter how much you clean up your monitoring environment, some alerts will be left over unless you manually remove them, depending on how you have configured alert resolution.

The following PowerShell script, from the Microsoft SCOM blog, allows you to automatically close old SCOM alerts. The logic behind this script is that it traverses the active alerts and checks each alert's age. If the alert age is greater than the number of days specified in the $alertsTobeClosedBefore variable, the alert is closed.

Remember to load the SCOM PowerShell Module before you run this script.

# Close any active alert older than this number of days
$alertsTobeClosedBefore = 5
$currentDate = Get-Date

Get-SCOMAlert | Where-Object { ($_.ResolutionState -ne 255) -and (($currentDate - $_.TimeRaised).TotalDays -ge $alertsTobeClosedBefore) } | Resolve-SCOMAlert
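
If the OperationsManager module is not already loaded in your session, a minimal sketch of loading it and connecting to your management group could look like the following; the management server name is a placeholder:

# Load the SCOM PowerShell module and connect to a management group.
# "scom-ms01" is a placeholder; replace it with your own management server name.
Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName "scom-ms01"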

 

Service Tags for Azure Network Security Groups

While creating NSG rules, have you ever faced a scenario where you need to group a bunch of IP addresses, sources, or destinations? Service tags in Azure denote groups of IP address prefixes that help simplify NSG rule creation. Service tags allow you to restrict (allow or deny) traffic to a specific Azure service, either globally or per region.

Microsoft manages and updates the list of underlying IP addresses for each service tag, eliminating the need to manually update your NSG rules each time a service's endpoint IP addresses change. This allows you to use service tags instead of specific IP addresses when creating security rules.

What’s available right now?

The following service tags are currently available for use in a security rule definition.

  • VirtualNetwork (Resource Manager) / VIRTUAL_NETWORK (classic): Encompasses the VNet address space (all CIDR ranges defined in the VNet), all connected on-premises address spaces, and peered VNets or VNets connected to a VNet gateway.
  • AzureLoadBalancer (Resource Manager) / AZURE_LOADBALANCER (classic): Use this tag if you are using an Internet-facing Azure Load Balancer. It translates to the Azure datacenter IP addresses where Azure's health probes originate.
  • Internet (Resource Manager) / INTERNET (classic): Encompasses any ingress or egress traffic from IP addresses outside the VNet that are reachable from the Internet. The address range includes the Azure-owned public IP address space as well.
  • AzureTrafficManager (Resource Manager only): Denotes the IP address space of the Azure Traffic Manager probe IPs.
  • Storage (Resource Manager only): Denotes the Azure Storage service IP range; specifying Storage as the value allows or denies traffic to storage, and you can also specify a region to allow or deny traffic to storage in that region only. This tag represents the Azure Storage service, not specific instances of the service (storage accounts).
  • Sql (Resource Manager only): Denotes the address prefixes of the Azure SQL Database and Azure SQL Data Warehouse services. Specifying Sql as the value allows or denies traffic to SQL. Regional restrictions are possible as with the Storage tag, and this tag also represents the Azure SQL service, not individual instances (databases or data warehouses).

Here is an example of a service tag in action. As you can see, we are using the Internet service tag to denote all inbound external traffic in this NSG rule.
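
For completeness, here is a hedged PowerShell sketch of adding a comparable rule with the Internet service tag using the AzureRM cmdlets. The NSG name, resource group, priority and port are illustrative placeholders:

# Placeholder names; replace "demo-nsg" and "demo-rg" with your own resources.
$nsg = Get-AzureRmNetworkSecurityGroup -Name "demo-nsg" -ResourceGroupName "demo-rg"

# Allow inbound HTTPS from any Internet source by using the Internet service tag
$nsg | Add-AzureRmNetworkSecurityRuleConfig -Name "Allow-HTTPS-Inbound" `
    -Description "Allow inbound HTTPS from the Internet" `
    -Access Allow -Protocol Tcp -Direction Inbound -Priority 200 `
    -SourceAddressPrefix Internet -SourcePortRange "*" `
    -DestinationAddressPrefix "*" -DestinationPortRange 443 |
    Set-AzureRmNetworkSecurityGroup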

Monitoring Azure Blobs with Azure Logic Apps

Azure Logic Apps is a very useful service that allows you to perform various enterprise integration and process automation tasks with on-premises and cloud tools alike. In this blog post I'm going to explain how we can leverage Azure Logic Apps to monitor Azure Blob containers.

The Requirement

I have a blob container called fx-incoming-files that needs to be checked periodically for any file older than 30 minutes. If any file has been in the container for 30 minutes or more, I need to alert the help desk.

Designing the Logic App

Following is the designer view of the Logic App that I have created. Let’s dissect each step in detail.

  • The Recurrence trigger runs the logic app every 30 minutes.
  • The next step is to list all the blobs in the fx-incoming-files container, using the List Blobs action.
  • First we need to check whether there are any files present in the container. Here is the condition to check that.
@greater(length(body('List_blobs_of_fx-incoming-files')?['value']), 0)
  • If the condition is met, we then need to check each file's age against the 30 minute threshold. Here is the code for that condition. It checks whether the last modified date/time of a blob is more than 30 minutes before the current date/time. (A rough PowerShell equivalent of both checks is sketched after this list.)
@greater(addMinutes(utcNow(), -30), item()?['LastModified'])
  • The next action creates an HTML table from the body of the result of the previous step.
  • The next action is an Office 365 action which sends an email with the result of the previous step in the body of the email. Note that we need to set Is HTML to True so that the email renders properly with the HTML content.
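
For readers who prefer PowerShell, here is a rough equivalent of the two checks above using the Azure.Storage cmdlets; this is only an illustrative sketch, and the storage account name and key are placeholders:

# Placeholder credentials; replace with your storage account name and key.
$context = New-AzureStorageContext -StorageAccountName "<account name>" -StorageAccountKey "<account key>"
$blobs = Get-AzureStorageBlob -Container "fx-incoming-files" -Context $context

# Only proceed if the container actually contains blobs
if ($blobs.Count -gt 0) {
    # Flag any blob whose last modified time is 30 minutes or more in the past
    $blobs | Where-Object { $_.LastModified -lt [DateTimeOffset]::UtcNow.AddMinutes(-30) } |
        Select-Object Name, LastModified
}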

As you can see, all it takes is figuring out the correct connectors and/or actions to use in your process flow. You can find the list of available connectors for Azure Logic Apps at this link.

Deploying Dependency Agent for Service Map with Azure VM Extension

The OMS Service Map solution automatically discovers application components, processes and services on Windows and Linux systems and maps the communication between them. In order to use this solution, you need to deploy the Service Map Dependency Agent for the respective OS. As you can see in the diagram below, this agent does not transmit any data by itself; rather it passes data to the OMS agent, which then publishes the data to OMS.

Image Courtesy Microsoft Docs

Microsoft has announced the release of a new Azure VM extension that allows you to automatically deploy the dependency agent for Service Map. Since Service Map supports both Windows and Linux, there are two variants of this extension. One thing you do have to remember is that the dependency agent still depends on (ah, the irony) the OMS agent, so your Azure VMs need to have the OMS agent installed and configured prior to deploying the dependency agent.

Installing the dependency agent with Azure VM extension (PowerShell)

Following is a PowerShell code snippet (from Microsoft) that will install the Service Map dependency agent on all VMs in an Azure resource group.

$version = "9.1"

$ExtPublisher = "Microsoft.Azure.Monitoring.DependencyAgent"

$OsExtensionMap = @{ "Windows" = "DependencyAgentWindows"; "Linux" = "DependencyAgentLinux" }

$rmgroup = "<Your Resource Group Here>"

Get-AzureRmVM -ResourceGroupName $rmgroup |

ForEach-Object {

""

$name = $_.Name

$os = $_.StorageProfile.OsDisk.OsType

$location = $_.Location

$vmRmGroup = $_.ResourceGroupName

"${name}: ${os} (${location})"

Date -Format o

$ext = $OsExtensionMap.($os.ToString())

$result = Set-AzureRmVMExtension -ResourceGroupName $vmRmGroup -VMName $name -Location $location `

-Publisher $ExtPublisher -ExtensionType $ext -Name "DependencyAgent" -TypeHandlerVersion $version

$result.IsSuccessStatusCode

}
Installing the dependency agent with Azure VM extension (ARM Template)

However, Microsoft hasn't published an official reference on how to deploy this Azure VM extension using ARM templates as of yet. My colleague MVP Stanislav Zhelyazkov has published a blog post on how to do this with ARM templates. You can find his reference template on GitHub.

Note

This VM extension is currently available in the West US region, and Microsoft is rolling it out to all Azure regions over the next couple of days.

Closing the doors to SMB v1 in Azure VMs

Remember the nasty ransomware attacks Petya and WannaCry, which spread almost like a zombie apocalypse earlier this year? Millions of devices around the world were affected by these attacks, which were designed to exploit loopholes in the good old Server Message Block version 1 (SMB v1) protocol. A lot of security experts have argued that having SMB v1 enabled by default in servers and PCs leaves consumers vulnerable to future attacks of this nature.

Starting this month, the Azure Security Team has closed the doors on the SMB v1 protocol for Windows OS images available in the Azure Marketplace. This means that when you deploy a VM with any of the below operating systems using an Azure Marketplace image, the SMB v1 protocol is disabled by default.

  • HPC Pack 2012 R2 Compute Node with Excel on Windows Server 2012 R2
  • HPC Pack 2012 R2 on Windows Server 2012 R2
  • Windows Server 2008 R2 SP1
  • Windows Server 2012 Datacenter
  • Windows Server 2012 R2 Datacenter
  • Windows Server 2016 – Nano Server
  • Windows Server 2016 Datacenter
  • Windows Server 2016 Datacenter – with Containers
  • [HUB] Windows Server 2008 R2 SP1
  • [HUB] Windows Server 2012 Datacenter
  • [HUB] Windows Server 2012 R2 Datacenter
  • [HUB] Windows Server 2016 Datacenter
  • [smalldisk] Windows Server 2008 R2 SP1
  • [smalldisk] Windows Server 2012 Datacenter
  • [smalldisk] Windows Server 2012 R2 Datacenter
  • [smalldisk] Windows Server 2016 Datacenter

This doesn’t mean that you can turn a blind eye to your existing Windows VMs in Azure. If you haven’t already disabled SMB v1 on those, you can refer to this TechNet article to learn how to do so. Regardless of where your servers and PCs are deployed (cloud or on-premises), Microsoft strongly recommends that you disable the SMB v1 protocol.
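
As a quick, hedged sketch of that guidance for a single server (test before rolling out widely), the SMB v1 server-side protocol can be checked and turned off with the built-in cmdlets on Windows Server 2012 and later:

# Check whether the SMB v1 server-side protocol is currently enabled
Get-SmbServerConfiguration | Select-Object EnableSMB1Protocol

# Disable SMB v1 on the server (Windows Server 2012 and later)
Set-SmbServerConfiguration -EnableSMB1Protocol $false -Force

# On Windows Server 2012 R2 and 2016 the SMB1 feature can also be removed entirely
Remove-WindowsFeature -Name FS-SMB1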

OK, what about Linux?

The Samba service, which enables the SMB protocol in Linux VMs, is not installed by default in any Azure Linux gallery image. If you install this service after provisioning a VM, the vulnerability report CVE-2017-7494 needs to be taken into consideration to determine whether there are any threats you should be alarmed about. It covers the vulnerabilities in Samba 3.5 onwards, whereas the current version is 4.6.7. However, it is always recommended to update to the latest version as soon as possible.

Do I ever need to use SMB v1?

SMB v1 was superseded by SMB v2 and v3 long ago. These versions are inherently more secure than v1 of the SMB protocol. However, there are dozens of products out there which still leverage the SMB v1 protocol. This TechNet article lists most of the products that still leverage SMB v1 at some point in their current life cycle.

 

 

Integrating OMS Service Map with SCOM

The OMS Service Map solution is capable of automatically discovering dependencies between application components on Windows and Linux servers to map the communication flow between your business services. It maps connections between servers, processes, and ports across any TCP-connected server in your datacentre. With this solution you won’t have to configure anything besides installing an agent. Microsoft has recently released a public preview version of the Service Map management pack, which allows you to automatically create distributed application dashboards in SCOM based on the dynamic dependency maps generated by the Service Map solution. In my opinion this is a very valuable integration, as organizations that use SCOM as their main monitoring tool can leverage the dynamic application dependency monitoring capabilities of OMS, whereas in the past they had to rely on third-party tools to visualize such dependencies.

What is inside the Service Map MP?

Like every other management pack, you need to first import the Service Map MP into SCOM. When you import the Service Map MP (Microsoft.SystemCenter.ServiceMap.mpb), the following dependent MPs will be installed on your SCOM management server(s).

  • Microsoft Service Map Application Views
  • Microsoft System Center Service Map Internal
  • Microsoft System Center Service Map Overrides
  • Microsoft System Center Service Map

This management pack is compatible with both SCOM 2016 & 2012 R2 versions.

Known Limitations of the Public Preview

At the beginning of this post I mentioned that this MP is still in preview, and hence there are a few issues and limitations with it as of now. I’m not sure whether Microsoft is going to address or change the behaviour of some of these when the MP reaches GA, specifically the limitations around updating the diagram views in the SCOM console.

  • One management group can be integrated with only one OMS workspace.
  • Adding servers to the Service Map Servers Group manually won’t immediately sync them; they will be synced from Service Map during the next synchronization schedule.
  • Making changes to the Distributed Application Diagrams created by this MP is not useful, because those changes will be overwritten by the Service Map solution in the next synchronization schedule.

If you are interested in trying out this new MP, the following resources might come in handy.

File Recovery Error in Azure Backup

While trying to perform an in-place file restore on an Azure VM using Azure Backup, I encountered an execution error. Azure Backup leverages a PowerShell script to mount the volumes of a protected VM. In my case, the following error was encountered when I executed the recovery script.

Microsoft Azure VM Backup - File Recovery
______________________________________________
Invoke-WebRequest : <HTML><HEAD><TITLE>Error Message</TITLE>
<META http-equiv=Content-Type content="text/html; charset=utf-8">
<BODY>
<TABLE><TR><TD id=L_dt_1><B>Network Access Message: The page cannot be displayed<B></TR></TABLE>
<TABLE><TR><TD height=15></TD></TR></TABLE>
<TABLE>
<TR><TD id=L_dt_2>Technical Information (for Support personnel)
<UL>
<LI id=L_dt_3>Error Code: 407 Proxy Authentication Required. Forefront TMG requires authorization to fulfill the
request. Access to the Web Proxy filter is denied. (12209)
<LI id=L_dt_4>IP Address: 10.31.8.16
<LI id=L_dt_5>Date: 8/8/2017 11:53:42 PM [GMT]
<LI id=L_dt_6>Server: XXXX.ab.abc.net
<LI id=L_dt_7>Source: proxy
</UL></TD></TR></TABLE></BODY></HTML>
At C:\Users\whewes_adm\Desktop\ILRPowershellScript.ps1:101 char:12
+ $output=Invoke-WebRequest -Uri "https://download.microsoft.com/download/E/1/4 ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 + CategoryInfo : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebExc
 eption
 + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
Invoke-WebRequest : <HTML><HEAD><TITLE>Error Message</TITLE>
<META http-equiv=Content-Type content="text/html; charset=utf-8">
<BODY>
<TABLE><TR><TD id=L_dt_1><B>Network Access Message: The page cannot be displayed<B></TR></TABLE>
<TABLE><TR><TD height=15></TD></TR></TABLE>
<TABLE>
<TR><TD id=L_dt_2>Technical Information (for Support personnel)
<UL>
<LI id=L_dt_3>Error Code: 407 Proxy Authentication Required. Forefront TMG requires authorization to fulfill the
request. Access to the Web Proxy filter is denied. (12209)
<LI id=L_dt_4>IP Address: 10.31.8.16
<LI id=L_dt_5>Date: 8/8/2017 11:53:42 PM [GMT]
<LI id=L_dt_6>Server: XXXX.ab.abc.net
<LI id=L_dt_7>Source: proxy
</UL></TD></TR></TABLE></BODY></HTML>
At C:\Users\whewes_adm\Desktop\ILRPowershellScript.ps1:102 char:12
+ $output=Invoke-WebRequest -Uri "https://download.microsoft.com/download/E/1/4 ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 + CategoryInfo : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebExc
 eption
 + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
Unable to access the recovery point. Please make sure that you have enabled access to Azure public IP addresses on the
outbound port 3260 and 'https://download.microsoft.com/'

One thing I noticed was that it was complaining about outbound access to Azure public IP addresses on port 3260. The VMs were connected to the on-premises environment via a dedicated ExpressRoute circuit, so to my knowledge there were no issues with whitelisting Azure public IP addresses. Also, there were no NSGs controlling the traffic in the subnet where this VM was deployed.

I had a look at another server running in an on-premises VMware cluster and noticed that there was an HTTP proxy present in the environment. Once I added the proxy settings on the VM, I could execute the recovery script without any hassle.
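
For reference, one way to supply proxy settings to the script's Invoke-WebRequest calls is to set the default web proxy in the same PowerShell session before running the recovery script. This is only a sketch; the proxy address below is hypothetical and should be replaced with whatever your environment uses:

# Hypothetical proxy address; replace it with the proxy used in your environment.
$proxy = New-Object System.Net.WebProxy("http://proxy.ab.abc.net:8080")

# Pass the logged-on user's credentials to the proxy (it required authentication in my case)
$proxy.Credentials = [System.Net.CredentialCache]::DefaultNetworkCredentials
[System.Net.WebRequest]::DefaultWebProxy = $proxy

# Then run the downloaded recovery script in the same session
.\ILRPowershellScript.ps1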

The article “Prepare your environment to back up Azure virtual machines”, published in the Microsoft documentation, explains the network configuration required for Azure Backup in case your environment has policies governing outbound Internet connectivity. I recommend having a look at that article before planning your Azure Backup deployment to protect Azure VMs.

Domain Join Error | JsonADDomainExtension in ARM

Recently I have been working on an ARM template to create a Windows Server 2012 R2 VM from a managed disk image and join it to a Windows domain. I used a VM extension called JsonADDomainExtension to perform the domain join task. However, my first three attempts were in vain: the VM was not added to the domain and I saw an error in the extension deployment.

I examined the ADDomainExtension log file, which is available at C:\WindowsAzure\Logs\Plugins\Microsoft.Compute.JsonADDomainExtension\1.0\ADDomainExtension.log, and noticed the error below.

2017-07-31T05:04:47.0833850Z [Info]: Joining Domain 'child.abc.net'
2017-07-31T05:04:47.0833850Z [Info]: Joining Domain 'child.abc.net'
2017-07-31T05:04:47.0833850Z [Info]: Get Domain/Workgroup Information
2017-07-31T05:04:48.0521988Z [Info]: Current domain:  (), current workgroup: WORKGROUP, IsDomainJoin: True, Target Domain/Workgroup: child.abc.net.
2017-07-31T05:04:48.0521988Z [Info]: Domain Join Path.
2017-07-31T05:04:48.0521988Z [Info]: Current Domain name is empty/null. Try to get Local domain name.
2017-07-31T05:04:48.0521988Z [Info]: In AD Domain extension process, the local domain is: ''.
2017-07-31T05:04:48.0521988Z [Info]: Domain Join will be performed.
2017-07-31T05:05:06.1756824Z [Error]: Try join: domain='child.abc.net', ou='OU=Test Objects,DC=child,DC=abc,DC=net', user='abc\SVC_Azure_Srv_Joindom', option='NetSetupJoinDomain, NetSetupAcctCreate' (#3:User Specified), errCode='1326'.
2017-07-31T05:05:15.4067523Z [Error]: Try join: domain='child.abc.net', ou='OU=Test Objects,DC=child,DC=abc,DC=net', user='abc\SVC_Azure_Srv_Joindom', option='NetSetupJoinDomain' (#1:User Specified without NetSetupAcctCreate), errCode='1326'.
2017-07-31T05:05:15.4223371Z [Error]: Computer failed to join domain 'child.abc.net' from workgroup 'WORKGROUP'.
2017-07-31T05:05:15.4223371Z [Info]: Retrying action after 3 seconds, at attempt 1 out of '3'.

The NetSetup.log, available at %windir%\debug\netsetup.log, reports the error below.

07/31/2017 05:05:14:253 NetpProvisionComputerAccount:
07/31/2017 05:05:14:253 NetpProvisionComputerAccount:
07/31/2017 05:05:14:253 lpDomain: child.abc.net
07/31/2017 05:05:14:253 lpHostName: AUETARMVM01
07/31/2017 05:05:14:253 lpMachineAccountOU: OU=Test Objects,DC=child,DC=abc,DC=net
07/31/2017 05:05:14:253 lpDcName: mydc01.child.abc.net
07/31/2017 05:05:14:253 lpMachinePassword: (null)
07/31/2017 05:05:14:253 lpAccount: orica\SVC_Azure_Srv_Joindom
07/31/2017 05:05:14:253 lpPassword: (non-null)
07/31/2017 05:05:14:253 dwJoinOptions: 0x1
07/31/2017 05:05:14:253 dwOptions: 0x40000003
07/31/2017 05:05:15:406 NetpLdapBind: ldap_bind failed on mydc01.child.abc.net: 49: Invalid Credentials
07/31/2017 05:05:15:406 NetpJoinCreatePackagePart: status:0x52e.
07/31/2017 05:05:15:406 NetpAddProvisioningPackagePart: status:0x52e.
07/31/2017 05:05:15:406 NetpJoinDomainOnDs: Function exits with status of: 0x52e
07/31/2017 05:05:15:406 NetpJoinDomainOnDs: status of disconnecting from '\\mydc01.child.abc.net': 0x0
07/31/2017 05:05:15:406 NetpResetIDNEncoding: DnsDisableIdnEncoding(RESETALL) on 'child.abc.net' returned 0x0
07/31/2017 05:05:15:406 NetpJoinDomainOnDs: NetpResetIDNEncoding on 'child.abc.net': 0x0
07/31/2017 05:05:15:406 NetpDoDomainJoin: status: 0x52e
07/31/2017 05:05:18:432 -----------------------------------------------------------------

The issue was obvious after that: the service account used for the domain join was specified incorrectly. It should have been specified as child.abc.net\SVC_Azure_Srv_Joindom. Once this was corrected I was able to re-deploy the ARM template without any issue, and the domain join operation was successful.
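
For context, here is a hedged PowerShell sketch of deploying the same extension outside of an ARM template, showing the corrected user format. The resource group, location, extension version and password are placeholders for illustration:

# Placeholder values; replace the resource group, location, version and password with your own.
$settings = @{
    "Name"    = "child.abc.net"                          # domain to join
    "OUPath"  = "OU=Test Objects,DC=child,DC=abc,DC=net"
    "User"    = "child.abc.net\SVC_Azure_Srv_Joindom"    # note the fully qualified domain prefix
    "Restart" = "true"
    "Options" = 3                                         # NetSetupJoinDomain + NetSetupAcctCreate
}
$protectedSettings = @{ "Password" = "<service account password>" }

Set-AzureRmVMExtension -ResourceGroupName "<resource group>" -VMName "AUETARMVM01" -Location "<location>" `
    -Name "joindomain" -Publisher "Microsoft.Compute" -ExtensionType "JsonADDomainExtension" `
    -TypeHandlerVersion "1.3" -Settings $settings -ProtectedSettings $protectedSettings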

If you want to know more about how to leverage the “JsonADDomainExtension” in your ARM templates, the following article provides an excellent overview.

Azure ARM: VM Domain Join to Active Directory Domain with “JoinDomain” Extension

 

 

Data corruption issue with NTFS sparse files in Windows Server 2016

Microsoft has released a new patch, KB4025334, which prevents a critical data corruption issue with NTFS sparse files in Windows Server 2016. This patch prevents possible data corruption that could occur when using Data Deduplication in Windows Server 2016. The update also prevents this issue in all other applications and Windows components that leverage sparse files on NTFS in Windows Server 2016.

Although this is an optional update, Microsoft recommends installing this KB to avoid corruption in Data Deduplication, although the KB doesn’t provide a way to recover from existing data corruption. The reason is that NTFS incorrectly removes in-use clusters from the file, and there is no way to identify afterwards which clusters were incorrectly removed. Furthermore, this update will become a mandatory patch in the “Patch Tuesday” release cycle in August 2017.

Since this issue is hard to notice, you won’t be able to detect it by monitoring the weekly Dedup integrity scrubbing job. To overcome this challenge, the KB also includes an update to chkdsk which allows you to identify which files are already corrupted.

Identifying corrupted NTFS sparse files with chkdsk in KB4025334

  • First, install KB4025334 on the affected servers and restart them. Keep in mind that if your servers are in a failover cluster, this patch needs to be applied to all the servers in the cluster.
  • Execute chkdsk in read-only mode, which is the default mode for chkdsk.
  • For any possibly corrupted files, chkdsk will provide output similar to the line below, where 20000000000f3 is the file id. Make a note of all the file ids in the output.
The total allocated size in attribute record (128, "") of file 20000000000f3 is incorrect.
  • Then you can use fsutil to query the corrupted files by their ids, as in the example below.
D:\affectedfolder> fsutil file queryfilenamebyid D:\ 0x20000000000f3
  • Once you run the above command, you should get output similar to the line below. D:\affectedfolder\TEST.0 is the corrupted file in this case.
A random link name to this file is \\?\D:\affectedfolder\TEST.0

Instant File Recovery with Azure Backup

Microsoft Azure now allows file- and folder-level recovery with Azure Backup. This feature has been available in preview for both Windows and Linux VMs and is now generally available. Previously, Azure Backup only allowed VM-level recovery, and those who wanted file-level recovery had to leverage solutions such as DPM or Azure Backup Server to achieve their protection goals.

Restore-as-a-Service in Azure Backup allows you to recover files or folders instantly without deploying any additional components. Azure Backup leverages an iSCSI-based approach to open and mount application files directly from its recovery points. This eliminates the need to restore the entire VM just to recover a few files.

RaaS in Action

In the example below I’m going to explain how you can recover files or folders from a Windows IaaS VM in Azure.

  • In the recovery services vault, under Backup items, select the VM that you want to recover files from and click the File Recovery option.

  • Select the recovery point as in (1) and download the executable that allows you to browse and recover files as in (2). Run this as an administrator; you will have to provide the password as in (3) to execute the file.

  • This script can be executed on any machine that has the same (or a compatible) operating system as the backed-up VM. Unless the protected Azure VM uses Windows Storage Spaces (for Windows VMs) or LVM/RAID arrays (for Linux VMs), you can run the executable/script on the same VM. If it does, run it on another machine with a compatible operating system, as listed below.
Server OS and compatible client OS:
  • Windows Server 2012 R2: Windows 8.1
  • Windows Server 2012: Windows 8
  • Windows Server 2008 R2: Windows 7
Supported Linux versions (run the script on the same or a compatible distribution):
  • Ubuntu 12.04 and above
  • CentOS 6.5 and above
  • RHEL 6.7 and above
  • Debian 7 and above
  • Oracle Linux 6.4 and above

If you are restoring from a Linux VM, you need bash version 4 or above and Python version 2.6.6 or above to execute the script.

  • When you run the script, the volumes are mounted in the client OS that you are using and are assigned drive letters that differ from the ones on the original VM, so make sure you identify the newly attached drives. You can then browse them in Windows Explorer and copy the files you need to an alternate location.

  • Finally, after restoring your files/folders, unmount the drives: open the File Recovery blade in the Azure portal and select Unmount Disks as in (4).