All posts by Janaka Rangama

Deploying Dependency Agent for Service Map with Azure VM Extension

OMS Service Map solution automatically discovers application components, processes and services on Windows and Linux systems and maps the communication between them. In order to use this solution, you need to deploy the Service Map Dependency Agent for the respective OS. As you can see in the below diagram, this agent does not transmit any data by itself, but rather transmits data to the OMS agent which will then publish the data to OMS. 

Image Courtesy Microsoft Docs

Microsoft has announced the release of a new Azure VM extension that will allow you to automatically deploy the dependency agent for Service Maps. Since Service Maps supports both Windows & Linux, there are two variants of this extension. One thing you do have to remember is that the dependency agent still depends on (ah the irony) the OMS agent and your Azure VMs need to have the OMS agent installed and configured prior deploying the dependency agent. 

Installing the dependency agent with Azure VM extension (PowerShell)

Following is a PowerShell code snippet (from Microsoft) that will allow you to install the Service Map on all VMs in an Azure resource group. 

$version = "9.1"

$ExtPublisher = "Microsoft.Azure.Monitoring.DependencyAgent"

$OsExtensionMap = @{ "Windows" = "DependencyAgentWindows"; "Linux" = "DependencyAgentLinux" }

$rmgroup = "<Your Resource Group Here>"

Get-AzureRmVM -ResourceGroupName $rmgroup |

ForEach-Object {

""

$name = $_.Name

$os = $_.StorageProfile.OsDisk.OsType

$location = $_.Location

$vmRmGroup = $_.ResourceGroupName

"${name}: ${os} (${location})"

Date -Format o

$ext = $OsExtensionMap.($os.ToString())

$result = Set-AzureRmVMExtension -ResourceGroupName $vmRmGroup -VMName $name -Location $location `

-Publisher $ExtPublisher -ExtensionType $ext -Name "DependencyAgent" -TypeHandlerVersion $version

$result.IsSuccessStatusCode

}
Installing the dependency agent with Azure VM extension (ARM Template)

However Microsoft didn’t publish any official reference on to deploy the Azure VM extension for deploying the dependency agent using ARM templates (as of yet). My colleague MVP Stanislav Zhelyazkov has published a blog post on how to do this with ARM templates. You can find his reference template from GitHub.

Note

This VM extension is currently available in the West US region and Microsoft is rolling it out all Azure regions in the next couple of days.

Closing the doors to SMB v1 in Azure VMs

Remember the nasty ransomware attacks Petya & WannaCry which spread almost like a zombie apocalypse earlier this year? Millions of devices arround the world were affected with these attacks which were designed to exploit the loopholes in the good old Server Message Block version 1 (SMB v1) protocol. A lot of security experts have argued that having SMB v1 enabled in servers/PCs by default will leave the consumers vulnerable for any future attacks of this nature.

Starting this month, Azure Security Team has closed the doors for SMB v1 protocol for Windows OS images available in Azure marketplace. This means that when you deploy a VM with any of the below operating systems using an Azure marketplace image, the SMV v1 protocol is disabled by default.

  • HPC Pack 2012 R2 Compute Node with Excel on Windows Server 2012 R2
  • HPC Pack 2012 R2 on Windows Server 2012 R2
  • Windows Server 2008 R2 SP1
  • Windows Server 2012 Datacenter
  • Windows Server 2012 R2 Datacenter
  • Windows Server 2016 – Nano Server
  • Windows Server 2016 Datacenter
  • Windows Server 2016 Datacenter – with Containers
  • [HUB] Windows Server 2008 R2 SP1
  • [HUB] Windows Server 2012 Datacenter
  • [HUB] Windows Server 2012 R2 Datacenter
  • [HUB] Windows Server 2016 Datacenter
  • [smalldisk] Windows Server 2008 R2 SP1
  • [smalldisk] Windows Server 2012 Datacenter
  • [smalldisk] Windows Server 2012 R2 Datacenter
  • [smalldisk] Windows Server 2016 Datacenter

This doesn’t mean that you can turn a blind eye to your existing Windows VMs in Azure. If you haven’t already disabled SMB v1 in those, you can refer this TechNet article to learn how to do so. Regardless of where your servers and PCs are deployed (cloud or on-premises) Microsoft strongly recommend you to disable SMB v1 protocol. 

OK what about Linux ?

The Samba service which enables the SMB protocol in Linux VMs is not installed by default in any Azure Linux gallery image. If you install this service later on once you have provisioned a VM vulnerability report CVE-2017-7494 need to be taken into consideration to there are any threats that you should be alarmed of. This explains the vulnerabilities in Samba 3.5 and onward where as the current version is  4.6.7. However it is always recommend to update to the latest version as soon as possible. 

Do I need to use SMB v1 ever ?

SMB v1 has been superseded by SMB v2 & V3 a long back. These versions are inherently more secure than the v1 of SMB protocol. However there are dozens of products out there which still leverages the SMB v1 protocol. This TechNet article lists out most of the products that still leverage SMB v1 at some point of their current life cycle.

 

 

Integrating OMS Service Map with SCOM

OMS Service Map solution is capable of automatically discovering dependencies of application components in Windows & Linux servers to map the communication flow between your business services. It maps connections between servers, processes, and ports across any TCP-connected server in your datacentre. With this solution you won’t have to configure anything besides installing an agent. Microsoft has recently released a public preview version of Service Map management pack which allows you to automatically create distributed application dashboards in SCOM based on the dynamic dependency maps generated in Service Map solution. In my opinion this is a very valuable integration as organizations that use SCOM as their main monitoring tool can leverage the dynamic application dependency monitoring capabilities of OMS, where as is past they had to rely on third party tools to visualize such. 

What is inside the Service Map MP ?

Like every other management pack you need to first import the Service Map MP into SCOM. When you import the Service Map MP (Microsoft.SystemCenter.ServiceMap.mpb) following dependent MPs will be installed in your SCOM management server/s. 

  • Microsoft Service Map Application Views
  • Microsoft System Center Service Map Internal
  • Microsoft System Center Service Map Overrides
  • Microsoft System Center Service Map

This management pack is compatible with both SCOM 2016 & 2012 R2 versions.

Known Limitations of the Public Preview

In the beginning of this post I have mentioned that this MP is still in preview and hence there are few issues and limitations with it as of now. I’m not sure whether Microsoft is going to address or change the behaviour some of these when the MP releases GA, specifically the limitations around updating the diagram views in SCOM console.

  • One management group can be integrated with only one OMS workspace.
  • Adding servers to the Service Map Servers Group manually won’t immediately sync those with service maps as they will be synced from Service Map during the next synchronization schedule. 
  • Making changes to the Distributed Application Diagrams created by this MP is not useful. Because these changes will be overwritten by the Service Maps solution in the next synchronization schedule.

If you are interested in trying out this new MP, following resources might come in handy.

File Recovery Error in Azure Backup

While trying to perform an in-place file restore in an Azure VM using Azure Backup, I have encountered an execution error. Azure Backup leverages a PowerShell script to mount the volumes of a Protected VM. In my case the following error was encountered when I executed the recovery script.

Microsoft Azure VM Backup - File Recovery
______________________________________________
Invoke-WebRequest : <HTML><HEAD><TITLE>Error Message</TITLE>
<META http-equiv=Content-Type content="text/html; charset=utf-8">
<BODY>
<TABLE><TR><TD id=L_dt_1><B>Network Access Message: The page cannot be displayed<B></TR></TABLE>
<TABLE><TR><TD height=15></TD></TR></TABLE>
<TABLE>
<TR><TD id=L_dt_2>Technical Information (for Support personnel)
<UL>
<LI id=L_dt_3>Error Code: 407 Proxy Authentication Required. Forefront TMG requires authorization to fulfill the
request. Access to the Web Proxy filter is denied. (12209)
<LI id=L_dt_4>IP Address: 10.31.8.16
<LI id=L_dt_5>Date: 8/8/2017 11:53:42 PM [GMT]
<LI id=L_dt_6>Server: XXXX.ab.abc.net
<LI id=L_dt_7>Source: proxy
</UL></TD></TR></TABLE></BODY></HTML>
At C:\Users\whewes_adm\Desktop\ILRPowershellScript.ps1:101 char:12
+ $output=Invoke-WebRequest -Uri "https://download.microsoft.com/download/E/1/4 ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 + CategoryInfo : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebExc
 eption
 + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
Invoke-WebRequest : <HTML><HEAD><TITLE>Error Message</TITLE>
<META http-equiv=Content-Type content="text/html; charset=utf-8">
<BODY>
<TABLE><TR><TD id=L_dt_1><B>Network Access Message: The page cannot be displayed<B></TR></TABLE>
<TABLE><TR><TD height=15></TD></TR></TABLE>
<TABLE>
<TR><TD id=L_dt_2>Technical Information (for Support personnel)
<UL>
<LI id=L_dt_3>Error Code: 407 Proxy Authentication Required. Forefront TMG requires authorization to fulfill the
request. Access to the Web Proxy filter is denied. (12209)
<LI id=L_dt_4>IP Address: 10.31.8.16
<LI id=L_dt_5>Date: 8/8/2017 11:53:42 PM [GMT]
<LI id=L_dt_6>Server: XXXX.ab.abc.net
<LI id=L_dt_7>Source: proxy
</UL></TD></TR></TABLE></BODY></HTML>
At C:\Users\whewes_adm\Desktop\ILRPowershellScript.ps1:102 char:12
+ $output=Invoke-WebRequest -Uri "https://download.microsoft.com/download/E/1/4 ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 + CategoryInfo : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebExc
 eption
 + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
Unable to access the recovery point. Please make sure that you have enabled access to Azure public IP addresses on the
outbound port 3260 and 'https://download.microsoft.com/'

One thing I noticed was it was complaining about outbound access to Azure public IP addresses on port 3260. The VMs were connected to on-premises environment via a dedicated ExpressRoute circuit so there were no issues with white listing Azure public IP addresses according to my knowledge. Also there were no NSGs controlling the traffic in the subnet where this VM was deployed.

I had a look on another server that is running in a VMware cluster on-premises and noticed that there is a HTTP proxy present in the environment. Once I have added the proxy settings in the VM , I could execute the recovery script without any hassle. 

The article “Prepare your environment to back up Azure virtual machines” published in the Microsoft documentation, explains the required network configuration for Azure Backup in case your environment has policies governing outbound Internet connectivity. Therefore I recommend you to have a look on that first before planning your Azure Backup deployment to protect Azure VMs.

Domain Join Error | JsonADDomainExtension in ARM

Recently I have been working on an ARM template to create a Windows Server 2012 R2 VM from a managed disk image and join it to a Windows domain. I used a VM extension called JsonADDomainExtension to perform the domain join task. However my first 3 attempts were in vain as the VM was not added to the domain and I see an error in the extension deployment.

I examined the ADDomainExtension log file which is available at C:\WindowsAzure\Logs\Plugins\Microsoft.Compute.JsonADDomainExtension\1.0\ADDomainExtension.log and noticed below error.

2017-07-31T05:04:47.0833850Z [Info]: Joining Domain 'child.abc.net'

2017-07-31T05:04:47.0833850Z [Info]: Joining Domain 'child.abc.net'

2017-07-31T05:04:47.0833850Z [Info]: Get Domain/Workgroup Information

2017-07-31T05:04:48.0521988Z [Info]: Current domain:  (), current workgroup: WORKGROUP, IsDomainJoin: True, Target Domain/Workgroup: child.abc.net.

2017-07-31T05:04:48.0521988Z [Info]: Domain Join Path.

2017-07-31T05:04:48.0521988Z [Info]: Current Domain 
name is empty/null. Try to get Local domain name.

2017-07-31T05:04:48.0521988Z [Info]: In AD Domain extension process, the local domain is: ''.

2017-07-31T05:04:48.0521988Z [Info]: Domain Join will be performed.

2017-07-31T05:05:06.1756824Z [Error]: Try join: domain='child.abc.net', ou='OU=Test Objects,DC=child,DC=abc,DC=net', user='abc\SVC_Azure_Srv_Joindom', option='NetSetupJoinDomain, NetSetupAcctCreate' (#3:User Specified), errCode='1326'.

2017-07-31T05:05:15.4067523Z [Error]: Try join: domain='child.abc.net', ou='OU=Test Objects,DC=child,DC=abc,DC=net', user='abc\SVC_Azure_Srv_Joindom', option='NetSetupJoinDomain' (#1:User Specified without NetSetupAcctCreate), errCode='1326'.

2017-07-31T05:05:15.4223371Z [Error]: Computer failed to join domain 'child.abc.net' from workgroup 'WORKGROUP'.

2017-07-31T05:05:15.4223371Z [Info]: Retrying action after 3 seconds, at attempt 1 out of '3'.

The NetSetup.log available at %windir%\debug\netsetup.log reports below error.

07/31/2017 05:05:14:253 NetpProvisionComputerAccount:

07/31/2017 05:05:14:253 NetpProvisionComputerAccount:

07/31/2017 05:05:14:253 lpDomain: child.abc.net

07/31/2017 05:05:14:253 lpHostName: AUETARMVM01

07/31/2017 05:05:14:253 lpMachineAccountOU: OU=Test Objects,DC=child,DC=abc,DC=net

07/31/2017 05:05:14:253 lpDcName: mydc01.child.abc.net

07/31/2017 05:05:14:253 lpMachinePassword: (null)

07/31/2017 05:05:14:253 lpAccount: orica\SVC_Azure_Srv_Joindom

07/31/2017 05:05:14:253 lpPassword: (non-null)

07/31/2017 05:05:14:253 dwJoinOptions: 0x1

07/31/2017 05:05:14:253 dwOptions: 0x40000003

07/31/2017 05:05:15:406 NetpLdapBind: ldap_bind failed on mydc01.child.abc.net: 49: Invalid Credentials

07/31/2017 05:05:15:406 NetpJoinCreatePackagePart: status:0x52e.
07/31/2017 05:05:15:406 NetpAddProvisioningPackagePart: status:0x52e.
07/31/2017 05:05:15:406 NetpJoinDomainOnDs: Function exits with status of: 0x52e
07/31/2017 05:05:15:406 NetpJoinDomainOnDs: status of disconnecting from '\\mydc01.child.abc.net': 0x0
07/31/2017 05:05:15:406 NetpResetIDNEncoding: DnsDisableIdnEncoding(RESETALL) on 'child.abc.net' returned 0x0
07/31/2017 05:05:15:406 NetpJoinDomainOnDs: NetpResetIDNEncoding on 'child.abc.net': 0x0
07/31/2017 05:05:15:406 NetpDoDomainJoin: status: 0x52e
07/31/2017 05:05:18:432 -----------------------------------------------------------------

The issue was obvious after that. The service account used for domain join was incorrect. It should have been corrected as child.abc.net\SVC_Azure_Srv_Joindom Once this was corrected I was able to re-deploy the arm template without any issue and the join domain operation was successful.

If you want know more about how to leverage the “JsonADDomainExtension” in your ARM template, following article provides an excellent overview.

Azure ARM: VM Domain Join to Active Directory Domain with “JoinDomain” Extension

 

 

Data corruption issue with NTFS sparse files in Windows Server 2016

Microsoft has released a new patch KB4025334 which prevents a critical data corruption issue with NTFS sparse files in Windows Server 2016.  This patch will prevent possible data corruptions that could occur when using Data Deduplication in Windows Server 2016. However this update also a remedy prevent this issue in all applications and Windows components that leverage sparse files on NTFS in Windows Server 2016.

Although this is an optional update, Microsoft recommends to install this KB to avoid any corruptions in Data deduplication although this KB doesn’t provide  a way to recover from existing data corruptions. The reason being is that NTFS incorrectly removes in-use clusters from the file and there is no way to identify what clusters were incorrectly removed afterwards. Furthermore this update will become a mandatory patch in the “Patch Tuesday” release cycle in August 2017.

Since this issue is hard to notice, you won’t be able detected that by monitoring the weekly Dedup integrity scrubbing job. To overcome this challenge this KB also includes an update to chkdsk which will allow you to identify which files are already corrupted.

Identifying corrupted NTFS sparse files with chkdsk in KB4025334

  • First, install KB4025334 on affected servers and restart same. Keep in mind that if your servers are in a failover cluster this patch needs to be applied for all the servers in your cluster.
  • Execute chkdsk in read-only mode which is the default mode for chkdsk.
  • For any possibly corrupted files, chkdsk will provide an output similar to below. Here 20000000000f3 is the file id and make a note of all the file ids of the output.
The total allocated size in attribute record (128, "") of file 20000000000f3 is incorrect.
  • Then you can use fsutil to query the corrupted files by their ids as per below example.
D:\afftectedfolder> fsutil file queryfilenamebyid D:\ 0x20000000000f3
  • Once you run above command, you should get a similar output like below. D:/affectedfolder/TEST.0 is the corrupted file in this case.
A random link name to this file is [file://%3f/D:/affectedfolder/TEST.0]\\?\D:\affectedfolder\TEST.0

Instant File Recovery with Azure Backup

Microsoft Azure now allows file and folder level recovery with Azure Backup. This feature has been available as a preview for both Windows & Linux VMs and is now generally available. Previously Azure Backup allowed VM level recovery and for those who wanted to leverage file level recovery had to leverage solutions like DPM or Azure Backup to achieve their protection goals.

Restore-as-a-Service in Azure Backup allow syou to recovery files or folders instantaneously without deploying any other additional components. Azure Backup leverages an iSCSI-based approach to open/mount application files directly from its’ recovery points. This eliminates the need to restore the entire VM to recover files. 

RaaS in Action

In below example I’m going to explain how you can recover files or folders from a Windows IaaS VM in Azure.

  • Select the VM that you want to recover files from, in the recovery services vault under Backup items. Click the File Recovery option.

  • Select the recovery point as in (1) and download the executable that allows you to browse and recovery files as in (2).  Run this as an administrator and you will have to provide the password as in (3) to execute this file.

  • This script can be executed on any machine that has the same (or compatible) operating system as the backed-up VM. Unless the the protected Azure VM uses Windows Storage Spaces (for Windows Azure VMs) or LVM/RAID Arrays(for Linux VMs), you can run the executable/script on the same VM. If they do, run it on any other machine with a compatible operating system.
Server OS Compatible client OS
Windows Server 2012 R2 Windows 8.1
Windows Server 2012 Windows 8
Windows Server 2008 R2 Windows 7
Ubuntu 12.04 and above
CentOS 6.5 and above
RHEL 6.7 and above
Debian 7 and above
Oracle Linux 6.4 and above

If you are restoring from a Linux VM you need bash version 4 or above and python version 2.6.6 and above to execute the script. 

  • When you run the script the volumes are mounted in the client OS that you are using and will have different drive letters that the ones from the original VM. Make sure you identify the new drives attached.  You can view your new drives in Windows Explorer and copy them to an alternate location.

  • Finally after restoring your files/folders to unmount the drives, Click the File Recovery blade in the Azure portal and select Unmount Disks as in (4).

 

VM Inception | Nested Virtualization in Azure

I bet that most of you have watched the movie “Inception”, where a group of people are building a dream within a dream within a dream. Before Windows Server 2016 you couldn’t deploy a VM within a VM in Hyper-V. Lot of people are/were encouraged to use VMware as it supported this capability called “Nested Virtualization”. But with the release of Windows Server 2016 & Hyper-V server 2016 this functionality has been introduced. This is specially useful when you don’t have lot of hardware to run your lab environments or want to deploy a PoC system without burning thousands of dollars.

Microsoft announced the support for nested virtualization Azure IaaS VMs using the newly announced  Dv3 and Ev3 VM sizes. This capability allows you to create nested VMs in an Azure VM and also run Hyper-V containers in Azure by using nested VM hosts. Now let’s have look on how this is implemented in Azure Azure Compute fabric.

Image Courtesy Build 2017

As you can see in the above diagram, on top of the Azure hardware layer, Microsoft has deployed the Windows Server 2016 Hyper-V hypervisor. Microsoft then adds vCPU on top of that to expose the Azure IaaS VMs that you would normally get. With nested virtualization, you can enable Hyper-V inside those Azure IaaS VMs running Windows Server 2016. You can then run any number of  Hyper-V 2016 supported guest operating systems inside these nested VM hosts.

Following references from MSFT provides more information on how you can get started with nested virtualization in Azure.

 

 

 

 

 

Azure Site Recovery Updates | Support for Large Disks

Microsoft Azure recently announced the support for large disks up to 4 TB. Now Azure Site Recovery supports protecting on-premises VMs and physical servers with disks up to 4095 GB in size to Azure. Many customers use disks with more than 1 TB in capacity for various reasons. A good example would be SQL databases and file servers. The availability of large disks in Azure allows you to leverage ASR as a DR solution for your datacenter infrastructure. 

Large disks in Azure are available both in standard and premium tiers. Standard disks offer two sizes  S40 (2TB) and S50 (4TB) for both managed and unmanaged disks. If you have IO intensive workloads that require premium storage you can use P40 (2TB) and P50 (4TB)  for both managed and unmanaged disks.

Pre-requisites for protecting VMs with large disks in ASR

You need to make sure that your on-premises ASR infrastructure components are up-to-date before you  you start protecting VMs and/or physical servers with disks greater than 1 TB in size. 

VMware/Physical Servers  Install the latest update on the Configuration server, additional process servers, additional master target servers and agents.
SCVMM managed Hyper-V environments Install the latest Microsoft Azure Site Recovery Provider update on the on-premises VMM server.
Standalone Hyper-V servers not managed by SCVMM Install the latest Microsoft Azure Site Recovery Provider on each Hyper-V server that is registered with Azure Site Recovery.

Note that protecting Azure VMs with large disks is not a currently supported scenario. 

Azure Site Recovery Updates | Storage Spaces & Windows Server 2016

Microsoft has recently announced a preview for protecting Azure IaaS VMs with ASR. Now you can protect Azure VMs running Windows Server 2016 . Also ASR now supports protecting Azure IaaS VMs with Storage Spaces. Storage Spaces allow you to  improve IO performance by striping disks and to create logical disks larger than 4 TB. 

Following is a list of all supported OS versions that can be protected using ASR.

Windows
  • Windows Server 2016 (Server Core and Server with Desktop Experience)
  • Windows Server 2012 R2
  • Windows Server 2012
  • Windows Server 2008 R2 SP1 and above
Linux
  • Red Hat Enterprise Linux 6.7, 6.8, 7.0, 7.1, 7.2, 7.3
  • CentOS 6.5, 6.6, 6.7, 6.8, 7.0, 7.1, 7.2, 7.3
  • Ubuntu 14.04/16.04 LTS Server (only supported kernel versions)
  • SUSE Linux Enterprise Server 11 SP3
  • Oracle Enterprise Linux 6.4, 6.5 running either the Red Hat compatible kernel or Unbreakable Enterprise Kernel Release 3 (UEK3)