Category Archives: Azure

VM Inception | Nested Virtualization in Azure

I bet most of you have watched the movie “Inception”, where a group of people build a dream within a dream within a dream. Before Windows Server 2016 you couldn’t deploy a VM within a VM in Hyper-V. A lot of people are/were encouraged to use VMware as it supported this capability, called “Nested Virtualization”. But with the release of Windows Server 2016 & Hyper-V Server 2016 this functionality has been introduced. This is especially useful when you don’t have a lot of hardware to run your lab environments or want to deploy a PoC system without burning thousands of dollars.

Microsoft announced support for nested virtualization in Azure IaaS VMs using the newly announced Dv3 and Ev3 VM sizes. This capability allows you to create nested VMs in an Azure VM and also run Hyper-V containers in Azure by using nested VM hosts. Now let’s have a look at how this is implemented in the Azure Compute fabric.

Image Courtesy Build 2017

As you can see in the above diagram, on top of the Azure hardware layer, Microsoft has deployed the Windows Server 2016 Hyper-V hypervisor. Microsoft then exposes vCPUs on top of that to the Azure IaaS VMs that you would normally get. With nested virtualization, you can enable Hyper-V inside those Azure IaaS VMs running Windows Server 2016. You can then run any number of Hyper-V 2016 supported guest operating systems inside these nested VM hosts.

The following references from Microsoft provide more information on how you can get started with nested virtualization in Azure.
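
Enabling Hyper-V inside one of these Dv3/Ev3 VMs works much like it does on physical hardware. Below is a minimal sketch to run inside the Azure VM; the switch name and NAT address range are examples, not fixed values.

```powershell
# Enable the Hyper-V role inside the Azure IaaS VM (requires a restart).
Install-WindowsFeature -Name Hyper-V -IncludeManagementTools -Restart

# After the reboot, create an internal switch plus a NAT network so that
# nested VMs get outbound connectivity (names and ranges are examples).
New-VMSwitch -Name "NestedSwitch" -SwitchType Internal
New-NetIPAddress -IPAddress 192.168.0.1 -PrefixLength 24 -InterfaceAlias "vEthernet (NestedSwitch)"
New-NetNat -Name "NestedNAT" -InternalIPInterfaceAddressPrefix 192.168.0.0/24
```

Nested guests attached to the internal switch can then use 192.168.0.1 as their gateway for outbound access.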

502 Bad Gateway error | Azure Application Gateway Troubleshooting

I was setting up an Azure Application Gateway for a project a couple of days back. The intended workload was git on nginx. But when I tried to reach the git URL, I noticed that it was failing with a 502 Bad Gateway error.

Initial Troubleshooting Steps

  • We tried to access the backend servers from the Application Gateway IP 10.62.124.6; the backend server IPs are 10.62.66.4 and 10.62.66.5. The Application Gateway is configured for SSL offload.
  • We were able to access the backend servers directly on port 80, but the issue occurs when they are accessed via the Application Gateway.
  • We rebooted the Application Gateway and the backend servers, and configured a custom probe as well. But the issue was with the request timeout value, which is configured for 30 seconds by default.
  • This means that when a user request is received on the Application Gateway, it is forwarded to the backend pool; if no response comes back within 30 seconds, the user receives a 502 error.
  • The issue was temporarily resolved after the timeout on the Backend HTTP settings was changed to 120 seconds.
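
For reference, the timeout change described above can also be scripted. The snippet below is a sketch; the gateway, resource group and backend HTTP settings names are assumptions for illustration.

```powershell
# Retrieve the Application Gateway (names are examples).
$gw = Get-AzureRmApplicationGateway -Name "gitapp" -ResourceGroupName "git-rg"

# Raise the request timeout on the backend HTTP settings to 120 seconds.
Set-AzureRmApplicationGatewayBackendHttpSettings -ApplicationGateway $gw `
    -Name "appGatewayBackendHttpSettings" -Port 80 -Protocol Http `
    -CookieBasedAffinity Disabled -RequestTimeout 120

# Commit the updated configuration back to Azure.
Set-AzureRmApplicationGateway -ApplicationGateway $gw
```

Note that Set-AzureRmApplicationGateway pushes the whole configuration and can take several minutes to complete.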

Real Deal

Increasing the timeout value was only a temporary fix, as we were unable to find a permanent one. I reached out to Microsoft Support and they wanted us to run the diagnostics below.

  • Execute the below cmdlet and share the results.

$getgw = Get-AzureRmApplicationGateway -Name <application gateway name> -ResourceGroupName <resource group name>

  • Collect simultaneous Network traces:
    1. Start network captures on the on-premises machine (source client machine) and the Azure VMs (backend servers)
      • Windows: Netsh
        1. Command to start the trace:  Netsh trace start capture=yes report=yes maxsize=4096 tracefile=C:\Nettrace.etl
        2. Command to stop the trace:  Netsh trace stop 
      • Linux: TCPdump
        1. TCP DUMP command: sudo tcpdump -i eth0 -s 0 -X -w vmtrace.cap 
    2. Reproduce the behavior.
    3. Stop network captures.

Analysis

The network traces collected on the client machine and the destination servers while the issue was reproduced indicate that, during the period the trace was collected, for every HTTP GET request (default probing) sent from the Application Gateway instances to the backend servers, the servers responded with an HTTP “Status: Forbidden” response.

This resulted in the Application Gateway marking the backend servers as unhealthy, as the expected response is HTTP 200 OK.
 
The Application Gateway “gitapp” is configured with 2 instances (internal instance IPs: 10.62.124.4, 10.62.124.5).
 
Trace collected on backend server 10.62.66.4
 
12:50:18 PM 1/3/2017    10.62.124.4         10.62.66.4            HTTP      HTTP:Request, GET /
12:50:18 PM 1/3/2017    10.62.66.4            10.62.124.4         HTTP      HTTP:Response, HTTP/1.1, Status: Forbidden, URL: /
12:50:45 PM 1/3/2017    10.62.124.5         10.62.66.4            HTTP      HTTP:Request, GET /
12:50:45 PM 1/3/2017    10.62.66.4            10.62.124.5         HTTP      HTTP:Response, HTTP/1.1, Status: Forbidden, URL: /
 
Trace collected on backend server 10.62.66.5
 
12:50:48 PM 1/3/2017    10.62.124.4         10.62.66.5            HTTP      HTTP:Request, GET /
12:50:48 PM 1/3/2017    10.62.66.5            10.62.124.4         HTTP      HTTP:Response, HTTP/1.1, Status: Forbidden, URL: /
12:50:45 PM 1/3/2017    10.62.124.5         10.62.66.5            HTTP      HTTP:Request, GET /
12:50:45 PM 1/3/2017    10.62.66.5            10.62.124.5         HTTP      HTTP:Response, HTTP/1.1, Status: Forbidden, URL: /

Root Cause

Due to the security feature ‘rack_attack’ being enabled on the backend servers, the Application Gateway instance IPs had been blacklisted, and therefore the servers were not responding to the Application Gateway, causing it to mark the backend servers as unhealthy.

Fix

Once this feature was disabled on the backend web servers (nginx), the issue was resolved and we could successfully access the web application through the Application Gateway.

 

Storage Spaces Direct | Deploying S2D in Azure

This post explores how to build a Storage Spaces Direct (S2D) lab in Azure. Bear in mind that S2D in Azure is not yet a supported scenario for production workloads.

The following are the high-level steps that need to be followed in order to provision an S2D lab in Azure. For this lab, I’m using DS1 v2 VMs with Windows Server 2016 Datacenter edition for all the roles, and two P20 512 GB Premium SSD disks in each storage node.

Create a VNET

In my Azure tenant I have created a VNET called s2d-vnet with the 10.0.0.0/24 address space and a single subnet, as below.

1-s2d-create-vnet

Create a Domain Controller

I have deployed a domain controller called jcb-dc in a new Windows Active Directory domain, jcb.com, with the DNS role installed. Once the DNS role was installed, I changed the DNS server IP address in s2d-vnet to my domain controller’s IP address. You may wonder what the second DNS IP address is. It is actually the default Azure DNS IP address, added as a redundant DNS server in case we lose connectivity to the domain controller. This provides Internet name resolution to the VMs in case the domain controller is no longer functional.

1-s2d-vnet-dns
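
If you prefer scripting the DNS change, it can be sketched as below. The domain controller IP (10.0.0.4) and the resource group name are assumptions from my lab; 168.63.129.16 is the Azure-provided DNS address used as the redundant entry.

```powershell
# Point the VNET DNS at the domain controller, keeping Azure DNS as a fallback.
$vnet = Get-AzureRmVirtualNetwork -Name "s2d-vnet" -ResourceGroupName "s2d-rg"
$vnet.DhcpOptions.DnsServers = @("10.0.0.4", "168.63.129.16")
Set-AzureRmVirtualNetwork -VirtualNetwork $vnet
```

VMs pick up the new DNS servers on their next DHCP lease renewal, so a reboot (or ipconfig /renew) may be needed.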

Create the Cluster Nodes

Here I have deployed 3 Windows Server VMs, jcb-node01, jcb-node02 and jcb-node03, and joined them to the jcb.com domain. All 3 nodes are deployed in a single availability set.

Configure Failover Clustering

Now we have to configure the failover cluster. I’m installing the Failover Clustering feature on all 3 nodes using the below PowerShell snippet.

$nodes = ("jcb-node01", "jcb-node02", "jcb-node03")

icm $nodes {Install-WindowsFeature Failover-Clustering -IncludeAllSubFeature -IncludeManagementTools}

3-s2d-install-fc

Then I’m going to create the failover cluster by executing the below snippet on any of the three nodes. This will create a failover cluster called JCB-CLU.

$nodes = ("jcb-node01", "jcb-node02", "jcb-node03")

New-Cluster -Name JCB-CLU -Node $nodes -StaticAddress 10.0.0.10

4-s2d-create-fc
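
Before enabling S2D, it’s worth validating the nodes first; a quick sketch:

```powershell
# Run cluster validation against the three nodes, including the S2D tests.
$nodes = ("jcb-node01", "jcb-node02", "jcb-node03")
Test-Cluster -Node $nodes -Include "Storage Spaces Direct", "Inventory", "Network", "System Configuration"
```

Review the generated HTML report before proceeding; warnings about unsupported-for-production scenarios are expected in an Azure lab.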

Deploying S2D

When I execute the Enable-ClusterS2D cmdlet, it will enable Storage Spaces Direct and start creating a storage pool automatically, as below.

5-s2d-enable-1

5-s2d-enable-2

12-s2d-csv

You can see that the storage pool has been created.

7-s2d-pool-fcm

8-s2d-pool
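
To confirm what Enable-ClusterS2D built, you can inspect the pool and the disks it claimed, for example:

```powershell
# Show the automatically created S2D pool and its health.
Get-StoragePool -FriendlyName "S2D*" |
    Format-Table FriendlyName, OperationalStatus, HealthStatus, Size

# List the physical disks that were pooled.
Get-StoragePool -FriendlyName "S2D*" | Get-PhysicalDisk |
    Format-Table DeviceId, MediaType, CanPool, Size
```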

Creating a Volume

Now we can create a volume in our new S2D setup.

New-Volume -StoragePoolFriendlyName S2D* -FriendlyName JCBVDisk01 -FileSystem CSVFS_REFS -Size 800GB

9-s2d-create-volume

Implementing Scale-out File Server Role

Now we can proceed with the SOFS feature installation, followed by adding the SOFS cluster role.

icm $nodes {Install-WindowsFeature FS-FileServer}

Add-ClusterScaleOutFileServerRole -Name jcb-sofs

10-s2d-sofs-install

11-s2d-sofs-enable

Finally I have created an SMB share called Janaka in the newly created CSV.
13-s2d-smb-share
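
The share can be created with PowerShell too. This is a sketch; the CSV mount point (Volume1) and the group granted access are assumptions.

```powershell
# Create a folder on the Cluster Shared Volume and publish it as an SMB share.
New-Item -Path "C:\ClusterStorage\Volume1\Shares\Janaka" -ItemType Directory
New-SmbShare -Name "Janaka" -Path "C:\ClusterStorage\Volume1\Shares\Janaka" `
    -FullAccess "jcb\Domain Admins"
```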

Automating S2D Deployment in Azure with ARM Templates

If you want to automate the entire deployment of the S2D lab, you can use the below ARM template by Keith Mayer, which will create a 2-node S2D cluster.

Create a Storage Spaces Direct (S2D) Scale-Out File Server (SOFS) Cluster with Windows Server 2016 on an existing VNET

This template requires you to have an active VNET and a domain controller deployed first, which you can automate using the below ARM template.

Create a 2 new Windows VMs, create a new AD Forest, Domain and 2 DCs in an availability set

We will discuss how to use DISKSPD & VMFleet to perform load and stress testing in an S2D deployment in our next post.

New Security Features in Azure Backup

Recently Microsoft introduced new security capabilities to Azure Backup which allow you to secure your backups against data compromise and attacks. These features are now built into the Recovery Services vault, and you can enable and start using them in a matter of minutes.

Prevention

For critical operations such as deleting backup data or changing the passphrase, Azure Backup now allows you to use an additional authentication layer where you need to provide a Security PIN, which is available only to users with valid Azure credentials to access the backup vaults.

Alerting

You can now configure email notifications to be sent to specified users for operations that have an impact on the availability of the backup data.

Recovery

You can configure Azure Backup to retain deleted backup data for 14 days, during which you can recover the deleted data using the recovery points. When enabled, this will always maintain more than one recovery point, so that there are enough recovery points from which you can recover the deleted data.

How do I enable security features in Azure Backup?

These security features are now built into the recovery services vault where you can enable all of them with a single click.

1-enable-azure-backup-security

Following are the requirements and considerations that you should be aware of when you enable these new security features.

  • The minimum MAB agent version should be 2.0.9052, or you should upgrade to this agent version immediately after you have enabled these features.
  • If you are using Azure Backup Server, the minimum MAB agent version should be 2.0.9052 with Azure Backup Server Upgrade 1.
  • Currently these settings won’t work with Data Protection Manager and will only be enabled with future Update Rollups.
  • Currently these settings won’t work with IaaS VM Backups.
  • Enabling these settings is a one-time action which is irreversible.

Testing new security features

In the below video I’m trying to change the passphrase of my Azure Backup agent and save it. Note that here I will have to provide a Security PIN in order to proceed, otherwise the operation fails.

Next, I’m going to set up backup alerts for my Recovery Services vault. Once I create an alert subscription, I’m going to delete my previous backup schedule. Here I will have the chance of restoring the data within 14 days of deletion.

Multiple vNICs in Azure ARM VMs

I was recently faced with a challenge to add multiple vNICs to an ARM-based Azure VM. The requirement was to add a secondary vNIC while keeping the first one intact. This post explores how I achieved this task with PowerShell.

Before you do anything, make sure that the existing vNIC’s private IP address has been set to static, as below. Otherwise, once the VM update operation is completed, you will lose that IP address. Optionally, you can set the public IP address to reserved (static) as well. The VM should already have at least two NICs, as it is not supported to convert a single-NIC VM into a multi-NIC VM and vice versa.
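
Setting the existing vNIC’s private IP to static can itself be done in PowerShell; a sketch, where the NIC name jcb-vnic01 is an assumption for the VM’s original NIC:

```powershell
# Pin the existing vNIC's private IP so the VM update doesn't release it.
$nic = Get-AzureRmNetworkInterface -Name "jcb-vnic01" -ResourceGroupName "vmnic-rg"
$nic.IpConfigurations[0].PrivateIpAllocationMethod = "Static"
Set-AzureRmNetworkInterface -NetworkInterface $nic
```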

1-vnic-private-ip

2-vnic-public-ip

Create a new vNIC

I have defined the properties for the new vNIC as below.

$VNET = Get-AzureRmVirtualNetwork -Name 'vmnic-rg-vnet' -ResourceGroupName 'vmnic-rg'
$SubnetID = (Get-AzureRmVirtualNetworkSubnetConfig -Name 'default' -VirtualNetwork $VNET).Id
$NICName = 'jcb-vnic02'
$NICResourceGroup = 'vmnic-rg'
$Location = 'East US'
$IPAddress = '10.0.0.7'

The next step is to create the new vNIC.

New-AzureRmNetworkInterface -Name $NICName -ResourceGroupName $NICResourceGroup -Location $Location -SubnetId $SubnetID -PrivateIpAddress $IPAddress

3-vnic-create

Adding the new vNIC to an existing VM

I’ve executed the below PowerShell snippet, which adds the new vNIC, sets the existing vNIC as primary, and updates the VM once the new vNIC is in place.

$VMname = 'jcb-nicvm01'
$VMRG = 'vmnic-rg'

$VM = Get-AzureRmVM -Name $VMname -ResourceGroupName $VMRG

$NewNIC = Get-AzureRmNetworkInterface -Name $NICName -ResourceGroupName $NICResourceGroup
$VM = Add-AzureRmVMNetworkInterface -VM $VM -Id $NewNIC.Id

Then I’m listing the NICs attached to the VM and setting the first one as primary.

$VM.NetworkProfile.NetworkInterfaces

$VM.NetworkProfile.NetworkInterfaces.Item(0).Primary = $true
Update-AzureRmVM -VM $VM -ResourceGroupName $VMRG

 

Locking Resources with ARM

Sometimes you need to restrict access to an Azure subscription, resource group or resource in order to prevent its accidental deletion or modification by other users. With Azure Resource Manager you can lock your resources at two levels.

  • CanNotDelete: Authorized users can read and modify a resource, but they can’t delete it.
  • ReadOnly: Authorized users can read a resource, but they can’t delete it or perform any actions on it. Their permission on the resource is restricted to the Reader role.

The ReadOnly lock can be tricky in certain situations. For example, a ReadOnly lock placed on a storage account prevents all users from listing the keys, because the list keys operation is handled through a POST request (since the returned keys are available for write operations). When you apply a lock at a parent scope, all child resources inherit the same lock. For example, if you apply a lock on a resource group, all the resources in it inherit it, and even resources you add later will inherit it.

Locking with PowerShell

Following snippet demonstrates how you can apply a resource lock using PowerShell.

New-AzureRmResourceLock -LockLevel <either CanNotDelete or ReadOnly> -LockName <lock name> -ResourceName <resource name> -ResourceType <resource type> -ResourceGroupName <resource group name>

Here you should provide the exact resource type. For a complete list of available Azure resource providers, please refer to this article.
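
As a filled-in illustration (the storage account and resource group names are examples), locking a storage account against deletion, and later removing the lock, looks like this:

```powershell
# Prevent accidental deletion of a storage account.
New-AzureRmResourceLock -LockLevel CanNotDelete -LockName "storage-delete-lock" `
    -ResourceName "jcbstorage01" -ResourceType "Microsoft.Storage/storageAccounts" `
    -ResourceGroupName "jcb-rg"

# Review the locks in the resource group, then remove the lock when done.
Get-AzureRmResourceLock -ResourceGroupName "jcb-rg"
Remove-AzureRmResourceLock -LockName "storage-delete-lock" -ResourceName "jcbstorage01" `
    -ResourceType "Microsoft.Storage/storageAccounts" -ResourceGroupName "jcb-rg" -Force
```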

Azure Resource Policies | Part 2

In my last post we discussed what Azure resource policies are and how they can help you to better manage your Azure deployments. Now it’s time to understand how to practically implement and use resource policies in Azure.

Control virtual machine SKUs in a resource group

In this example we are denying the creation of VMs other than the Standard_A1 SKU in a resource group by applying a custom resource policy.

First we create a resource group to apply this policy.

$ResourceGroup = New-AzureRmResourceGroup -Name protectedrg -Location "Southeast Asia"

Next we are going to define our policy. When applied to a resource group, this policy allows only Standard_A1 VMs to be created.

$PolicyDefinition = New-AzureRmPolicyDefinition -Name vmLockPolicy -DisplayName vmLockPolicy -Description "Allow only Standard_A1 virtual machines" -Policy '{
  "if": {
    "allOf": [
      {
        "field": "type",
        "equals": "Microsoft.Compute/virtualMachines"
      },
      {
        "not": {
          "field": "Microsoft.Compute/virtualMachines/sku.name",
          "in": [ "Standard_A1" ]
        }
      }
    ]
  },
  "then": {
    "effect": "deny"
  }
}'

Next we are going to assign the policy to our newly created resource group. First we retrieve the subscription and resource group name for the scope to which the custom policy will be assigned.

$Subscription = Get-AzureRmSubscription -SubscriptionName "MVP Personal"
$ResourceGroupName = $ResourceGroup.ResourceGroupName

Now we can assign the policy accordingly. Note that the scope path requires the subscription ID, not the subscription name.

$AssignPolicy = New-AzureRmPolicyAssignment -Name vmLockPolicyAssignment -PolicyDefinition $PolicyDefinition -Scope "/subscriptions/$($Subscription.Id)/resourceGroups/$ResourceGroupName"

Let’s try to create a virtual machine of the DS1 v2 SKU in the protectedrg resource group, as below.

arm-policy-vm-creation-1

However, the deployment activity fails as per our custom resource policy.

arm-policy-vm-creation-2

In the next post let’s discuss resource locking in Azure.

SQL RP Installation Failure in Azure Stack TP1 | Fix It

My good friend, CDM MVP Nirmal Thewarathanthri, and I have been experimenting with Azure Stack for a while now. Although we tried more than 30 times to install the SQL Resource Provider in our Azure Stack lab, it was never quite successful. The biggest problem is cleaning up the Azure Stack environment after each failure, as sometimes we had to do a fresh install from scratch.

The Epic Failure

Following were the symptoms of this issue.

  • The SQL VM installs just fine.
  • Deployment always fails at DSC configuration in the SQL VM.
  • The URL of the ARM template for the SQL VM seems to be no longer valid, as you can see.

Here is the full description of the error that we encountered.

VERBOSE: 8:54:27 AM – Resource Microsoft.Compute/virtualMachines/extensions ‘sqlrp/InstallSqlServer’ provisioning
status is running
New-AzureRmResourceGroupDeployment : 10:29:12 AM – Resource Microsoft.Compute/virtualMachines/extensions
‘sqlrp/InstallSqlServer’ failed with message ‘The resource operation completed with terminal provisioning state
‘Failed’.’
At D:\SQLRP\AzureStack.SqlRP.Deployment.5.11.61.0\Content\Deployment\SqlRPTemplateDeployment.ps1:207 char:5
+     New-AzureRmResourceGroupDeployment -Name “newSqlRPTemplateDeploym …
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [New-AzureRmResourceGroupDeployment], Exception
    + FullyQualifiedErrorId : Microsoft.Azure.Commands.Resources.NewAzureResourceGroupDeploymentCommand

New-AzureRmResourceGroupDeployment : 10:29:12 AM – An internal execution error occurred.
At D:\SQLRP\AzureStack.SqlRP.Deployment.5.11.61.0\Content\Deployment\SqlRPTemplateDeployment.ps1:207 char:5
+     New-AzureRmResourceGroupDeployment -Name “newSqlRPTemplateDeploym …
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [New-AzureRmResourceGroupDeployment], Exception
    + FullyQualifiedErrorId : Microsoft.Azure.Commands.Resources.NewAzureResourceGroupDeploymentCommand

The issue in this case was our unstable Internet connection. The ARM template for the SQL RP downloads the SQL Server 2014 ISO first. In our case, a timeout during the download stopped the entire process: once the VM was created, SQL Server 2014 wasn’t installed in it.

To solve this issue we followed the below procedure on a fresh installation of MAS TP1. You can try this on an existing installation with a failed SQL RP deployment, but there’s no guarantee that you will be able to clean up the existing resource group. If you have executed the SQL RP installation only once, cleanup may work, but if you have tried it multiple times there’s a high chance of failing to clean up the existing resource group(s).

  1. Download the SQL image from here.
  2. Open the default PIR image. This is available on the MAS TP1 host at \\sofs\Share\CRP\PlatformImage\WindowsServer2012R2DatacenterEval\WindowsServer2012R2DatacenterEval.vhd
  3. Once you mount the VHD (simply double-click to mount it), create a new folder called SQL2014 under the C:\ drive of the PIR image.
  4. Copy all files from the downloaded ISO into the SQL2014 folder.
  5. Start the deployment script. If you are trying this on an existing failed deployment, re-run the deployment after cleaning up the existing resource group(s) for the SQL RP.

Once all the deployment tasks are completed you can see a successfully deployed SQL Resource Provider in the portal as below.

SQL RP Success (1)

You can refer to the MSFT guide on how to add a SQL resource provider to a MAS TP1 deployment here for more information.

Azure Resource Policies | Part 1

Any data center, whether on-premises or in the cloud, should adhere to certain organizational compliance policies. If your organization is using Microsoft Azure and wants your resources to adhere to the resource conventions and standards that govern your organization’s data center policy, how would you do that? For example, you may want to prevent person A from creating VMs larger than Standard A2. The answer is to leverage custom resource policies, assigning them at the desired level, be it a subscription, a resource group or an individual resource.

Is it same as RBAC?

No, it isn’t. Role-Based Access Control in Azure is about the actions a user or a group can perform, while policies are about actions that can be applied at the resource level. As an example, RBAC sets different access levels in different scopes, while policies can control what types of resources can be provisioned, or in which locations those resources can be provisioned, in a resource group or subscription. The two work together: in order to use a policy, a user should be authenticated through RBAC.

Why do we need custom policies?

Imagine that you need to calculate chargeback for your Azure resources by team or department. Certain departments will need to have a limited consumption imposed, and you need to charge the proper business unit at the end. Your organization may also want to restrict which resources are provisioned in Azure, or where. For example, you may want to impose a policy that allows users to create Standard A2 VMs only in the West Europe region. Another good example is restricting the creation of load balancers in Azure for all teams except the network team.

Policy Structure

Like all ARM artifacts, policies are written in JSON format and contain a control structure. You need to specify a condition and what to perform when that condition is met, much like an IF/THEN statement. There are two key components in a custom Azure resource policy.

Condition/logical operators, which contain a set of conditions that can be manipulated through a set of logical operators.

Effect, which describes the action that will be performed when the condition is satisfied: either deny, append or audit. If you create an audit effect, it will trigger a warning event in the service log. As an example, your policy can trigger an audit if someone creates a VM larger than Standard A2.

  • Deny generates an event in the audit log and fails the request
  • Audit generates an event in audit log but does not fail the request
  • Append adds the defined set of fields to the request
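
As an illustration of the audit effect above, the rule below (a sketch; the location list is an example) would log a warning, without blocking the request, whenever a resource is created outside West Europe:

```json
{
  "if": {
    "not": {
      "field": "location",
      "in": [ "westeurope" ]
    }
  },
  "then": {
    "effect": "audit"
  }
}
```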

Following is the simple syntax for creating an Azure Resource policy.

{
  "if": {
    <condition> | <logical operator>
  },
  "then": {
    "effect": "deny | audit | append"
  }
}

Evaluating policies

A policy is evaluated at resource creation time, or when a template deployment happens, using an HTTP PUT request. If you are deploying a template, it will be evaluated during the creation of each resource in the template.

In the next post let’s discuss some practical use cases of using Azure resource policies to regulate your resources.