Category Archives: Azure Site Recovery

Process Server fails to communicate after ASR Update Rollup 22

Recently I worked on an ASR implementation for a customer that required upgrading MARS from version 9.8 to 9.13. Microsoft set a deadline of 28th February for this update; after that date, enabling replication with older versions of MARS would no longer work. Here is what you see when you are prompted to perform this upgrade.

Since we were running an incompatible version of MARS (Microsoft states that the DRA must be no older than version n-4 to perform this update successfully), we had to perform a step upgrade from 9.8 to 9.10 and then to 9.13. This link provides a reference for that.
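
The step-upgrade decision can be sketched as a small check. This is only an illustration under two assumptions of mine, not something taken from Microsoft documentation: that "n-4" means the installed version may be at most four minor releases behind the target, and that minor releases are numbered consecutively (9.8, 9.9, 9.10 and so on). The function names are made up for this example.

```python
from __future__ import annotations

# Assumption: "n-4" = at most four minor releases behind the target.
MAX_VERSIONS_BEHIND = 4


def parse_version(version: str) -> tuple[int, int]:
    """Return (major, minor) from a string such as '9.13' or '9.13.4567.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def can_upgrade_directly(installed: str, target: str,
                         max_behind: int = MAX_VERSIONS_BEHIND) -> bool:
    """True if 'installed' is close enough to 'target' for a single-hop upgrade."""
    inst_major, inst_minor = parse_version(installed)
    tgt_major, tgt_minor = parse_version(target)
    if inst_major != tgt_major:
        # Crossing a major version is outside what this sketch models.
        return False
    gap = tgt_minor - inst_minor
    return 0 < gap <= max_behind


if __name__ == "__main__":
    for installed, target in [("9.8", "9.13"), ("9.8", "9.10"), ("9.10", "9.13")]:
        print(installed, "->", target, can_upgrade_directly(installed, target))
```

Under these assumptions the direct hop from 9.8 to 9.13 is rejected, while each of the two hops we actually performed is accepted.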

The Issue

Both process servers in the environment were successfully upgraded to 9.13 via the step upgrade. But as soon as the latest version was installed, both the config server and the secondary process server lost communication with the ASR vault, and the refresh server connection task kept failing for no apparent reason.

We raised this with Microsoft support, and the support engineer assigned to our case found one alarming entry in the ASR operational log in Windows Event Viewer on the config server.

 

System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.IdentityModel, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified.
File name: 'Microsoft.IdentityModel, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35'
   at SrsRestApiClientLib.SrsCreds.InitializeServiceUrl()
   at SrsRestApiClientLib.SrsCreds..ctor(String resourceId, String siteId, String draId, X509Certificate2 cert, AcsConfiguration acsConfig, Boolean retryFetch, String apiVersion, Proxy webProxy, String msiVersion, String vmmVersion)
   at SrsRestApiClientLib.ClientHelper.InitializeDra(String resourceId, String siteId, String draId, X509Certificate2 cert, AcsConfiguration acsConfig, String apiVersion, Proxy webProxy, String msiVersion, String vmmVersion)
   at Dra.SrsCommunication.SrsCommunicationClient.InitializeSrsCommunication()
   at Dra.SrsCommunication.SrsCommunicationClient..ctor(IDraFabricAdapter fabricAdapter)
   at Dra.Dra.Initialize(IDraFabricAdapter fabricAdapter)
   at Dra.DraFactory.CreateInstance(IDraFabricAdapter fabricAdapter)
   at Microsoft.VirtualManager.Engine.DRA.Core.Core.Initialize()
WRN: Assembly binding logging is turned OFF.
To enable assembly bind failure logging, set the registry value [HKLM\Software\Microsoft\Fusion!EnableLog] (DWORD) to 1.
Note: There is some performance penalty associated with assembly bind failure logging.
To turn this feature off, remove the registry value [HKLM\Software\Microsoft\Fusion!EnableLog].
,{00000000-0000-0000-0000-000000000000}



======================

The Culprit

As you may have already noticed, there is an exception logged stating that the DRA cannot find ‘Microsoft.IdentityModel, Version=3.5.0.0’, which is part of the Windows Identity Foundation 3.5 role in Windows Server. When we checked the installed roles on the config server, this role wasn’t present, so we enabled it again to see whether it would make any difference. Voilà! The process servers re-established communication with the vault soon after the role was installed.
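
If you have to sift through a long event log export, a small script can pull the missing assembly out of the exception text. The sketch below assumes the log has been saved as plain text and that the message follows the standard .NET "Could not load file or assembly" wording seen above. The function name and the hint table are my own, and the only hint in it is the one we learned from this incident.

```python
import re
from typing import Dict, Optional

# Matches the standard .NET assembly load failure message.
ASSEMBLY_PATTERN = re.compile(
    r"Could not load file or assembly '(?P<name>[^,']+), Version=(?P<version>[\d.]+)"
)

# Hypothetical lookup table; the single entry comes from the case in this post.
KNOWN_HINTS = {
    "Microsoft.IdentityModel": "Check that Windows Identity Foundation 3.5 is installed.",
}


def missing_assembly(log_text: str) -> Optional[Dict[str, str]]:
    """Return the name and version of the first assembly that failed to load."""
    match = ASSEMBLY_PATTERN.search(log_text)
    if match is None:
        return None
    return {"name": match.group("name"), "version": match.group("version")}


SAMPLE = (
    "System.IO.FileNotFoundException: Could not load file or assembly "
    "'Microsoft.IdentityModel, Version=3.5.0.0, Culture=neutral, "
    "PublicKeyToken=31bf3856ad364e35' or one of its dependencies."
)

if __name__ == "__main__":
    found = missing_assembly(SAMPLE)
    if found:
        print(found["name"], found["version"])
        print(KNOWN_HINTS.get(found["name"], "No hint recorded for this assembly."))
```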

Aftermath

The ASR engineering team has confirmed that the DRA has a prerequisite check for whether the config server has Microsoft .NET Framework 4.5 installed. Furthermore, with .NET 4.5, WIF is fully integrated into the .NET Framework, meaning that it is installed automatically alongside .NET 4.5 and should have been present in this case. They tried to reproduce the issue and verified that no problems are observed in an upgrade from 9.10 to 9.13, as there is no option to disable WIF during the .NET Framework 4.5 installation (the DRA runs on a minimum of .NET 4.5, and the latest requirements support .NET Framework 4.6.2).

My best bet is that this happened when we performed the upgrade from 9.8 to 9.10, as we haven’t tried to reproduce that scenario. The config server hasn’t had any major change in its configuration, so the only suspect is the 9.8 to 9.10 upgrade. I’m exploring that possibility right now and will publish another post if my theory is confirmed. Until then, it remains a mystery.

Protecting Azure IaaS VMs with Managed Disks with Site Recovery

The Microsoft Azure Site Recovery team has just announced the capability to protect Azure IaaS VMs with managed disks using ASR, as a public preview in all ASR-enabled regions. This is an important step, as a lot of customers were waiting to protect their VM workloads with managed disks rather than relying on the previous unmanaged disk scenario.

A2A Managed Disk Architecture

The challenge here was how to replicate a disk without requiring a VM object. To overcome this hurdle, ASR uses one storage account in the source region, which caches the disk data on its way to the target region. This enables ASR to create a replica managed disk in the target region for each VM protected in the primary region, and this replica disk is the data store for the source disk in the primary region. One important thing to note is that the initial replication between the source and target happens externally using a snapshot at the VM level, and so do the delta syncs.

Things to Remember

  • VM protection can be enabled via the VM settings blade or in the Recovery Services vault settings.
  • If you have VMs with unmanaged disks that are currently protected by ASR, you need to disable protection and convert the VMs to managed disks first, as conversion from unmanaged to managed disks while protected by ASR is not supported.
  • You can select the type of the replica disks, standard or premium. See the screenshot below.
  • Only one cache storage account is needed to store the data changes sent from the source to the target region, but you can use multiple cache accounts per VM as well.

Azure Site Recovery | New Onboarding Experience for VMWare to Azure Workloads

Microsoft has introduced a new onboarding experience with the latest update to the Azure Site Recovery service for VMware to Azure workloads. In this blog post we are going to explore how this new experience saves time when you set up your ASR infrastructure on-premises.

Open Virtualization Format (OVF) template-based configuration server deployment

Previously you had to download and install the ASR configuration server package onto a VMware VM running a supported OS that you had created earlier. With this update you use an OVF template, which you can import directly into a VMware host as a guest VM that functions as the configuration server for your ASR setup. All the necessary software, except MySQL Server 5.7.20 and VMware PowerCLI 6.0, is pre-installed in this VM template.

The video below from the Microsoft ASR product team explains how you can leverage the new OVF template setup.

Web Portal for Configuration Management

With this update, Microsoft has introduced a new web portal in the configuration server. All configuration servers deployed using the OVF template will use this portal to modify the following settings.

New Mobility Service Deployment model

If you are familiar with the VMware to Azure scenario in ASR, then you know the difficulties of the mobility service push deployment method for your VMware VMs. Previously this required you to open firewall rules for the WMI and File and Printer Sharing services in Windows on the protected VMs, because the ASR service used them to push-install the mobility service. Not every organization allows this firewall exception in production environments.

In the latest ASR release, VMware Tools is used to install and update the mobility service on all protected VMware VMs, removing the need to open the above-mentioned services in firewall rules. One thing to keep in mind is that VMware Tools based mobility service installation is available only if you update your configuration servers to version 9.13.xxxx.x.

Azure Site Recovery Comprehensive Monitoring

One of the challenges I have had with Azure Site Recovery is providing a Business Continuity Dashboard experience for the IT administrators in my customer organizations. Previously I was able to achieve this to some extent by creating Azure Dashboards that showcase the components of the ASR environment. However, Microsoft has recently introduced an out-of-the-box comprehensive monitoring dashboard experience for Azure Site Recovery. It gives full visibility into whether an organization’s business continuity objectives are being met, along with a failover readiness model that monitors resource availability and suggests configurations based on best practices.

What’s new in Azure Site Recovery Monitoring

Below is the new dashboard you see when you navigate to the Overview section of your recovery services vault in Azure.

  • Enhanced vault overview page – The new vault overview page features a dashboard that presents everything you need to know to understand if your business continuity objectives are being met. In addition to the information needed to understand the current health of your business continuity plan, the dashboard features recommendations based on best practices, and in-built tooling for troubleshooting issues that you may be facing.
  • Replication health model – Continuous, real time monitoring of replication health of servers based on an assessment of a wide range of replication parameters.
  • Failover readiness model – A failover readiness model based on a comprehensive checklist of configuration and disaster recovery best practices, and resource availability monitoring, to help gauge your level of disaster preparedness.
  • Simplified troubleshooting experience – Start at the vault dashboard and dive deeper using an intuitive navigational experience to get in depth visibility into individual components, and additional troubleshooting tools including a brand new dashboard for replicated machines.
  • In-depth anomaly detection tooling to detect error symptoms, and offer prescriptive guidance for remediation.

Azure Site Recovery Updates | Support for Large Disks

Microsoft Azure recently announced the support for large disks up to 4 TB. Now Azure Site Recovery supports protecting on-premises VMs and physical servers with disks up to 4095 GB in size to Azure. Many customers use disks with more than 1 TB in capacity for various reasons. A good example would be SQL databases and file servers. The availability of large disks in Azure allows you to leverage ASR as a DR solution for your datacenter infrastructure. 

Large disks in Azure are available in both standard and premium tiers. Standard disks offer two sizes, S40 (2 TB) and S50 (4 TB), for both managed and unmanaged disks. If you have IO-intensive workloads that require premium storage, you can use P40 (2 TB) and P50 (4 TB) for both managed and unmanaged disks.
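
To make the size tiers concrete, here is a minimal sketch that maps a disk size to one of the four large disk sizes named above. The capacity figures of 2048 GB and 4095 GB are my reading of "2 TB", "4 TB" and the 4095 GB limit mentioned earlier. The function name is made up, and disks of 1 TB or less are deliberately left out because their sizes are not covered in this post.

```python
# Large disk sizes from this post as (name, capacity in GB).
# Assumption: 2 TB = 2048 GB, 4 TB = 4095 GB (the stated ASR limit).
LARGE_DISK_SKUS = {
    "standard": [("S40", 2048), ("S50", 4095)],
    "premium": [("P40", 2048), ("P50", 4095)],
}

LARGE_DISK_THRESHOLD_GB = 1024  # "large" here means greater than 1 TB


def pick_large_disk_sku(size_gb: int, tier: str = "standard") -> str:
    """Return the smallest large-disk size that fits a disk of size_gb."""
    if tier not in LARGE_DISK_SKUS:
        raise ValueError(f"unknown tier: {tier!r}")
    if size_gb <= LARGE_DISK_THRESHOLD_GB:
        raise ValueError("not a large disk; smaller sizes are not covered here")
    for sku, capacity_gb in LARGE_DISK_SKUS[tier]:
        if size_gb <= capacity_gb:
            return sku
    raise ValueError(f"{size_gb} GB exceeds the 4095 GB limit")


if __name__ == "__main__":
    print(pick_large_disk_sku(1500))             # standard tier
    print(pick_large_disk_sku(3000, "premium"))  # premium tier
```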

Pre-requisites for protecting VMs with large disks in ASR

You need to make sure that your on-premises ASR infrastructure components are up to date before you start protecting VMs and/or physical servers with disks greater than 1 TB in size.

  • VMware/Physical servers – Install the latest update on the Configuration server, additional process servers, additional master target servers and agents.
  • SCVMM managed Hyper-V environments – Install the latest Microsoft Azure Site Recovery Provider update on the on-premises VMM server.
  • Standalone Hyper-V servers not managed by SCVMM – Install the latest Microsoft Azure Site Recovery Provider on each Hyper-V server that is registered with Azure Site Recovery.

Note that protecting Azure VMs with large disks is not a currently supported scenario. 

Azure Site Recovery Updates | Storage Spaces & Windows Server 2016

Microsoft has recently announced a preview for protecting Azure IaaS VMs with ASR. Now you can protect Azure VMs running Windows Server 2016. ASR also supports protecting Azure IaaS VMs that use Storage Spaces. Storage Spaces allow you to improve IO performance by striping disks and to create logical disks larger than 4 TB.

The following is a list of all supported OS versions that can be protected using ASR.

Windows
  • Windows Server 2016 (Server Core and Server with Desktop Experience)
  • Windows Server 2012 R2
  • Windows Server 2012
  • Windows Server 2008 R2 SP1 and above
Linux
  • Red Hat Enterprise Linux 6.7, 6.8, 7.0, 7.1, 7.2, 7.3
  • CentOS 6.5, 6.6, 6.7, 6.8, 7.0, 7.1, 7.2, 7.3
  • Ubuntu 14.04/16.04 LTS Server (only supported kernel versions)
  • SUSE Linux Enterprise Server 11 SP3
  • Oracle Enterprise Linux 6.4, 6.5 running either the Red Hat compatible kernel or Unbreakable Enterprise Kernel Release 3 (UEK3)
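
If you need to screen an inventory against this list, it can be captured as a simple lookup table. The sketch below is only a convenience and simplifies two entries: "2008 R2 SP1 and above" is stored as a single value, and the Ubuntu kernel version restriction is not checked at all. The function name and the key spelling are my own choices.

```python
# Supported guest OS versions, transcribed from the list above.
SUPPORTED_OS = {
    "Windows Server": {"2016", "2012 R2", "2012", "2008 R2 SP1"},
    "Red Hat Enterprise Linux": {"6.7", "6.8", "7.0", "7.1", "7.2", "7.3"},
    "CentOS": {"6.5", "6.6", "6.7", "6.8", "7.0", "7.1", "7.2", "7.3"},
    "Ubuntu": {"14.04", "16.04"},  # kernel version is NOT checked here
    "SUSE Linux Enterprise Server": {"11 SP3"},
    "Oracle Enterprise Linux": {"6.4", "6.5"},
}


def is_supported(os_name: str, version: str) -> bool:
    """Return True if the OS/version pair appears in the list above."""
    return version in SUPPORTED_OS.get(os_name, set())


if __name__ == "__main__":
    inventory = [("CentOS", "7.3"), ("CentOS", "7.4"), ("Ubuntu", "16.04")]
    for os_name, version in inventory:
        status = "supported" if is_supported(os_name, version) else "not listed"
        print(f"{os_name} {version}: {status}")
```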

Azure Site Recovery updates | Managed Disks & Availability Sets

The Azure Site Recovery team has made some significant improvements to the service during the past couple of months. Recently Microsoft announced support for managed disks and availability sets with ASR.

Managed Disks in ASR

Managed disks simplify disk management for Azure IaaS VMs, and users no longer have to leverage storage accounts to store the VHD files. With ASR, you can attach managed disks to your IaaS VMs during a failover or migration to Azure. Additionally, using managed disks ensures reliability for VMs placed in availability sets by guaranteeing that the failed-over VMs are automatically placed in different storage scale units (stamps) to avoid any single point of failure.

Availability Sets in ASR

Site Recovery now supports configuring VMs into availability sets in the ASR VM settings. Previously users had to rely on a script integrated into the recovery plan to achieve this. Now you can configure availability sets before the failover, so you do not need to rely on any manual intervention.

Below are some considerations to be made when you are using these two features.

  • Managed disks are supported only in the Resource Manager deployment model.
  • VMs with managed disks can only be part of availability sets with the “Use managed disks” property set to Yes.
  • Creation of managed disks will fail if the replication storage account was encrypted with Storage Service Encryption (SSE). If this happens during a failover, you can either set “Use managed disks” to “No” in the Compute and Network settings for the VM and retry the failover, or disable protection for the VM and protect it to a storage account without Storage Service Encryption enabled.
  • For SCVMM managed or unmanaged Hyper-V VMs, use this option only if you plan to migrate to Azure. Failback from Azure to an on-premises Hyper-V environment is not currently supported for VMs with managed disks.
  • Disaster recovery of Azure IaaS machines with managed disks is not currently supported.

Savision Free Whitepaper | MVP Peter de Tender

Business Continuity & Disaster Recovery is a critical function in today’s IT business. Microsoft Azure Site Recovery is a premier solution that can reduce your BCDR cost drastically. Planning a DR solution requires a lot of effort and time, and with Azure Site Recovery enterprises can have a bulletproof BCDR solution within a couple of days, where you only pay when a disaster has actually happened.

In Savision’s newest free whitepaper, MVP Peter de Tender explains why you should focus on building an effective Disaster Recovery Plan for your virtualized data center. The whitepaper explains:

  • How to leverage Microsoft Azure Site Recovery to build a DR solution for Hyper-V
  • Azure Site Recovery for VMware & physical servers
  • Leveraging Azure IaaS for a hybrid data center

You can download the whitepaper from here.

 

Static IP configuration is missing in E2A Azure Site Recovery

Azure Site Recovery is a great, cost-effective platform to host your DR sites. These days most of my time is spent on this technology, and I’m experimenting with new things every day. Troubleshooting ASR is not easy, as the available information is relatively sparse in some cases.

In one of my ASR deployments I noticed the issue below.

As per the Microsoft documentation for the SCVMM to ASR scenario, we can configure the protected VM in ASR to have a predefined IP address from a mapped virtual network. The guidelines read as “If the network adapter of source virtual machine is configured to use static IP then the user can provide the IP for the target virtual machine. User can use this capability to retain the ip of the source virtual machine after a failover. If no IP is provided any available IP would be given to network adapter at the time of failover. In case the target IP provided by user is already used by some other virtual machine that is already running in Azure then the failover would fail.”

I enabled replication on one VM and checked the configuration section, and guess what: the only option available was DHCP.

[Image: ASR DHCP 1]

Solution

ASR sees what VMM can see. In this case the on-premises logical network didn’t have any static IP pool assigned to it. When I checked the VM properties in VMM, I noticed that it also showed the IP as DHCP.

[Image: ASR DHCP VM 2]

Below are the steps I performed to overcome this issue.

  1. Create a static IP pool for the logical network. As I didn’t use network virtualization, I didn’t need to create a static IP pool for my VM network. You can follow this guide to create a static IP pool in a logical network.
  2. This static IP pool should cover the same range that you used for your VMs. If you click the Connection details button in the above screen, you can get the actual IP address assigned at the OS level and determine the range.
  3. The next step is to refresh the virtual machines. Once you refresh a VM and check the network adapter properties in SCVMM, it will display the IP as static.
  4. I had already replicated one VM. For that one I had to disconnect it from the on-premises network (Connectivity > Not Connected in the above screen) and connect it again to the same VM network. Then I did a VM refresh and, et voilà, I could see the static IP option in the ASR portal.
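
Before creating the pool in step 2, it can help to confirm that every VM address you collected actually falls inside the range you are about to define. The sketch below uses Python’s standard ipaddress module. The VM names, addresses and pool boundaries are made-up examples, and the function names are my own; nothing here talks to VMM or ASR.

```python
import ipaddress
from typing import Dict, List


def ip_in_pool(vm_ip: str, pool_start: str, pool_end: str) -> bool:
    """Return True if vm_ip lies between pool_start and pool_end (inclusive)."""
    address = ipaddress.ip_address(vm_ip)
    return ipaddress.ip_address(pool_start) <= address <= ipaddress.ip_address(pool_end)


def vms_outside_pool(vm_ips: Dict[str, str], pool_start: str, pool_end: str) -> List[str]:
    """Return the sorted names of VMs whose address is not covered by the pool."""
    return sorted(
        name for name, ip in vm_ips.items()
        if not ip_in_pool(ip, pool_start, pool_end)
    )


if __name__ == "__main__":
    # Hypothetical addresses read from the Connection details of each VM.
    vm_ips = {"web01": "10.0.0.25", "sql01": "10.0.1.7"}
    print(vms_outside_pool(vm_ips, "10.0.0.10", "10.0.0.50"))
```

Any VM reported by this check would need a wider pool (or a different one) before the refresh in step 3 can show its IP as static.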

[Image: ASR DHCP 3]