Business Continuity & Disaster Recovery (BCDR) is a critical function in today’s IT business. Microsoft Azure Site Recovery is a premier solution that can reduce your BCDR cost drastically. Planning a DR solution requires a lot of effort and time. With Azure Site Recovery, enterprises can have a bulletproof BCDR solution within a couple of days, where you only pay when a disaster has actually happened.
In Savision’s newest free whitepaper, MVP Peter de Tender explains why you should focus on building an effective Disaster Recovery Plan for your virtualized data center. The whitepaper covers:
- How to leverage Microsoft Azure Site Recovery to build a DR solution for Hyper-V
- Azure Site Recovery for VMware & physical servers
- Leveraging Azure IaaS for a hybrid data center
You can download the whitepaper from here.
Azure Site Recovery is a great product for those who want to set up their DR environment at minimal cost. It is based on Hyper-V Replica technology for Hyper-V workloads and supports replication of VMware & physical server workloads to DR as well. Today I’m going to discuss a common issue you can encounter when enabling ASR replication to the cloud.
I’ve been working on an ASR setup for a couple of months and encountered a strange issue when I enabled replication for the protected VMs.
The enable protection job fails with the error below.
Job ID: f9f84765-b18c-4002-96a4-d420dfb76ea6-2015-05-14 10:00:29Z
Start Time: 5/14/2015 3:30:29 PM
Duration: 10 MINUTES
Protection couldn’t be enabled for the virtual machine. (Error code: 70094)
Provider error: Unable to complete the request. Operation on the <Hyper-V Node> timed out.
Try the operation again. (Provider error code: 2924)
Possible causes: Protection can’t be enabled with the virtual machine in its current state. Check the Provider errors for more information.
Recommendation: Fix any issues in the Event Viewer logs (Applications and Service Logs – MicrosoftAzureRecoveryServices) on the Hyper-V host server. If this virtual machine is enabled for replication on the Hyper-V host, disable this setting. Then try to enable protection again.
UTC Time: Thu May 14 2015 10:15:59 GMT+0530 (Sri Lanka Standard Time)
Browser: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36
Portal Version: 5.4.00298.11 (rd_auxportal_stable.150511-1702)
Email Address: firstname.lastname@example.org (MSA)
On the Hyper-V host in question, the following error was logged in the event logs.
Enable replication failed for virtual machine ‘XXXXXX’ due to a network communication failure. (Virtual Machine ID 807780f6-bb7c-48d5-937d-4857a654dec3, Data Source ID 2256321007502018113, Task ID 8c1a5d7d-0693-4d6b-9243-37cc5e96a7d6)
This ASR setup was an on-premises to cloud scenario with a single SCVMM server.
After spending a good number of troubleshooting hours I finally figured out what went wrong. The Hyper-V hosts themselves need Internet connectivity to replicate the VMs to ASR. If you cannot give the Hyper-V hosts direct Internet connectivity, you should route them through a proxy. You can change the proxy settings in the ASR Provider on the Hyper-V host.
ASR replication requires traffic to be sent over port 443 (SSL), and in my case only the SCVMM server was configured with Internet access. If you are using a proxy server, you may need to allow the following for successful replication.
- Allow the IP addresses in the Azure Datacenter IP Ranges over HTTPS (443). Your IP address whitelist should include the ranges of your primary region as well as the West US ranges.
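Before digging through job logs, it can help to confirm that the Hyper-V host itself can open an outbound connection on port 443. The following is a minimal Python sketch, assuming Python is available on the host. The function name `can_reach` is hypothetical, and the endpoints to test are passed on the command line rather than hard-coded, because the right ones depend on your region. It only tests a direct TCP connection, so it does not prove that a proxy is configured correctly.

```python
import socket
import sys


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a direct TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


if __name__ == "__main__":
    # Endpoints are supplied by the caller; none are assumed here.
    for endpoint in sys.argv[1:]:
        status = "reachable" if can_reach(endpoint) else "NOT reachable"
        print(f"{endpoint}:443 is {status}")
```

Run it on each Hyper-V host, not only on the SCVMM server, since in the scenario above it was the hosts that lacked Internet access.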
One of the most annoying problems with ASR is that it cannot protect VMs with VHDs larger than 1 TB. The OS drive limitation has already been addressed by Microsoft (refer to my previous blog post), but the 1 TB cap still applies to data disks. This is a limitation in Azure itself as of now. Let’s look at a workaround to overcome this barrier.
The solution is to create a new striped volume made up of multiple smaller VHD images of less than 1 TB each. We then copy the data from the old VHD to the new striped volume and remove the old VHD.
- The VM should be shut down.
- Add enough 1 TB VHDs to the VM to accommodate the contents of the VHD that is larger than 1 TB. Keep in mind these VHDs are dynamic, not fixed.
- Start the VM and stop any application services that are running, e.g. SQL.
- Go to Computer Management > Disk Management. If prompted to initialize the new VHDs, click OK and proceed.
- Right-click one of the new unallocated volumes, and then click Create striped volume.
- Select all the new volumes that are displayed in the wizard to create a striped volume.
- Assign a temporary drive letter (e.g. F:) to the new drive and format it as NTFS.
- Now you can copy the data between the two drives. Use the robocopy command below to do so.
robocopy E:\ F:\ /mir
- Change the drive letter of the new disk to the drive letter of the old disk. (Swap the drive letters)
- In a PowerShell window (as administrator) run diskpart.
- Type SAN POLICY=OnlineAll.
- Shut down the VM and remove the old VHD image from the VM.
- Start the VM and the services that you stopped earlier, and try to protect the VM with Site Recovery again.
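To decide how many VHDs to add in the second step, a small calculation helps. The sketch below is only an illustration: `plan_stripe` is a hypothetical helper, the 1000 GB per-disk ceiling is an assumed value chosen to stay under the 1 TB cap, and the minimum of two disks reflects that a striped volume needs at least two disks. It sizes for the data only, so add headroom for growth.

```python
import math
from typing import Tuple


def plan_stripe(data_gb: float, max_vhd_gb: float = 1000.0) -> Tuple[int, int]:
    """Return (number of VHDs, size of each VHD in GB) for a striped volume."""
    if data_gb <= 0 or max_vhd_gb <= 0:
        raise ValueError("sizes must be positive")
    # A striped volume spans at least two disks.
    count = max(2, math.ceil(data_gb / max_vhd_gb))
    # Striping uses equal space on every member disk.
    per_disk_gb = math.ceil(data_gb / count)
    return count, per_disk_gb


print(plan_stripe(2500))  # (3, 834)
```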
We have had a basketful of new features from Microsoft Azure within the last two weeks, and as an Azure consultant it’s becoming harder for me to even close my eyes, knowing that something new will be there in the morning. Today I thought of sharing a glimpse of two cool additions to the Microsoft cloud that enhance IT pro productivity.
PowerShell Support for Azure Site Recovery
This is a much-awaited update. As of today Azure Site Recovery can be fully implemented using PowerShell. Imagine you want to generate an HTML report on your last failover. Using the Get-AzureSiteRecoveryJob cmdlet, for instance, you can write a nice script to achieve this.
For a full reference of the available Azure Site Recovery PowerShell cmdlets, refer to this TechNet article. Note that these cmdlets are only available in the Azure PowerShell October 2014 package, so it’s time to update your binaries.
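To give an idea of the reporting step, here is a rough Python sketch that turns a list of job records into an HTML table. It does not call Azure at all: the record fields (`name`, `state`, `start_time`) and the function `jobs_to_html` are hypothetical, and it assumes you have already exported the job details (for example from the output of Get-AzureSiteRecoveryJob) into plain dictionaries.

```python
from html import escape
from typing import Dict, List


def jobs_to_html(jobs: List[Dict[str, str]]) -> str:
    """Render job records as a simple HTML table, escaping all values."""
    header = "<tr><th>Job</th><th>State</th><th>Start time</th></tr>"
    rows = [
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            escape(job["name"]), escape(job["state"]), escape(job["start_time"])
        )
        for job in jobs
    ]
    return "<table>\n" + "\n".join([header] + rows) + "\n</table>"


# Sample record with made-up values, for illustration only.
sample = [{"name": "Planned failover", "state": "Succeeded",
           "start_time": "2015-05-14 10:00"}]
print(jobs_to_html(sample))
```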
Network Security Groups
Network Security Groups are now GA and give you more granular control over your VM networks, such as implementing DMZs and network segmentation. If you are hosting a 3-tier application in Azure and want to implement strict traffic filtering in each tier, this is the ideal solution for you.
Currently you can use this only via PowerShell or the REST APIs. There are also a few limitations to consider with Network Security Groups. But then again, the platform is evolving and the Azure team is working very hard to overcome any obstacles.
For a complete reference of the new feature, refer to the announcement on the Azure Blog.
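The idea of per-tier filtering can be pictured with a small model. The sketch below is a simplified illustration in Python, not the Azure API: the `Rule` class, the subnets and the default-deny fallback are assumptions made for the example. What it does mirror is that rules are processed in priority order, with the lowest number evaluated first and the first match deciding the outcome.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    priority: int   # lower number is evaluated first
    source: str     # source range in CIDR notation
    port: int       # destination port
    allow: bool


def is_allowed(rules: List[Rule], src_ip: str, port: int) -> bool:
    """Return the action of the first matching rule; deny if none match."""
    addr = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port == port and addr in ipaddress.ip_network(rule.source):
            return rule.allow
    return False  # assumed default for this model


# Hypothetical database tier: only the app subnet may reach port 1433.
db_tier = [
    Rule(priority=100, source="10.0.2.0/24", port=1433, allow=True),
    Rule(priority=200, source="10.0.0.0/16", port=1433, allow=False),
]
print(is_allowed(db_tier, "10.0.2.5", 1433))  # True  (app tier)
print(is_allowed(db_tier, "10.0.1.5", 1433))  # False (web tier)
```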
Hi folks, we have successfully deployed an ASR solution throughout this series, and today we are going to look at some FAQs people have. First let’s take a look at how to clean up the demo environment that we set up for ASR.
Cleanup your ASR Deployment.
There are actually two ways to do this.
- Remove the VMM server from the registered servers in the ASR vault. This disassociates the VMM server from your ASR vault. You will still need to remove the storage account and the ASR vault manually from the Azure portal if you no longer require them. You may also need to remove obsolete registry entries of the VMM server as described here.
- Use the cleanup script from TechNet. This is useful when you no longer have access to the Azure account. It’s quite simple: all you have to do is run the PowerShell script on the VMM server that has been registered with ASR. The script removes the registration information and the cloud pairing information of the VMM server’s protected clouds from Azure. If you are not the administrator of the Azure account, using this script is much safer.
You can also disable protection for individual VMs. Refer to this article from TechNet to do so.
Let’s take a look at some common FAQs related to ASR.
Q. I have an existing VMware environment. Can I leverage ASR for DR in my environment?
A. Yes & No. For VMware workloads Microsoft has a separate product called Microsoft Migration Accelerator. You can use it to move your VMs to the cloud from AWS or VMware. For replication, InMage Scout by Microsoft is the best tool.
Q. What are the system requirements for ASR?
A. The requirements are listed below. Note that ASR doesn’t support the VHDX file format yet, as it is still not available in Azure. There are also a number of compliant Linux distros that are supported as guest OS.
- An Azure Subscription
- Management certificate
- System Center Virtual Machine Manager 2012 R2
- Windows Server 2012 R2 Hyper-V – used as VM host
- Gen 1, fixed disk .vhd VMs in Hyper-V
- Guest OS Windows Server 2008 or later
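If you have many VMs to review, the VM-level items in this list can be turned into a quick pre-check. This is a minimal sketch only: `protection_blockers` and the dictionary keys are hypothetical, it assumes you have already collected each VM’s generation and disk format from VMM, and it checks just those two items from the list above.

```python
from typing import Dict, List


def protection_blockers(vm: Dict[str, object]) -> List[str]:
    """Return the reasons a VM does not meet the listed requirements."""
    blockers = []
    if vm.get("generation") != 1:
        blockers.append("VM must be Generation 1")
    if str(vm.get("disk_format", "")).lower() != "vhd":
        blockers.append("disks must use the .vhd format (VHDX is not supported)")
    return blockers


# Made-up inventory entries, for illustration only.
print(protection_blockers({"generation": 1, "disk_format": "vhd"}))   # []
print(protection_blockers({"generation": 2, "disk_format": "vhdx"}))
```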
Q. I have full System Center suite deployed in my environment. How can I leverage that with ASR?
A. Since all System Center products support Microsoft Azure, this depends on how you want to use them with Azure. For example, you can use Orchestrator runbooks for automatic fail over to Azure, SCOM to generate alerts during the fail over window, and so on.
Q. How can I get the pricing information for ASR?
A. Visit this link to learn more about the product and pricing
Azure Site Recovery is an emerging and continuously evolving product. With the announcement of vNext of System Center & the Windows Server platform, we can expect a lot of new & exciting features with Azure pretty soon.
To test the ASR fail over you can create a Recovery Plan which contains multiple VMs or test the fail over for a single VM. However it is important that you create recovery plans for your production environment in order to specify how the fail over should happen. Let’s take each scenario and see how we can address them.
- If you need to RDP into the VM after it fails over to Azure, you must enable Remote Desktop Connection in the on-premises VM before you do a test or actual fail over. This is a MUST. The same applies to SSH access for Linux VMs.
- After the test fail over you’ll use a public IP address to remote into the Azure VM, so make sure your firewall doesn’t restrict that address. If you have configured network mapping, the VM gets a private IP from your on-premises network, so you can still RDP to it over the LAN. I’d use this only after I have created a production recovery plan, and use the same for the actual fail over.
Test Fail over from on-premise to Azure
In a test fail over you simulate the actual fail over sequence before you move it into production. You’ll need to either choose an existing isolated VM network or select NONE, in which case the test fail over is done in an isolated environment without a VM network attached. Note that once you complete the test, Microsoft Azure removes all the items that were used for the simulation, such as the test VM.
- Select the VM network that you want to use for the ASR test. This should be a separate VM network, or you can select NONE.
- You’ll need to create an endpoint to allow RDP to the test VM. It’s pretty straightforward.
- Once the RDP endpoint has been created, you can remote into the test VM and check that everything works.
- Remember that you still need to complete the test in order to clear the test deployment.
- It will prompt you to comment on the test scenario. Select the check box to clean up the environment.
- Once the environment is cleaned up and the test is done, the status of the test fail over shows as complete.
Creating Recovery Plan
For a production deployment you’ll have to create a recovery plan. It runs a sequence of actions that you define to perform the fail over. You can customize the plan to include VM groups that specify start sequences, scripts to run once the fail over sequence starts, and so on. Refer to this article from TechNet for more information on customizing the recovery plan.
- Select Source and Target, followed by the VMs that need to be included in this recovery plan.
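The start sequence that VM groups give you can be pictured as a simple ordering. The Python sketch below is only a model of the idea, not how ASR stores a recovery plan: the group numbers, VM names and the `start_sequence` helper are hypothetical.

```python
from typing import Dict, List


def start_sequence(groups: Dict[int, List[str]]) -> List[str]:
    """Flatten numbered VM groups into a boot order, lowest group first."""
    order: List[str] = []
    for number in sorted(groups):
        order.extend(groups[number])
    return order


# Hypothetical 3-tier plan: database first, then app, then web servers.
plan = {2: ["app01"], 1: ["sql01"], 3: ["web01", "web02"]}
print(start_sequence(plan))  # ['sql01', 'app01', 'web01', 'web02']
```

Grouping by tier like this is what lets the plan bring up dependencies, such as a database, before the servers that rely on them.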
Performing a production fail over
A planned fail over is always planned in advance. For example, you may want to fail over your corporate web site at the end of each month when you update it. In such situations you can use ASR to plan it beforehand: which VMs start first, any cleanup tasks that should run, and so on. Additionally, you can configure an Orchestrator workflow to automate this if you are using System Center.
An unplanned fail over is what you fall back on in a disaster situation. It is something that you do not expect, but you can still create a DR recovery plan for it.
Let’s see how to work with a planned fail over. Note that the steps are the same for both planned and unplanned.
- Select Planned Fail over from the menu. Here I have selected only the VM that I need to fail over. You can select a recovery plan that includes multiple VMs if you need to.
- Select the fail over direction. Since this is a fail over, it is from on-premises to Azure.
- Let the process continue. This may take some time, as it depends on the network speed and the size of the VM.
- Once the fail over is done, you’ll have to commit it. Once it is committed, your VM will be up and running from the cloud.
Fail back to original state
Once you have completed the fail over, you’ll need to move back to the on-premises VM. Let’s see how you can do that. Again, I have done this for a single VM rather than a recovery plan to save time.
- You may notice that the VM in the on-premises VMM is in a halted state once the fail over has been done.
- Select the VM from Recovery Services and select the matching fail over type. If you did a planned fail over before, select planned again. I have selected the second option to minimize the fail over time, but it’s up to you whether you want to go back to the pre-fail over state or not.
- You’ll notice that several actions are skipped in the fail back job, as we are syncing only the data that changed during the fail over window.
- If you selected the Synchronize data before fail over option, you’ll additionally have to click the Complete Fail Over button in the job window.
- To complete the fail over, click the Commit button.
We have successfully configured a fail over scenario from on-premises to Azure. In the last post of this series, let’s discuss the best practices, hiccups and how to clean up the ASR configuration in a demo VMM environment.
In this post we are going to see how to set up ASR for VMM clouds. For this scenario I have used one VMM cloud called ASR Cloud that contains one Windows Server 2012 R2 Standard VM. I also have a Linux VM in this cloud that is protected using ASR.
Create the Azure Site Recovery Vault
- Login to the Azure Portal
- Click NEW
- Click Recovery Services
- Select Site Recovery Vault
- Give it a proper name and select your region of preference. Make sure you use the same region across all your ASR components to avoid sync issues.
Creating a Storage account for ASR
Creating a storage account is pretty straightforward. Just make sure you select the same region as your ASR vault and Geo Redundant as the replication type.
Installing the Azure Site Recovery Provider in VMM Server
- Click the Site Recovery vault you just created and select “Between an on-premises site and Microsoft Azure” from the drop down list.
- Click Generate Registration Key and save the registration key.
- In the Dashboard, select and download the Microsoft Azure Site Recovery Provider for installation on VMM servers.
- Double click the setup file and install the Provider.
- Select the check box to automatically restart VMM service after the installation.
- Opt out of Windows Update if you don’t want automatic ASR Provider updates.
- If you use a proxy for Internet connectivity, specify its settings.
- Select the generated certificate so it will detect the vault automatically.
- Select the Synchronize cloud metadata check box to sync the metadata of your VMM clouds with ASR. This step is optional, so use it only if it doesn’t conflict with your organizational policy. It happens only once, and you can sync each cloud individually from the cloud properties in VMM.
- The Data Encryption option allows you to provide an SSL certificate that will be used to encrypt data between on-premises and Azure. Keep it safe, because if you supply one here you will need it to perform a fail over from on-premises to the cloud.
- The ASR Provider installation is now complete, and the wizard will tell you whether your VMM server has been successfully registered with the ASR service.
Installing the ASR Agent for Hyper-V Hosts
- Click the Site Recovery vault you just created and download the Agent for Hyper-V Hosts from the main page.
- Make sure pre-requisites are met before proceeding with the installation.
Configuring Cloud Protection
Now that you have successfully installed everything required, it’s time to configure the protection settings for the cloud that you want ASR to protect.
- Click Set up protection for VMM clouds on the quick start page
- Click the cloud that you need to protect from Protected Items tab and then click Configure.
- Select Microsoft Azure as the target.
- Select the storage account you have created for ASR VMs.
- Turn off Encrypt stored data – this controls whether data should be encrypted before it is replicated between the on-premises site and Azure.
- Leave the default value for Copy frequency – this defines the replication frequency between the two sites. Once set, it can only be changed from here, not from VMM.
- Leave the default value for Retain recovery point – zero means that only the latest recovery point for a VM is stored on a replica host server.
- Leave the default value for Frequency of application-consistent snapshots – this value defines how often snapshots are created for your VMs. If you supply a value here, make sure it is always less than the number of recovery points specified earlier.
- Enter a value for Replication start time for the initial replication of data to Azure.
Once you click Save, it starts the initial replication from your on-premises VMM server to Azure. Note that this will take some time depending on the size of the VHD. To view the status of the replication, click the Jobs tab and see the individual details for each job.
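The one rule in these settings that is easy to get wrong is the relationship between the snapshot frequency and the recovery points. As a small illustration, the Python sketch below checks it before you click Save. The function and parameter names are hypothetical, and it assumes a value of zero means application-consistent snapshots are not being configured.

```python
from typing import List


def check_protection_settings(recovery_points: int,
                              app_snapshot_frequency: int) -> List[str]:
    """Return a list of problems with the chosen protection settings."""
    problems = []
    if recovery_points < 0 or app_snapshot_frequency < 0:
        problems.append("values must not be negative")
    # A supplied snapshot frequency must be less than the recovery points.
    if app_snapshot_frequency > 0 and app_snapshot_frequency >= recovery_points:
        problems.append("application-consistent snapshot frequency must be "
                        "less than the number of recovery points")
    return problems


print(check_protection_settings(recovery_points=4, app_snapshot_frequency=2))  # []
print(check_protection_settings(recovery_points=2, app_snapshot_frequency=4))
```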
Configure Network Mapping
You can safely test the fail over in an isolated environment. However, for a planned or unplanned fail over to happen seamlessly, you should map an Azure VM network to your on-premises VMM VM network. You can follow the two guides below from Microsoft to set up network mapping.
- Prepare for network mapping
- Configure network mapping
Enable Protection on VMs
Now we can enable protection on VMs. Please note that some of the screenshots below refer to the Linux machine I have in my VMM cloud, but don’t worry, the steps are the same for Windows Server VMs.
- Select the VM from VMM and click properties
- Click Configure Hardware section
- Click Hyper-V Recovery Manager and click the Enable Hyper-V Recovery Manager protection for this VM check box.
- Select the desired Replication Frequency. This is defined in Protection settings of the Cloud in Azure portal.
- Click OK to update the configuration change on VM.
- Go to Azure portal and select the ASR Service, on the Virtual Machines tab in the cloud in which the virtual machine is located, click Enable protection and then select Add virtual machines.
- From the list of virtual machines in the cloud, select the one you want to protect.
Job progress can be viewed from Jobs section.
In the next post I’m going to demonstrate how to configure recovery plans, perform test, planned or unplanned failovers for your VMs.
Today we start a new blog post series on Azure Site Recovery. In this series we are going to implement a DR solution for Hyper-V VMs in a VMM cloud. The series is a collection of 4 posts where I’ll guide you through each step in the process. Note that this is just a proof-of-concept lab that I’ve set up with minimal resources.
In this setup we will be replicating one Linux VM from a Hyper-V cluster environment to Azure for DR purposes. This VM hosts a sample Hello World page on an Apache web server.
First Things First
This is the checklist that you want to have for this scenario.
- Azure Account – An active Azure subscription. You can also use a free trial.
- Storage Account – This should be Geo-Replicated and in the same region as the Site Recovery service.
- VMM Server – Should be System Center 2012 R2
- VMM Clouds – At least one VMM cloud with one or more VMM Host Groups, Hyper-V host servers or clusters in each host group and one or more Generation 1 VMs. Please see here for the compatibility matrix for VMs.
Let’s take a look at an overview of the tasks that need to be performed.
- Create an Azure Site Recovery Vault
- Install Azure Site Recovery Provider & Generate a registration key
- Configure Azure Storage Account
- Install ASR agent on Hyper-V Hosts
- Configure Cloud protection
- Configure network mapping – map source VM networks to target Azure Virtual networks
- Enable VM protection
- Test run – run a test fail over, or create a recovery plan and run a test fail over for it.
Let’s discuss how to create the ASR vault & install the ASR Provider in our next post.