Backup Azure IaaS VMs with Azure Backup

We have an exciting update this week with Azure Backup. You can now back up your Azure VMs directly to Azure Backup vaults. This is something that customers have been requesting for some time. Let’s take a look at the considerations to take into account when using this new feature.

  • Backup with no impact to production workloads
  • You do not need to shut down the VMs
  • Provides application-level consistency for Windows operating systems
  • Provides file-system-level consistency for Linux operating systems

Backup Procedure

  • Create a backup vault in the same region as your VMs. Currently this feature works only within a single region, but I expect it to become geo-enabled, as keeping the backup in the same data center seems a little odd.

Azure VM Backup 1

  • Discover the VMs that you need to back up first. To do so, expand the backup vault, go to Registered Items and click Discover.

Azure VM Backup 2

Azure VM Backup 3

  • The next step is to register your VMs in the backup vault. Click the Register button shown in the picture above. Keep in mind that the VM must be running for the registration to complete successfully.

Azure VM Backup 4

  • Once registration is done, click Protect to start protection. Here you select the VMs that you need to back up and create a backup policy for them. You can select a backup frequency as well as a retention range that suits your backup requirement.

Azure VM Backup 5

Azure VM Backup 6

  • Remember that you can add only one backup policy per VM. Also, the maximum retention period is 30 days, and you can only choose from predefined backup time slots at 30-minute intervals.
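
The limits above can be turned into a quick sanity check before you build a policy in the portal. The following is only a sketch: BackupPolicy, policy_problems and assign_policy are hypothetical names and not part of any Azure API, and the minimum retention of 1 day is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import time

MAX_RETENTION_DAYS = 30  # maximum retention period mentioned above
SLOT_MINUTES = 30        # backup time slots are spaced 30 minutes apart


@dataclass(frozen=True)
class BackupPolicy:
    """Hypothetical stand-in for a backup policy; not an Azure type."""
    name: str
    start: time
    retention_days: int


def policy_problems(policy: BackupPolicy) -> list[str]:
    """Return the reasons the policy breaks the limits (empty if it is fine)."""
    problems = []
    # Assumption: retention must be at least 1 day.
    if not 1 <= policy.retention_days <= MAX_RETENTION_DAYS:
        problems.append(f"retention must be 1-{MAX_RETENTION_DAYS} days")
    on_slot = (policy.start.minute % SLOT_MINUTES == 0
               and policy.start.second == 0
               and policy.start.microsecond == 0)
    if not on_slot:
        problems.append(f"start time must fall on a {SLOT_MINUTES}-minute slot")
    return problems


def assign_policy(assignments: dict[str, str], vm: str, policy: BackupPolicy) -> None:
    """Record vm -> policy name, enforcing one policy per VM."""
    if vm in assignments:
        raise ValueError(f"{vm} already has policy {assignments[vm]!r}")
    problems = policy_problems(policy)
    if problems:
        raise ValueError("; ".join(problems))
    assignments[vm] = policy.name
```

For example, a policy starting at 22:30 with 30 days of retention passes, while one starting at 22:15 with 45 days of retention reports two problems.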

Performing a Backup

If you want to perform an ad hoc backup outside the backup policy, select Backup Now in the Protected Items tab of the backup vault. You can also stop protecting the VM by clicking the Stop Protection icon.

Azure VM Backup 7

Restore from a backup

  • Go to the Protected Items tab and click Restore. This opens the Restore an Item wizard.

Azure VM Backup 10

  • On the Select a recovery point page, select a restore point from the list of available restore points.

Azure VM Backup 11

  • On the Select restore instance page, specify where you want to restore the VM. This is an alternate location with a new VM name; it can be in a different cloud service and a different virtual network. It’s up to you to select those parameters, but you might need a new cloud service and a new network if you want to test the backup in isolation first.

Azure VM Backup 12

Monitor Backup Progress

You can monitor the backup progress on the Jobs page. This is important, as you may need to know whether a backup operation or a server registration has failed.

Azure VM Backup 8

If I drill down into my existing ad hoc backup, I can see the task sequence there.

Azure VM Backup 9

Since the word PREVIEW still appears on some pages of this service, I wouldn’t use it in production yet, but it’s still worth a try.

 

Trick or Treat | Protecting large VHDs with Azure Site Recovery

One of the most annoying problems with ASR is that it cannot protect VMs with VHDs larger than 1 TB. The OS drive limitation has already been addressed by Microsoft (refer to my previous blog post), but the 1 TB cap still applies to data disks. This is a limitation in Azure itself as of now. Let’s look at a workaround to overcome this barrier.

Solution

This solution involves creating a new striped volume made up of multiple smaller VHDs of less than 1 TB each. We copy the data from the old VHD to the new striped volume and then remove the old VHD.

Prerequisites

    • The VM should be shut down.
    • Add enough 1 TB VHDs to the VM to accommodate the contents of the VHD that is larger than 1 TB. Keep in mind that these VHDs are dynamic, not fixed.
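
To work out how many VHDs to add, divide the size of the large disk by the capacity of each new disk and round up. A minimal sketch, assuming "1 TB" means 1024 GB of usable space per disk (the real usable capacity after formatting will be slightly lower); disks_needed is a hypothetical helper name:

```python
import math


def disks_needed(data_gb: float, disk_gb: float = 1024.0) -> int:
    """Number of equally sized disks needed to hold data_gb in a striped volume.

    disk_gb defaults to 1024 GB, an assumed usable size for a "1 TB" VHD.
    """
    if data_gb <= 0 or disk_gb <= 0:
        raise ValueError("sizes must be positive")
    # Round up: a partly used disk still has to be attached.
    return math.ceil(data_gb / disk_gb)
```

Under that assumption, a 2500 GB data disk would need three new VHDs. It may be worth adding some headroom rather than sizing the stripe to exactly fit the existing data.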

Procedure

  • Start the VM and stop any application services that are running, e.g. SQL.
  • Go to Computer Management > Disk Management. If prompted to initialize the new VHDs, click OK and proceed.
  • Right-click one of the new unallocated volumes, and then click Create striped volume.
  • Select all the new volumes that are displayed in the wizard to create a striped volume.
  • Assign a temporary drive letter (e.g. F:) to the new drive and format it as NTFS.
  • Now you can copy the data between the two drives. Use the robocopy command below to do so.
robocopy E:\ F:\ /mir
  • Change the drive letter of the new disk to the drive letter of the old disk. (Swap the drive letters)
  • In a PowerShell window (run as administrator), start diskpart.
  • At the DISKPART prompt, type SAN POLICY=OnlineAll.
  • Shut down the VM and remove the old VHD image from the VM.
  • Start the VM and the services that you stopped earlier, and then try to protect the VM with Site Recovery.
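
Before removing the old VHD, it may be worth confirming that the copy is complete. The sketch below is one possible check, not part of the original procedure: it compares relative file paths and sizes between two directory trees. The function names are hypothetical, and on real drive roots protected system folders may be skipped or need excluding.

```python
import os


def tree_listing(root):
    """Map each file path (relative to root) to its size in bytes."""
    listing = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            listing[os.path.relpath(full, root)] = os.path.getsize(full)
    return listing


def copy_differences(source, target):
    """Return (missing, extra, resized) lists of relative paths.

    missing: files in source but not in target
    extra:   files in target but not in source
    resized: files present in both with different sizes
    """
    src, dst = tree_listing(source), tree_listing(target)
    missing = sorted(set(src) - set(dst))
    extra = sorted(set(dst) - set(src))
    resized = sorted(p for p in src.keys() & dst.keys() if src[p] != dst[p])
    return missing, extra, resized
```

Calling copy_differences on the old and new drive roots and getting three empty lists back suggests the file sets match by name and size; it does not compare file contents or permissions.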

Azure VM OS disk limitation lifted

Whenever I engage in an Azure IaaS project, one of the first questions my customers ask is why Azure VMs can’t have an OS disk larger than 127 GB. That is a very difficult moment for me when I pitch Azure as a platform, so I kept asking myself why on earth they can’t lift the cap on the OS disk.

This week I bring some good news. You can now have OS disks of up to 1 TB in your Azure VMs. Probably nobody is going to need that much beef. The whole idea behind the 127 GB cap was to discourage people from using their C:\ drive to store production workloads, in other words to stop them keeping everything in one place. This rule of thumb still stands: when you want to store persistent data inside an Azure VM, always use a data disk, as the OS disk’s caching is optimized for OS performance.

How do I create a 1 TB OS disk?

The answer is: you don’t, because when you create a VM from the gallery there is no such option. This limit increase applies to VMs that you migrate from an on-premises environment and to custom VHDs (for templates) that you upload. If you pay close attention to the picture below, you can see that all my VMs created from the gallery/marketplace still have the 127 GB limit imposed. So basically this applies to your migration workloads.

127gbvhd

This limit increase only applies to your own VMs. VMs used for cloud service roles, such as web/worker roles, still have the 127 GB OS disk limit, as these are Microsoft-managed instances.

If you are planning your DR environment or a production hybrid cloud with Azure VMs, you no longer need to worry about the OS disk size, as this update has addressed it. (The limit is still 1 TB, though, so try not to exceed that on the OS drive.)

Installing Squared Up v2 for SCOM

What I hear most of the time from my customers is that Operations Manager is a great product, but its monitoring data is too complex for their IT administrators to analyze. A lot of Microsoft’s third-party partners have developed custom applications to address this problem. Squared Up is one such effort, and I must say it’s a really great add-on for SCOM.

Squared Up provides web-based HTML5 dashboards by leveraging the Operations Manager databases. For organizations that prefer good old Nagios- or PRTG-style monitoring, where everything is visual, Squared Up is ideal. But guess what: behind all those nice visualizations you can still drill all the way down and investigate the issues in your environment, just like in the SCOM console.

“Squared Up is what Operations Manager is badly missing: a modern, amazingly fast, multi-platform web console. It is a must-have for every Operations Manager deployment.” – Daniele Grandini, Microsoft MVP System Center Cloud and Data Center Management

Prerequisites for Squared Up web server

Operating System

  • Windows Server 2008 R2 SP1 or later
  • Windows Server 2012
  • Windows Server 2012 R2

.NET

  • .NET Framework 4.0 or later
  • IIS installed

Co-hosting

  • Can be installed on Operations Manager management server or root management server
  • Can be installed on the same web server as the Operations Manager web consoles (2007 R2 or 2012)

Installation Procedure

  1. Download the binaries from the Squared Up website. You can also request a fully functional 30-day trial.
  2. Launch the installer.

SquaredUp Installation 1

  3. Accept the license agreement.

SquaredUp Installation 2

  4. Review the installation options. If you need to change the default settings, you can run the installer from the command prompt. There are a number of support articles in the Squared Up support portal that you can refer to if you ever need that.

SquaredUp Installation 3

SquaredUp Installation 4

  5. On the next screen, provide your Root Management Server’s FQDN and click Next to finish the setup.

SquaredUp Installation 5

SquaredUp Installation 6

  6. Open a web browser and go to the Squared Up URL (see above). You will be prompted for the credentials that you use for SCOM. After entering them, provide the license key you got from the Squared Up team, and voilà, you are done.

SquaredUp Installation 7

SquaredUp Installation 8

If you want a live demo to convince your boss to purchase Squared Up, the Squared Up team provides a free one.

Health Service Issues in SCOM Management Servers | Part 2

In a previous post I explained the root causes related to alerts raised regarding SCOM health service in Management Servers or Agent Managed computers. Today let’s see how this affects a fresh installation of SCOM environment.

Issue

An out-of-the-box SCOM 2012 R2 deployment, without any agents or MPs deployed, raises critical alerts for the Management Server Health Service as shown below.

Health Service Critical (3)

The alert clears when we recalculate the health of the unhealthy management server, but it keeps coming back.

Root Cause

The default value of 10000 in the Health Service Handle Count Threshold monitor is too small for a management server. This is a known issue in SCOM 2012 R2 that is fixed in Update Rollup 3 (UR3), where the default value is set to 30000. But what if the environment is fresh or has a lower UR installed? Let’s see how we can work around that.

Fix

You can override the Health Service Handle Count Threshold monitor for the specific management server to a higher value.

Health Service Critical (1)

Health Service Critical (2)

Health Service Critical (4)

I tried several higher values, but the ideal value is 30000 if you are not on UR3. You can save this override anywhere you want, but I recommend creating a separate override MP for SCOM.

Health Service Critical (5)

You’ll have to do this for every management server that is listed as critical. In any case, it is always better to have the latest URs installed if you want a greener SCOM environment.

Microsoft Azure Virtual Machine Optimization Assessment Tool

Have you ever wished for a better way to call a doctor for your Azure IaaS setup? If you are not already aware of it, the Microsoft Azure Virtual Machine Optimization Assessment Tool is a great tool that takes care of what Azure PFEs would do for you. It focuses on 6 key areas of Azure VMs running AD, SQL or SharePoint.

  1. Security and compliance
  2. Availability and business continuity
  3. Performance and scalability
  4. Upgrade, migration and deployment
  5. Operations and monitoring
  6. Change and configuration management

The tool presents a short questionnaire, then runs an automated analysis of your target VMs and produces custom reports on the collected data with further recommendations.

What is required?

  • The machine running the tool must use one of the following operating systems: Windows Vista, Windows 7, Windows 8, Windows 8.1, Windows Server 2008, Windows Server 2008 R2, Windows Server 2012 or Windows Server 2012 R2.
  • Minimum hardware specification: 4 GB RAM, 2 GHz dual-core processor, 5 GB of free disk space.
  • The server or PC that hosts the tool must be joined to a domain in the AD forest that the target VMs belong to.

Additional software requirements

  • Microsoft .NET Framework 4.0
  • Windows PowerShell 2.0 or later

To download and check out this tool visit here.

Missing Alerts & Health Service Failures in SCOM

Everybody wants to see a greener SCOM console where nothing is in critical status. Sometimes even though your infrastructure is functioning properly you may still see critical health status and missing alerts in your SCOM. Let’s look at two different scenarios like this and try to troubleshoot each.

My SCOM Management Server or Monitored Servers are healthy. So why do I see them in Critical status now and then, even though there are no unhealthy child monitors?

The communication channel between a MOM agent and a management server (MS) is maintained by the Health Service, which tries to connect to the MS every 60 seconds. By default, if 3 consecutive heartbeats are missed, the Health Service monitor turns Critical, because the MS assumes there is a connection failure between the agent and the MS. That is the case with a single MS; if you have a secondary MS, the agent will try to connect to that one after the third attempt.

The culprit here is a corrupted cache in the Health Service Monitor. Sometimes, even though the Health Service is OK, a messy cache will still produce the red alert. This can happen if your MS or agents are generating a lot of alerts.
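
The heartbeat rule can be illustrated with a small simulation. This is only a sketch of the logic described above, not SCOM code: the constant and function names are made up for illustration, and each list entry stands for one 60-second interval.

```python
HEARTBEAT_INTERVAL_S = 60     # the agent tries to reach the MS every 60 seconds
MISSED_BEFORE_CRITICAL = 3    # default number of consecutive misses before Critical


def health_states(heartbeats):
    """Return the monitor state after each interval.

    heartbeats is a sequence of booleans, one per interval:
    True means the heartbeat arrived, False means it was missed.
    """
    missed = 0
    states = []
    for arrived in heartbeats:
        # Any successful heartbeat resets the consecutive-miss counter.
        missed = 0 if arrived else missed + 1
        states.append("Critical" if missed >= MISSED_BEFORE_CRITICAL else "Healthy")
    return states


def seconds_until_critical():
    """Rough time from the last good heartbeat until the monitor turns Critical."""
    return HEARTBEAT_INTERVAL_S * MISSED_BEFORE_CRITICAL
```

With these defaults, two missed heartbeats followed by a good one never raise the alert, while three misses in a row do, roughly 180 seconds after the last successful heartbeat.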

Clearing Health Service Cache

This is the final troubleshooting step you can take before you uninstall the MOM agent from a monitored server. The interesting thing is that this is kind of like Ghost Protocol (yes, the Tom Cruise movie): after you perform this task there is no task status for it, because once the cache is cleared no record of it remains. In simple terms, this is where you hit RESET on the agent, and it does the following.

  1. Stops the System Center Management service.
  2. Deletes the health service store files.
  3. Resets the state of the agent, including all rules, monitors, outgoing data, and cached management packs.
  4. Starts the System Center Management service.

Clearing the health service cache is pretty straightforward.

Clear Health State 1

For a management server, drill down to Operations Manager MP > Management Server > Management Server State and, under Tasks, select the Flush Health Service State and Cache task.

Clear Health State 2

Missing Alerts in SCOM Console

Sometimes you may encounter an error like “An object of type MonitoringAlert with id xxx was not found” when you click on an alert in the SCOM console. This is caused by a corrupted SCOM console cache: the alerts may already be resolved, but the console still shows them.

To clear the SCOM console cache, type the command below in a Run window and press Enter.

"C:\Program Files\System Center Operations Manager 2012 R2\Console\Microsoft.EnterpriseManagement.Monitoring.Console.exe" /clearcache

If you have an older version of SCOM, this path may be different. You can get the actual path by right-clicking the SCOM console shortcut and copying the path.

Clear Missing Alerts

This command deletes the momcache.mdb file in the user’s AppData folder, which contains the display preferences of each user of the SCOM console.

Protect your Private Cloud with 5Nine Cloud Security

When it comes to virtualization, a lot of people ask how they can secure their environment against security threats. Installing an AV solution inside individual VMs looks like the correct answer, but what happens in the case of a network-related security threat? Let’s explore the best answer to these issues in a Hyper-V context.

5nine Cloud Security is an agentless security solution for Hyper-V which uses the extensible Hyper-V switch capabilities. This solution is capable of providing VM isolation, compliance and antivirus features.

5Nine also offers firewall, AV and IDS functions out of the box. The most important thing is that it is an agentless solution: you do not install any agent inside the VMs to achieve these goals.

For hosters using Windows Azure Pack, 5Nine offers an Azure Pack extension that allows them to bring true IDS capabilities to their tenants. As the number of tenants increases, security becomes the number one concern of any hoster. In addition, if you only manage your own environment through SCVMM, the 5Nine Cloud Security SCVMM plugin lets you deploy all these features from SCVMM, making it easier to integrate both solutions.

All these features come at an attractive price of $199 per 2 CPUs per host. If you are interested, you can visit www.5nine.com for more information. Below is a short demonstration of what 5Nine Cloud Security can do to protect your Hyper-V hosts, private cloud or service provider cloud.

In a future post I’m going to discuss how to configure 5Nine Cloud Security to protect your Microsoft virtualization solution.

Error 31552 & Unhealthy SCOM 2012 R2 Management Server Issue

I noticed a strange issue in a recent SCOM deployment that I’ve been working on, and spent ages trying to figure out what was actually wrong. One of the management servers was in Critical health status, yet Health Explorer showed no unhealthy child monitors for that management server.

But when I investigated the event log, I noticed that the error below was being logged frequently.

Error 31552

After a little research I found out that the culprit was the Exchange 2010 Reporting MP, and the root cause was data aggregation for Exchange 2010 not working as intended. So here is what I did to rectify the issue.

Important

Make sure you back up both the OperationsManager and OperationsManagerDW databases before performing the tasks below. We are only executing a stored procedure against the OperationsManagerDW database, but it’s better to back up both databases in case they fall out of sync, as they are highly interdependent.

Delete the Microsoft.Exchange.2010.Reports MP

Deleting the MP resolves the issue only temporarily; for a permanent fix you need to execute the stored procedure described below. After deleting the MP, stop the SCOM services on all management servers before proceeding.

Execute the Stored Procedure against the OperationsManagerDW database

After completing step 1 (deleting the MP), leave everything for about 15 minutes to settle down, and then run the SP below.

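-- Run against the data warehouse database
USE OperationsManagerDW
DECLARE @DatasetId uniqueidentifier
-- Look up the ID of the Exchange 2010 dataset; adjust the name if yours differs
SELECT @DatasetId = DatasetId FROM Dataset WHERE DatasetDefaultName = 'Microsoft.Exchange.2010.Reports.Dataset.TenantMapping'
-- Delete that dataset from the data warehouse
EXEC StandardDatasetDelete
@DatasetId = @DatasetId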

Pay attention to the dataset name in the query (Microsoft.Exchange.2010.Reports.Dataset.TenantMapping). It may be different in your case, so make sure you type in the correct dataset name. You may restart the SCOM services now.

Import the MP Again (Optional)

This is up to you. If you want to re-import the MP, give it a couple of days to rest and then import it, or you can just leave it out.