Use PowerShell to back up your files to an Azure Storage Blob

I was browsing the Microsoft Technet forums last week and came across a question asking whether there’s a way to back up files and folders to an Azure Storage Blob by using PowerShell. I know that Microsoft introduced Azure Site Recovery (ASR) and Azure Backup, together with the Azure Backup Agent (MARS) (more information on the Microsoft site), to achieve exactly this functionality.

But thinking about it further, I realized this could be a nice opportunity to create such a script and learn more about writing to Azure Storage using PowerShell. So that’s exactly what I did: I created a script which backs up your files to Azure Blob Storage. The script checks either the last write time of each file or the MD5 hash of its content (depending on the parameters passed), and copies only those files to Azure that are newer or have a different MD5 hash. In this article I’ll describe how the script works and what the challenges were when creating it.

The PowerShell script I created is available on the Microsoft Technet Gallery: https://gallery.technet.microsoft.com/Back-up-files-to-Azure-b9e863d0

Storage Account

Before using the script, you should create a storage account in Microsoft Azure. So open the portal, and add a storage account with the following properties:

  • Deployment model: Resource Manager
  • Account kind: Blob storage
  • Performance: Standard
  • Replication: LRS
  • Access tier: Cool

The other properties don’t really matter: create it in the region you want and in any Resource Group, and give it a name that is available. With the settings above you create a cost-optimized storage account (the cool access tier and LRS replication are cheaper than e.g. the hot tier and GRS replication). If you’d like the data replicated across data centers within a region, select ZRS; if you want it replicated to a secondary region, select GRS.
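If you prefer scripting the setup as well, the same storage account could be created with the AzureRM module along these lines (a sketch; the resource group, account name and location are placeholder values):

# Create a cost-optimized Blob storage account from PowerShell
# (AzureRM module; resource group, name and location are placeholders)
New-AzureRmStorageAccount -ResourceGroupName "backup-rg" `
    -Name "mybackupstorage" `
    -Location "West Europe" `
    -SkuName "Standard_LRS" `
    -Kind "BlobStorage" `
    -AccessTier "Cool"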

Once created, you’ll need to update the script and enter the correct storage account name and storage account access key. The access key can be found in the “Access keys” blade of the storage account; you can use either the primary or the secondary key.
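In the script these two values are used to build a storage context, which all subsequent storage cmdlets authenticate with. A minimal sketch, with placeholder values:

# Build the storage context from the account name and access key
# (placeholder values; the Azure.Storage module provides this cmdlet)
$StorageAccountName = "mybackupstorage"
$StorageAccountKey  = "<storage account access key>"
$context = New-AzureStorageContext -StorageAccountName $StorageAccountName -StorageAccountKey $StorageAccountKey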

The script

Checking if a file has changed

Now for the script itself. First of all, I needed a way to determine whether a local file has changed. The original modification time of a file is lost when it is copied to Azure: the only timestamp you’ll find on the Blob is the time the Blob itself was last modified in Azure. So if you upload a file that was changed two days ago, the Blob will show the date and time at which it was uploaded or overwritten. This means I need to store the modification date of the local file in Azure myself. Luckily, Azure allows you to store metadata on a per-blob basis, which lets me keep the local file’s modification time with the Blob. How to read and write metadata on Azure Blobs is explained in the next section.

The other option for checking file changes is to use the MD5 hash of the file. When you upload a file to Azure, the MD5 hash of the file content is stored automatically. You can see the Content-MD5 property when viewing the Blob properties in the Azure portal:

[Screenshot: the Content-MD5 property of a Blob in the Azure portal]

To compare hashes, the MD5 of the local file needs to be calculated first. This can be done using a few lines of code:

Function Get-MD5Hash {
    Param (
        [Parameter(Mandatory=$true)][String]$Path
    )

    If (Test-Path -Path $Path) {
        try {
            # Create the hasher and read the file as raw bytes.
            # -ReadCount 0 returns a single byte array instead of a slow
            # byte-by-byte pipeline; on PowerShell 7+ use -AsByteStream
            # instead of -Encoding Byte.
            $crypto = [System.Security.Cryptography.MD5]::Create()
            $content = Get-Content -Path $Path -Encoding Byte -ReadCount 0
            $hash = [System.Convert]::ToBase64String($crypto.ComputeHash([byte[]]$content))
        } catch {
            $hash = $null
        }
    } Else {
        # File doesn't exist, can't calculate a hash
        $hash = $null
    }

    # Return the Base64 encoded MD5 hash
    return $hash
}

This function returns the Base64-encoded MD5 hash of the file, which can be compared to the MD5 hash stored in Azure. If those two values differ, you’ll know the file has been changed (either locally or on Azure). How to get the Content-MD5 on the Azure side is described in the next section.
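For example, calling the function against a local file (the path is just an example):

# Compute the Base64-encoded MD5 hash of a local file
$localMD5 = Get-MD5Hash -Path "C:\Data\Documents\My Important Document.docx"
Write-Host "Local MD5: $localMD5"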

Blob MetaData and Properties

Retrieving Blob metadata and properties is not possible with a “simple” PowerShell cmdlet. What you need to do is retrieve the Blob object using the “Get-AzureStorageBlob” cmdlet and create a “CloudBlockBlob” object from it. This can be done using these commands (the blob name, container name and storage context variables are defined elsewhere in the script):

$azblob = Get-AzureStorageBlob -Blob $blobname -Container $Container -Context $context
$cloudblob = [Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob]$azblob.ICloudBlob

Once the object is created, you can use the $cloudblob variable to access both the metadata and the properties. The metadata is accessible through the Metadata property of the variable, while the MD5 of the content is available in the Properties.ContentMD5 property (which will only be populated after you run the FetchAttributes() method):

$cloudblob.Metadata | Format-Table

# Populate the Blob's properties, including ContentMD5
$cloudblob.FetchAttributes()
$cloudblob.Properties | Format-Table
Write-Host $cloudblob.Properties.ContentMD5

These properties can then be compared to the values of the local files. If the values differ, the file can be overwritten on Azure.
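Putting the pieces together, a comparison could look roughly like this (a sketch assuming $cloudblob from above with its attributes fetched, a local $file coming from Get-ChildItem, and the "lastwritetime" metadata key the script writes on upload):

# Read the stored last write time and the Content-MD5 from the blob
$remoteTicks = [long]$cloudblob.Metadata["lastwritetime"]
$remoteMD5   = $cloudblob.Properties.ContentMD5

# Calculate the same values for the local file
$localTicks = $file.LastWriteTimeUtc.Ticks
$localMD5   = Get-MD5Hash -Path $file.FullName

If (($localTicks -gt $remoteTicks) -or ($localMD5 -ne $remoteMD5)) {
    # The local file is newer or its content differs, so upload it again
    Write-Host "File `"$($file.FullName)`" has changed, uploading"
}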

Containers

Creating a container and checking its availability can be done using the “Get-AzureStorageContainer” cmdlet. The downside of this cmdlet is that its error handling is not very good. If you want nice, user-friendly error messages (like I always want in my scripts!), you’ll need to set the “-ErrorAction SilentlyContinue” parameter and wrap the call in a “try {} catch {}” block. In PowerShell you can check whether your last command returned an error via the built-in “$?” variable: if it is false, the command returned an error. You can then read $Error[0] (a built-in variable as well) to get the exact error message.

The “Get-AzureStorageContainer” cmdlet can throw an error for various reasons:

  • The container name is not valid (a container name must be all lowercase and can only contain letters, numbers and dashes)
  • There is no internet connection
  • The storage account name is not correct
  • The storage account access key is not correct

This code executes the cmdlet and checks which error was generated:

try {
    $azcontainer = Get-AzureStorageContainer -Name $Container -Context $context -ErrorAction SilentlyContinue
} catch {}

If ($? -eq $false) {
    # Something went wrong, check the last error message
    If ($Error[0] -like "*Can not find the container*") {
        # Container doesn't exist, create a new one
        Write-Host "Container `"$Container`" does not exist, trying to create container" -ForegroundColor Yellow
        $azcontainer = New-AzureStorageContainer -Name $Container -Context $context -ErrorAction SilentlyContinue

        If ($null -eq $azcontainer) {
            # Couldn't create container
            Write-Host "ERROR: could not create container `"$Container`"" -ForegroundColor Red
            return
        } Else {
            # OK, container created
            Write-Host "Container `"$Container`" successfully created" -ForegroundColor Yellow
        }
    } ElseIf ($Error[0] -like "*Container name * is invalid*") {
        # Container name is invalid
        Write-Host "ERROR: container name `"$Container`" is invalid" -ForegroundColor Red
        return
    } ElseIf ($Error[0] -like "*(403) Forbidden*") {
        # Storage account key incorrect
        Write-Host "ERROR: could not connect to Azure storage, please check the Azure Storage Account key" -ForegroundColor Red
        return
    } ElseIf ($Error[0] -like "*(503) Server Unavailable*") {
        # Storage account name incorrect
        Write-Host "ERROR: could not connect to Azure storage, please check the Azure Storage Account name" -ForegroundColor Red
        return
    } ElseIf ($Error[0] -like "*Please connect to internet*") {
        # No internet connection
        Write-Host "ERROR: no internet connection found, please connect to the internet" -ForegroundColor Red
        return
    }
}

This way, the script outputs a user-friendly error message for each failure case.

Uploading the blob

Before uploading the blob, the full path of the local file needs to be converted to a name that is allowed by Azure Blob storage. The only real “folder” Blob storage supports is the container; everything after the container is considered part of the blob name. If you want to emulate a folder structure in Azure Blob storage, you can use forward slashes (/) in the blob name. A file “C:\Data\Documents\My Important Document.docx” could, for example, be converted to “C/Data/Documents/My Important Document.docx”. The nice thing about the Azure Portal is that if you store a Blob like this, the portal will simulate the folder structure itself (allowing you to traverse through the different folders).
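A minimal sketch of such a conversion (the exact convention in the published script may differ slightly):

# Turn a local path into a blob name: drop the colon and
# replace backslashes with forward slashes
$localPath = "C:\Data\Documents\My Important Document.docx"
$blobname  = $localPath.Replace(":", "").Replace("\", "/")
# $blobname is now "C/Data/Documents/My Important Document.docx"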

Once the blob name is generated, the file can be uploaded using the “Set-AzureStorageBlobContent” cmdlet. This cmdlet allows you to pass a hashtable to the -Metadata parameter, which will then be set as metadata on the Azure Blob. Because it’s possible the Blob already exists, you can add the “-Force” parameter to overwrite the existing Blob:

$output = Set-AzureStorageBlobContent -File $file.FullName -Blob $blobname -Container $Container -Context $context -Metadata @{"lastwritetime" = $file.LastWriteTimeUTC.Ticks} -Force

Note the -Metadata parameter: it takes a hashtable as its value, which can be defined on a single line. If you want to add multiple metadata properties, separate the key-value pairs with semicolons (;):

Set-AzureStorageBlobContent -File $file.FullName -Blob $blobname -Container $Container -Context $context -Metadata @{"property1" = "value1"; "property2" = "value2"; "property3" = "value3"}

MARS Agent

As stated before, Azure already offers services to back up your files and folders into Azure. One of these is Azure Backup, which uses a locally installed agent called the Microsoft Azure Recovery Services (MARS) agent. So if you’re looking for a lightweight way to back up files to Azure, you can leverage my script. However, if you’re looking for a more robust and flexible way to back up data (including retention and the like), take a look at the “Back up Windows Server files and folders” article on the Microsoft site.

I hope this article was useful for you. If you have any questions, please don’t hesitate to leave a comment or contact me over email.

3 thoughts to “Use PowerShell to back up your files to an Azure Storage Blob”

  1. Hi,
    I really liked this script; I came across it while looking to do something slightly different, but it is still really useful.

    I have one question: Why did you choose to write your own function to get the MD5 hash instead of using the Get-FileHash cmdlet?

    Cheerio,
    Lars

    1. Hi Lars,

      Thanks for your comment; I built my own function because it’s simply faster. Get-FileHash will load the entire file into memory, which could take a while with large files.

      1. Ahhh, that makes sense; now I’m curious and think I will have to test it against Get-FileHash. I frequently use that functionality.
        Thanks again.
