Azure Data Lake Storage ACL Automation

In my last blog entry, we covered how to lay out folders in your Data Lake Storage account based on a logical design. That’s only half the battle. You also need to set up Access Control Lists (ACLs). Setting them up through the Azure portal is easy, but it’s not something you can automate. Today, we’ll jump back into PowerShell and set up access controls based on our logical design.

Before you can start using PowerShell to interact with ADLS, you’re going to need to install the Azure PowerShell modules. Once you have, you’ll also need the Data Lake Store module. Run the following command to install it:

Install-Module -Name AzureRM.DataLakeStore -Scope CurrentUser -force
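If this line lives in a script you run repeatedly, you may not want to reinstall the module every time. Here is a minimal sketch of a guard, assuming the module name above; it only installs when no copy is found on the machine.

```powershell
# Sketch: install AzureRM.DataLakeStore only if it is not already available.
$moduleName = "AzureRM.DataLakeStore"

if (-not (Get-Module -ListAvailable -Name $moduleName)) {
    Write-Host "Installing $moduleName for the current user"
    Install-Module -Name $moduleName -Scope CurrentUser -Force
}
else {
    Write-Host "$moduleName is already installed"
}
```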

You’re also going to need to authenticate to Azure.

Login-AzureRmAccount

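Then connect to your Data Lake Storage account. This call retrieves the account details, which confirms that the account exists and that you can see it. $adls is the name of my storage account, and $rg is the name of the resource group that holds it; set both before you run the command.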

Get-AzureRmDataLakeStoreAccount -Name $adls -ResourceGroupName $rg;

Next, let’s set up two variables. The first holds the email address of the user we’re granting permissions to. The second, $ACLs, holds the list of folders and permissions. It is an array of two-element arrays, which you can treat as a two-dimensional array. If you’re not familiar with those, think of it as a two-column table: the first column is the folder path and the second is the permission. Add a row for each folder you want to grant permissions on.

$useremail = "test@test.com";

$ACLs = @(
    ("/raw/test", "Write"),
    ("/development", "Read")
);
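One thing to watch for with nested arrays: if you ever cut the list down to a single row, PowerShell flattens it into two plain strings, and indexing with [0] and [1] no longer does what you expect. As an alternative sketch, here is the same table written with named columns. The variable name $aclTable and the property names Path and Permission are my own, chosen for illustration; nothing in ADLS requires them.

```powershell
# Alternative sketch: one object per row, with named columns instead of [0] and [1].
# $aclTable, Path and Permission are illustrative names, not part of any Azure API.
$aclTable = @(
    [pscustomobject]@{ Path = "/raw/test";    Permission = "Write" }
    [pscustomobject]@{ Path = "/development"; Permission = "Read"  }
)

# Each row is read by property name, which stays correct even with a single row.
foreach ($row in $aclTable) {
    Write-Host "Folder $($row.Path) gets $($row.Permission) access"
}
```

If you go this route, the loop that sets permissions would read $row.Path and $row.Permission in place of the positional indexes.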

Now that you have these, you can use a foreach loop to set your permissions.

# Look up the user's Azure AD object ID once instead of on every call.
$userId = (Get-AzureRmADUser -Mail $useremail).Id

foreach ($ACL in $ACLs) {
    Write-Host "Grant $useremail $($ACL[1]) access to $($ACL[0])";
    # Access ACL: applies to the folder itself.
    Set-AzureRmDataLakeStoreItemAclEntry -AccountName $adls -Path $ACL[0] -AceType User -Id $userId -Permissions $ACL[1]
    # Default ACL: inherited by new items created under the folder.
    Set-AzureRmDataLakeStoreItemAclEntry -AccountName $adls -Path $ACL[0] -AceType User -Id $userId -Permissions $ACL[1] -Default
}

For each folder, we set both the access ACL and the default ACL. Why set both? Well, when folders are created under each of the target folders, you want those permissions to cascade down from parent to child, right? That’s what the default ACL controls. If you skip the second Set-AzureRmDataLakeStoreItemAclEntry call, new folders will not inherit the permissions of the containing folder, and your users will be unable to access their files properly. Keep in mind that a default ACL only applies to items created after it is set; folders and files that already exist keep the ACLs they have.
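To check that the entries landed where you expected, you can read the ACLs back. This is a minimal sketch, assuming $adls and $ACLs are still defined as above and that you are still logged in; it simply lists every entry on each folder, so look for your user in both the access and default entries.

```powershell
# Sketch: list the ACL entries on each target folder after the grants have run.
# Assumes $adls and $ACLs are defined as earlier in this post.
foreach ($ACL in $ACLs) {
    Write-Host "ACL entries on $($ACL[0]):"
    # Out-Host forces the entries to print right under the heading for this folder.
    Get-AzureRmDataLakeStoreItemAclEntry -Account $adls -Path $ACL[0] | Out-Host
}
```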

Next Time

Now that we have our Data Lake Storage set up, we can start loading some files into those raw folders. Next time, I’ll share some simple U-SQL scripts to scrub and transform this data into something more useful. After that, I’ll share my secrets for automatically generating these scripts based on metadata!
