When you get started with Azure Data Lake Analytics and U-SQL specifically, you may get a little confused. It looks like a mash-up of T-SQL and C#. Turns out, that’s exactly what it is! You can find lots of information on MSDN, or GitHub, or StackOverflow. Let’s get started with some basics. Variables, DataTypes, and Case…
Azure Data Lake Storage ACL Automation
In my last blog entry, we covered how to lay out folders in your Data Lake Storage account based on a logical design. That’s only half the battle. You also need to set up Access Control Lists (ACLs). Setting up controls via the Azure portal is easy, but not something you can automate. Today, we’ll jump…
Azure Data Lake Storage Zone Layout Automation
After laying out the structure for our zones, my client quickly asked whether there was a way to automatically stand up this structure each time we bring a new Tenant into our solution. With a smile, I replied it was possible! In the short term, we would use PowerShell to stand up these folders every…
Azure Data Lake, Step by Step
Over the next few blog posts in this series, I’m going to share with you the story of how a Data Lake project comes together. As I tell this story, I’m going to keep pointing back to traditional ETL work and to Automation techniques. Not all of these will include Biml. I hope to help…
Data Warehouse Automation is for Data Lakes too!
Last week you might have seen a tweet about a day in the life of a Data Platform Consultant. To say the least, my days are varied. This day, in particular, I was split between building out automated ETL tests using Biml and spinning up a new Azure Data Lake. Up until recently, I would have…
The File Staging Loop
As I mentioned in the overview, the largest cost in terms of time is requesting each web page from the web server and downloading that file to disk. That’s why this file staging loop is a separate step from the parsing and transforming step. This loop can be as simple as two steps: get…
Parsing and Extracting Web Data
In the last article, I laid out the architecture for dealing with this type of data source. This time, we’re going to get into the basics of parsing data. A Few New Technologies Before we dive into details, let’s cover the technologies this solution rests on. First, there’s C#. As a Microsoft data professional,…
Web-based data sources and ETL
Introduction There’s a ton of data on the Internet. Some of that data is really easy to download and extract into a relational database: data that comes as CSV, JSON, or maybe even a database backup. Unfortunately, not all of that is as easy to get at. What we do in the only…
Biml and Oracle Connections
Everyone who knows me knows I’m a Microsoft data platform professional. I do prefer their solutions over most other solutions. I know there have been a few times when I’ve shocked my coworkers by suggesting Couch, Redis or some other NoSQL solution as the answer to a particular problem. The one that you’ve never heard me…
SQL Azure and Azure Active Directory: Part Two
After getting AD Password Authentication working with my Azure SQL server, I moved on and set up a local Active Directory domain on my home network. I don’t think this is something every data professional should try and tackle. I count myself extremely fortunate to have made friends with IT professionals outside the data space, and…