Last week you might have seen a tweet about a day in the life of a Data Platform Consultant. To say the least, my days are varied. On this day in particular, I was split between building out automated ETL tests using Biml and spinning up a new Azure Data Lake. Up until recently, I would have…Read More Data Warehouse Automation is for Data Lakes too!
As I mentioned in the overview, the largest cost in terms of time is requesting each web page from the web server and downloading that file to disk. That’s why this file staging loop is a separate step from the parsing and transforming step. This loop can be as simple as two steps: get…Read More The File Staging Loop
In the last article, I laid out the architecture for dealing with this type of data source. This time, we’re going to get into the basics of parsing data. A Few New Technologies Before we dive into details, let’s cover the technologies this solution rests on. First, there’s C#. As a Microsoft data professional,…Read More Parsing and Extracting Web Data