You hear time and time again that 50 to 80 percent of the time on data science projects is spent on data wrangling, munging, and transformation of raw data into something usable. For me personally, I’ve automated a lot of those steps. I built tools over the last 20 years that help me do more in less time…
Author: slowder
NOAA Radar and Severe Weather Data Inventory
After I finished evaluating the Storm Events Database from NOAA, I was convinced we needed to look for machine-recorded events. When you start poking around the NOAA site looking for radar data, you’ll find a lot of information about how they record this data in binary block format. Within this data, you’ll find measurements…
Data Quality Issues
This entry picks up the story behind my first data science project: predicting hail damage to farms. In this article, we identify data quality issues in our first data source. Property and Crop Damage: In the NOAA documentation, these two columns were recorded to say how much property and crop damage occurred in a given…
Data Science Project 1: Predicting Hail Damages
Early on in my new role, I was asked to find out how risky it was to offer hail insurance for a given property. If you haven’t worked with insurance before, here are the basics. You’re placing a bet that something bad is going to happen. The insurer is betting that it won’t…
Becoming a Data Scientist
In my latest professional change I’ve moved into a role with a bright and shiny title: Data Scientist. I’ve heard all the hype around Data Science, Machine Learning and Artificial Intelligence, and I don’t buy a lot of it. I do believe these words will change how we do everything. I just don’t know that…
Data Analysis…can we automate this?
As some of you know, I’ve moved from consulting back into a full-time role at Crop Pro Insurance. There was so much opportunity in this role. First of all, this role gives me my first full-time data science credit. I also get to build a team to support data science projects. On top of that,…
Metadata Model Update
As I began learning Biml, I developed my original metadata model to help automate as much of my BI development as I could. This model still works today, but as I work with more file-based solutions in Azure Data Lakes, and some “Big Data” solutions, I’m discovering its limitations. Today I’d like to talk…
Data Warehouse Efficiency
How quickly do you get from the business coming to you with “we need a data warehouse” to delivering that warehouse? If you’ve sat through any Biml talk, you’ve undoubtedly heard stories of thousands of staging packages being generated per hour. You may have even heard tales of source systems being analyzed in hours, rather than…
Alexa Telemetry Data
When you build an Alexa skill, you’ve got two rich sources of information describing how users are interacting with your skill. The first is the data in the Alexa Skills Kit request object. The request object is the JSON representation of what you said to your Alexa device. Alexa Voice Service (AVS) takes your spoken words…
Building our Alexa Skill Function
In putting together the demo for this blog, I found there is an issue when saving data to the Alexa session data. I can’t reliably write to this collection in the request/response JSON. Without this ability, I’d have to write a session manager myself. Honestly, I don’t have time for that. I’ve reached out to…