When I received the first pre-order email for an Amazon Echo in late 2014, I was excited by this new technology, and I pre-ordered right away. The Echo became the first of many devices that tied into the Alexa Voice Service (AVS). This voice service encapsulates both speech recognition and Natural Language Understanding (NLU) features. These features are separate products in the Cortana Intelligence Suite. Combining them into one service simplifies development, though it does limit flexibility in some solutions.
Since this first version, Amazon has released more devices that tie into the service, and has opened the service up to third-party hardware manufacturers. You can even build your own device that ties into AVS with a Raspberry Pi, a microphone, and a speaker. One of the areas where I'm most interested in seeing Alexa grow is in my car. The idea that I could interact with this intelligent assistant during long car rides and work through to-do items, plan my next blog series, or work out scheduling conflicts without taking my eyes off the road is very appealing.
When my Echo arrived in early January 2015, I ran through all the demos and even started using the reminder feature fairly regularly. The first problem I had to address: I already had two other reminder lists going. I had a list of reminders on my phone, and I had a list in Outlook. If you know me at all, you know I have to write something down if you expect me to get it done. Expecting me to check multiple reminder lists was not realistic. I was able to develop a workaround using ifttt.com.
I created a recipe that would take my Alexa reminders and send them to my phone. I also used iCloud on my Windows machine to keep my Outlook reminders in sync with my phone. My precious. One reminder list to rule them all!
Unfortunately, the novelty wore off after a few months, and my Echo became little more than a Bluetooth speaker I could turn on and off with my voice. Then Amazon released the Alexa Skills Kit (ASK), a developer program allowing anyone with programming skills to build skills for the Alexa service. I was excited once more…until I realized that most of the development had to be done in Node.js. After trying to wrap my head around JavaScript once more, I tapped out.
During this early attempt to develop a new skill, I found there were four categories of skills you could develop.
Smart Home Skills
You can build skills to interact with smart devices. One of the first smart devices I added to my home was the Nest Thermostat. Nest Labs developed a skill that allows me to set my temperature or run the fan in my HVAC system without reaching for my phone or web browser, or, heaven forbid, standing up and walking to the thermostat on the wall. We'll talk more about infrastructure later, but know this about smart home skills: you have to have a cloud-accessible endpoint that both the Alexa Voice Service can call and your smart device can reach in order for your smart home device to respond to your Alexa request.
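To make that flow concrete, here is a minimal sketch of what such a cloud endpoint does: it receives a directive from the voice service and routes it to a device command. The directive shape and names below are simplified illustrations, not the actual Smart Home Skill API payloads.

```javascript
// Hypothetical smart-home endpoint handler. The directive shape here
// is a simplified stand-in for the real Smart Home Skill API payloads.
function handleSmartHomeDirective(directive, thermostat) {
  // Route the voice-service directive to the matching device command.
  switch (directive.name) {
    case "SetTargetTemperature":
      thermostat.targetTemperature = directive.payload.temperature;
      return { status: "SUCCESS", temperature: thermostat.targetTemperature };
    case "TurnFanOn":
      thermostat.fanRunning = true;
      return { status: "SUCCESS" };
    default:
      return { status: "UNSUPPORTED_OPERATION" };
  }
}

// Example: "Alexa, set the temperature to 72."
const thermostat = { targetTemperature: 68, fanRunning: false };
const result = handleSmartHomeDirective(
  { name: "SetTargetTemperature", payload: { temperature: 72 } },
  thermostat
);
```

The key point is that both sides meet at this endpoint: Alexa sends the directive in, and the device picks up (or is pushed) the resulting state change.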
Flash Briefing
This skill is little more than an RSS reader. Once users enable the skill, they can request a news briefing, and the service finds the latest item in the RSS feed. If the feed content is text, Alexa will read that content aloud (up to 8,000 characters). If the item is an MP3, Alexa can play that file for your users.
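The logic a flash briefing provider has to supply boils down to: surface the newest feed item, and keep text within the 8,000-character limit. The field names below (`publishedAt`, `audioUrl`, `text`) are illustrative placeholders, not Amazon's actual feed schema.

```javascript
// Sketch of flash-briefing feed logic: newest item wins, text is
// capped at 8,000 characters, audio items are streamed instead of read.
// Field names are made up for illustration.
const MAX_TEXT_LENGTH = 8000;

function latestBriefingItem(feedItems) {
  // Pick the most recently published item from the feed.
  const newest = feedItems.reduce((a, b) =>
    new Date(a.publishedAt) > new Date(b.publishedAt) ? a : b
  );
  if (newest.audioUrl) {
    // Audio items are played back rather than read aloud.
    return { type: "audio", url: newest.audioUrl };
  }
  return { type: "text", text: newest.text.slice(0, MAX_TEXT_LENGTH) };
}

const items = [
  { publishedAt: "2017-09-01", text: "Older headline." },
  { publishedAt: "2017-09-02", text: "Latest headline." },
];
const briefing = latestBriefingItem(items);
```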
Video Skill
While this type of skill wasn't available until April 2017, it's worth mentioning here. This skill is little more than the video equivalent of a flash briefing. The Alexa service can be made aware of the video services your users subscribe to. With this information, users can request video content by name, and the Alexa service finds the content and directs the stream to a given video-capable device such as a Fire Tablet, Fire TV, Echo Show, or Echo Spot. Siri has very similar features on the Apple TV as of the fall 2017 update.
Custom Skill
In my opinion, custom skills hold the most promise for adding real intelligence and productivity via a Voice User Interface (VUI). Custom skills constrain your solution in only one way: users must start the interaction with a request. The Alexa Skills Kit then passes that request off to a web endpoint the developer chooses in order to craft a response. There's very little limiting what you could do with this endpoint.
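The request-in, response-out flow can be sketched like this. The shapes below are deliberately simplified stand-ins for the real Alexa Skills Kit JSON (which carries more fields), and `HelloIntent` is a hypothetical intent name; the point is just that your endpoint receives a typed request and returns speech.

```javascript
// Minimal sketch of a custom-skill endpoint. Request/response shapes
// are simplified; the real ASK JSON envelope has more fields.
function speak(text) {
  // Wrap plain text in a speech-response envelope.
  return { outputSpeech: { type: "PlainText", text } };
}

function handleSkillRequest(request) {
  if (request.type === "LaunchRequest") {
    // User opened the skill without asking for anything specific.
    return speak("Welcome! What would you like to do?");
  }
  if (request.type === "IntentRequest" && request.intent === "HelloIntent") {
    return speak("Hello from my custom skill.");
  }
  return speak("Sorry, I didn't understand that.");
}

const response = handleSkillRequest({ type: "IntentRequest", intent: "HelloIntent" });
```

Because the endpoint is just code you control, everything between receiving the request and returning the speech is up to you: call a database, hit another API, whatever the skill needs.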
That's all I really got from my attempts to develop a skill until August, when I had the good fortune to attend Dave Mattingly's session "Alexa, Talk to Me" at SQL Saturday Indy. Dave re-ignited my desire to build additional skills for the Alexa service. He shared the new educational materials Amazon had developed, which helped me learn enough Node.js to publish my first skill, Invader Zim Quotes. It was a very basic skill: you ask for a quote from the Invader Zim show, and Alexa speaks a quote back to you.
After publishing this first skill, I found the most important piece of this puzzle: a blog entry from Tim Heuer, "Write your Amazon Alexa Skill using C# on AWS Lambda services". Now I could use C# to develop skills. I rewrote my Invader Zim skill in C# and pushed out the second version in less than two hours. After that, I added some sophistication to the skill. You can now ask Invader Zim what a specific character would say, and Alexa responds with a quote that character spoke in an episode.
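The character-filtering logic described above is simple enough to sketch. This is not the skill's actual code or data (the quotes below are placeholders, not real lines from the show): just an illustration of filtering a quote list by character and picking one at random.

```javascript
// Illustrative sketch of character-filtered quote selection.
// Quote text is placeholder data, not lines from the show.
const quotes = [
  { character: "Zim", text: "Placeholder Zim quote." },
  { character: "GIR", text: "Placeholder GIR quote." },
  { character: "Zim", text: "Another placeholder Zim quote." },
];

function quoteFor(character) {
  // Keep only the requested character's quotes, then pick one at random.
  const matching = quotes.filter((q) => q.character === character);
  if (matching.length === 0) return null;
  return matching[Math.floor(Math.random() * matching.length)].text;
}

const girQuote = quoteFor("GIR");
```

In the skill itself, the character name would arrive as a slot value in the intent request, and the selected text would go back out as the speech response.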
Since that first skill, I’ve prototyped several more skills, as well as pushed a few to Amazon for release. In my next blog entry, I’ll cover a few more key terms you’ll need to understand before you get started working with Alexa. In the meantime, if you have questions, please send them in. I’m here to help!