Alexa, MHUB + uControl = all your AV, voice-controlled

Amazon Alexa and HDANYWHERE MHUB 4K
Execute commands that actually mean something. HDANYWHERE is set to release its Amazon Alexa skills.

With the UK’s largest CI exhibition (EI Live!) just around the corner, I thought I’d take some time to blog about what’s been going on inside HDA: our progress with Amazon’s Alexa service, and our plans for its imminent rollout to all active MHUB PRO units globally in early June 2017.

The HDA brand mission is to change people’s relationship with home entertainment, improve the audiovisual experience and make centralised whole-home AV more mainstream. We believe a big factor in achieving this ambition will be the ability to use our voices to simply and easily control all our TVs and what we watch on them.

I’m going to start with some preamble: embrace voice because this is no fad.

Feel free to skip this and go straight to “why haven’t we turned on Alexa voice control yet?” if you want the shorter version!

Unlike 3D TV and curved screens, voice interaction is not going to go away. All the major technology companies have their own speech recognition and language engines, and they have been refining (and measuring the growth of) this technology via their mobile or consumer hardware. The trend is clear: year-on-year exponential growth in this new UI type.

Voice User Interfacing/Interaction (VUI) is going to be the next BIG change in the way we interact with computers, and as home automation/technology specialists you should be thinking about offering or adding this skill set to your armoury of services, or selecting manufacturers that natively support this method of control… nudge nudge, wink wink 😉

As long as the big players don’t do something stupid like this, we can expect to see more voice-controlled devices appearing in homes.

A great example of the kind of impact voice could have is the original launch of Apple’s iPhone in 2007. To me, the feature which stood out most was its radical approach to user interfacing. Apple ditched the keyboard, the numeric pad and navigation buttons, opting to use a variety of environmental sensors to detect touch, rotation, tilt, proximity and acceleration. With access to all this input data it became incredibly easy to create instantly intuitive interfaces which required almost no training; the beauty was that simply by using the device you were training yourself.

That was the genius of Apple’s design. People instantly got it, and it has subsequently become the de facto interface method for mobile devices… and with VUI, I believe, we’re at that point in time just before the release of the original iPhone.

From HDA’s perspective, it makes complete sense to R&D voice interfacing, and we started provisional work on it as far back as May/June 2016. In October 2016, at our X Event in Manchester, we announced that our new hardware would feature functionality that no other brand offered, and that we would unveil new ways to enable your matrix to do much more within a home than ever before. We introduced the world to our vision for smarter, neater AV with the release of MHUB PRO, and at ISE this year we showed off our work-in-progress Alexa skills, demonstrating switching between sources and controlling source device content to impressed audiences.

In the meantime, Amazon have been busy too, running national TV campaigns promoting home automation using their Echo and Dot devices.

… So why haven’t we ‘turned on’ Alexa voice control yet?

Well, I don’t think Alexa is ready for full home automation.

It’s clear to me that the Echo and Dot were not designed to serve the home automation market, but rather to act as a portal to Amazon services or other partners (like ordering a pizza or a taxi). Just as gesture alone wasn’t enough on iPhone, Amazon Alexa requires a number of additional tricks in order to serve a home properly. The following are not currently supported:

  1. Voice recognition
    no way of knowing who is uttering a command: mum, dad, the kids… or a stranger.
  2. Voice proximity
    neither Echo nor Dot can tell which device I am addressing from proximity or volume. If I shout “Alexa” and there is a Dot in one room and another in the room adjacent, they will both wake and await a response. The only way around this is to create dedicated ‘wake words’ for each room, but that’s dumb and unnatural IMO.
  3. Zone context
    there is currently no way to manage where in your home your Alexa device is situated, meaning you have to announce the room you’re in to control devices in that room (a way around this is sketched just after this list).
  4. Limited support for actual smarthome functions in the API
    this has meant that HDA have had to write custom skills to get the best results. The scope is pretty much limited to turning something on or off and changing your thermostat temperature; more on that later!
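
One of those gaps, zone context, is worth making concrete. Purely as an illustration (the request shape, device IDs and room names below are my assumptions, not HDA’s code or Amazon’s schema), a controller can recover the room by keeping its own map from the Echo or Dot that heard the command to the room it sits in:

```python
# Hypothetical zone-context lookup. A skill request identifies which
# Echo/Dot heard the command, so the controller can map that device to
# a room itself rather than forcing the user to announce the room.
# All IDs, names and the request shape are made up for this sketch.
DEVICE_TO_ROOM = {
    "echo-kitchen-001": "Kitchen",
    "dot-bedroom-002": "Master Bedroom",
}

def resolve_room(request: dict) -> str:
    """Infer the target room from the device that heard the command."""
    device_id = request.get("device_id", "")
    return DEVICE_TO_ROOM.get(device_id, "Living Room")  # fallback room

# "Alexa, tell my TV to pause" spoken near the kitchen Echo:
print(resolve_room({"device_id": "echo-kitchen-001"}))  # -> Kitchen
```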

We wanted to do better than Amazon’s limited and unnatural smarthome skills.

What we didn’t want to do was create a skill that would require you to say “turn on cooking” or “turn on my welcome”; understanding that has got us to where we are today. Take a look (or listen) at Control4’s idea of a voice-operated smarthome…

Most VUI implementations I have seen almost exclusively utilise the limited smarthome skills, meaning that you have to “turn on” your shades instead of opening them, or, my favourite, “turn on cooking”. It doesn’t sound natural.

Here’s our stab at voice controlling your TV with Alexa and HDANYWHERE

We have been pulling our hair out trying to make voice commands short, memorable and intelligent.

Notice how most Alexa implementations require you to declare the room you’re in? Ours doesn’t. And the work required to make that happen is the second reason we’ve not been so quick to release.

To me, personally, it is the difference between a voice command that people will actually use, and a command that gets used because it is new and novel, then ditched after the honeymoon period is over because it is unintuitive and only executes very simple commands.

A design mantra given to our devs was that if you could execute a command faster by picking up a remote or using uControl, then we should not build it.

We have focused on commands you are likely to want to say rather than use a remote for, and we have lost weeks of time researching how to optimise our voice commands so that they are concise and can execute complex actions (with the help of a uControl Sequence) that would take time to do using a remote control, or even multiple remotes!
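
To give a feel for what that means in practice, here is a minimal sketch in Python (the mhub_send() helper, step names and values are all assumptions for illustration, not HDA’s published API) of one short phrase fanning out into a uControl-Sequence-style macro:

```python
import time

# Hypothetical macro: one voice command expands into several MHUB
# actions, the kind of multi-step job a uControl Sequence performs.
WATCH_MOVIE = [
    ("power",         {"room": "Living Room", "state": "on"}),
    ("switch_source", {"room": "Living Room", "source": "Blu-ray"}),
    ("set_volume",    {"room": "Living Room", "level": 40}),
    ("transport",     {"room": "Living Room", "action": "play"}),
]

def mhub_send(action: str, params: dict) -> None:
    """Stand-in for a real MHUB control call (assumed for this sketch)."""
    print(f"MHUB <- {action}: {params}")

def run_sequence(steps, delay_s: float = 0.5) -> None:
    """Fire each step in order, pausing so slower devices can keep up."""
    for action, params in steps:
        mhub_send(action, params)
        time.sleep(delay_s)

run_sequence(WATCH_MOVIE)
```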

The basic syntax (voice structure) for a command is as follows:

  • “Alexa” wakes the Echo or Dot and makes it listen carefully to whatever is said next.
  • “tell my TV to” tells Alexa to target specific HDA functions.
  • then comes a pre-defined command (like increase volume, play, watch) which MHUB understands,
  • followed by a joining word to make a grammatical sentence (shown in round brackets () and optional),
  • finishing with a variable to make the command actionable in real time (shown in square brackets []; this can be anything from a [room name], to volume up [25] times, to a channel name, e.g. [“BBC”] = 101, [“ITV”] = 103).
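
To make that syntax concrete, here is a small sketch (Python, purely illustrative; the pattern list and function name are assumptions, with only the BBC/ITV channel numbers taken from the examples above) of how the part after “Alexa, tell my TV to” could be broken into a command and its variable:

```python
import re

# Channel examples from this post: ["BBC"] = 101, ["ITV"] = 103.
CHANNELS = {"bbc": 101, "itv": 103}

# Pre-defined commands; joining words like "by" stay optional, matching
# the "in brackets () and optional" rule described above.
COMMAND_PATTERNS = [
    (re.compile(r"^increase volume (?:by )?(\d+)"), "volume_up"),
    (re.compile(r"^decrease volume (?:by )?(\d+)"), "volume_down"),
    (re.compile(r"^switch to (.+)$"), "source"),
    (re.compile(r"^go to (.+)$"), "channel"),
]

def parse_command(utterance: str):
    """Split what follows 'tell my TV to' into an (action, variable) pair."""
    text = utterance.lower().strip()
    for pattern, action in COMMAND_PATTERNS:
        match = pattern.match(text)
        if match:
            value = match.group(1)
            if action == "channel":
                value = CHANNELS.get(value, value)  # name -> channel number
            return action, value
    return None

print(parse_command("increase volume by 25"))  # ('volume_up', '25')
print(parse_command("go to BBC"))              # ('channel', 101)
```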

We have one smarthome skill, and it will turn on your TV in any default configuration you want (any channel, any volume, any source, any room). Simply say:

Alexa, turn on TV/[or any word of your choosing]

Whilst watching TV you can say:

Alexa, tell my TV to:

-- Play
-- Pause
-- Record/Stop Recording
-- Increase volume/volume up (by) [any number value]
-- Decrease volume/drop volume (by) [any number value]
-- Switch to [your source devices]
-- Go to [your favourite channel name]
-- Watch [any word]
-- Resume in [any room name]
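
Behind the scenes, a custom skill has to map each of those phrases onto an intent, with slots standing in for the square-bracket variables. A hedged sketch of how that command set might be tabulated (the intent and slot names are mine, not HDA’s; only the phrasing comes from the list above):

```python
# Illustrative intent table for the command list above. Intent and
# slot names are assumptions; a real skill would also declare slot types.
INTENTS = {
    "PlayIntent":       ["play"],
    "PauseIntent":      ["pause"],
    "RecordIntent":     ["record", "stop recording"],
    "VolumeUpIntent":   ["increase volume by {amount}", "volume up {amount}"],
    "VolumeDownIntent": ["decrease volume by {amount}", "drop volume {amount}"],
    "SourceIntent":     ["switch to {source}"],
    "ChannelIntent":    ["go to {channel}"],
    "WatchIntent":      ["watch {activity}"],
    "ResumeIntent":     ["resume in {room}"],
}

def sample_utterances() -> list:
    """Flatten the table into 'IntentName utterance' lines (simplified)."""
    return [f"{intent} {phrase}"
            for intent, phrases in INTENTS.items()
            for phrase in phrases]

print("\n".join(sample_utterances()))
```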

It’s so exciting to see it work in real time. I know everyone at HDA really wants to show this off at EI Live! At the show we will be editing what these phrases do, live, via the Pro Remote Management Portal on our HDA Cloud Service, pushing the update to our demo MHUBs and then executing the change immediately using Alexa!

Releasing 1st week of June

We are preparing to submit all our skills for Amazon to approve in the next week, so that we can roll out support in stages, completing by the first week of June.

Neither Chris nor I believe our implementation is perfect, and there is plenty of work still to do, but after release our development teams will be waiting to tweak code and improve the natural language flow. Our APIs will get faster, our utterances shorter, and MHUB’s actions more intelligent.

Alexa, this is just the beginning.

Dillan Pattni
(April 27, 2017)