True speech interaction is something we don't see very often in gaming. Sure, you have recorded audio for NPCs and a few non-silent protagonists, but it's very rare for players to interact using speech themselves. That's a missed opportunity for games and applications where a little human interaction could make them stand out. With Windows 10, Cortana has been released from the confines of the Windows Phone platform to all UWP (Universal Windows Platform) projects, putting our sultry temptress in the hands of all consumers, whether they use a phone, desktop or tablet, and even an Xbox One in the future. Cortana gives us the best personal assistant money can buy, but the system also offers a very extensive speech system that is available to us humble developers. Using the power of Cortana, we can bring the full power of speech to our games, providing features such as:
A full text-to-speech system, complete with custom grammar.
A robust voice-to-text (speech recognition) system.
Voice commands, allowing the player to perform actions and launch your game by voice.
Integration with the notification center for alerts and notices.
There's a lot more, but this is the crux of what this article is going to cover. I wrote an earlier article on implementing Cortana with Unity on Windows Phone 8.1, which you can read here (http://bit.ly/1OIMVoY). This article is all new and improved, updated for Windows 10 with a few extra bells and whistles. The sample project accompanying this post, built with Unity 5.3, can be found here: http://wp.me/a3o0M2-2lG
Unity and Windows 10
With the introduction of Windows 10 / UWP support, we can make full use of these platform features. However, because Unity doesn't have native support for speech or Cortana, we need to build our own bridge between the two platforms. Thankfully, since Unity is .NET based, we don't need to build a plugin to do it (as is required on other platforms such as Android or iOS); we simply need to build an interop bridge between Unity and the UWP platform to give bidirectional access to all of these (and any other) capabilities we need. Here's what the layout of the full implementation looks like:
clip_image004
Figure 1: Unity to Windows Interop architecture

In Unity we declare an Interop class to manage the boundary between Unity and the Windows UWP platform. Any data your Unity project needs from Windows, it gets from this Interop class, and any action that needs to be kicked off on the Windows platform is performed through the Interop class as well, as shown here:
clip_image006
Figure 2: Unity Interop interface definition

This pattern affords a couple of benefits. Most notably, we know exactly what data we need from the platform and what actions we can perform, all in one place (defined as a contract between Unity and the platforms it's deployed on), regardless of which script or scene uses them. Once we have the behaviours we want in our game, we apply this to the exported project on each platform. So after you have built the project, you fulfil the contract you created using platform-specific components, for example:
clip_image008
Figure 3: Windows 10 Interop implementation

With this Interop pattern we ensure that each party (Unity and Windows) knows what it is doing, what data it controls, and when.
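As a rough sketch of the contract pattern (the member names here are illustrative, not taken from the sample project), the Interop class typically pairs static data fields, which the Windows side writes, with static delegates, which the Windows side assigns and Unity invokes:

```csharp
using System;
using UnityEngine;

// Illustrative interop contract between Unity and the UWP host.
// The Windows project reads/writes these static members; Unity scripts
// consume the data and invoke the actions without knowing platform details.
public class WindowsInterop : MonoBehaviour
{
    // Data pushed in from the Windows side.
    public static string SpokenText;

    // Actions fulfilled by the Windows side; Unity just invokes them.
    public static Action<string> SpeakText;
    public static Action<string> ShowNotification;

    void Update()
    {
        // Guard every use: on platforms that never wire up the contract,
        // the members stay null and the game carries on unaffected.
        if (!string.IsNullOrEmpty(SpokenText))
        {
            Debug.Log("Heard: " + SpokenText);
            SpokenText = null;
        }
    }
}
```

Because everything is static and null-guarded, a platform that doesn't implement the contract simply leaves it inert.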
Enough talk, show me the code
Now that I've shown you how all of this is built up, let's jump into some example code. The examples cover:
Using voice commands to launch your game.
Getting your game to speak to the player.
Using speech in your game for commands or dictation.
Using the notification center to toast your player.
Enough spiel, on with the demos!
You want me to do what?
For the first example, I want to show you how to take advantage of voice commands. These give your players the ability to launch into a specific area of your game or app, or to provide additional information when launching the game. Some examples could be:
Launching your game via voice.
Starting your game at a certain screen, such as "Game, enter random battle".
Providing additional prompts when starting your game, such as "Game, find player X".
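On Windows 10, commands like these are registered with Cortana through a Voice Command Definition (VCD) file in the UWP project. As a minimal sketch (the command set, prefix and phrases here are illustrative, not from the sample project):

```xml
<?xml version="1.0" encoding="utf-8"?>
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.2">
  <CommandSet xml:lang="en-us" Name="GameCommands_en-us">
    <!-- The word players say before any command, e.g. "Game, enter random battle". -->
    <CommandPrefix>Game</CommandPrefix>
    <Example>Game, enter random battle</Example>

    <Command Name="enterBattle">
      <Example>enter random battle</Example>
      <ListenFor>enter random battle</ListenFor>
      <Feedback>Entering a random battle</Feedback>
      <Navigate />
    </Command>
  </CommandSet>
</VoiceCommands>
```

The app installs this file at startup, and from then on Cortana recognizes the phrases and activates the app even when it isn't running.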
A wealth of options, just from using your voice. Best of all, your game doesn't even have to be running for these commands to be available; the player simply holds the search button or otherwise activates Cortana, then speaks their command. Hey presto, your game/app launches! For the purposes of the example, you can either add this to an existing project or create a new one. The downloadable example has a prepared scene for testing.
Create your Interop class
Let's start with our all-important Unity Interop class. In Unity, create a new C# script called “CortanaInterop” and replace its contents with the following:

using System;
using UnityEngine;

public class CortanaInterop : MonoBehaviour
{
}
In this class we will add a static field. This field will be used to pass the text of the command that was used to launch the application into your Unity project, as follows:
public static string CortanaText;
You will notice that the string field is marked as static. This ensures that there is only one instance of it in your Unity project, and it also allows the Windows project to locate it easily. That's all for the Unity project for now. Build your project for the Windows Store platform and select the Windows 10 / XAML target, as shown here:
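For context, here is a sketch of how the Windows side will eventually fill that field in. When Cortana activates the app from a voice command, the UWP activation handler receives the recognized text and can simply assign it across the bridge (the handler body is illustrative; we'll wire it up properly in the exported project later):

```csharp
// In the exported UWP project's App.xaml.cs (sketch, not from the sample):
using Windows.ApplicationModel.Activation;

sealed partial class App
{
    protected override void OnActivated(IActivatedEventArgs args)
    {
        if (args.Kind == ActivationKind.VoiceCommand)
        {
            var voiceArgs = (VoiceCommandActivatedEventArgs)args;
            // Hand the spoken phrase to Unity via the static interop field.
            CortanaInterop.CortanaText = voiceArgs.Result.Text;
        }
    }
}
```

Because CortanaInterop.CortanaText is static, the UWP project can set it without holding a reference to any Unity scene object.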
*Note: selecting the XAML version is important, as we need it to access the media components of the Windows 10 UWP platform.
clip_image010
Figure 4: Unity build settings for Windows 10 Universal apps
Before you build the solution, you also need to enable a few capabilities for the project. Once you have configured your build target as above, click “Player Settings”, then under the “Publishing Settings” section, in the Capabilities area, check the InternetClient and Microphone capabilities. Now click Build and off you go.
clip_image012
Figure 5: Unity Windows 10 Player settings
Once the project is built, open the exported project