For the WordTutor application to work, we need to be able to read words (and letters) out loud to our student. To power the speech synthesis, we’re going to integrate Azure Cognitive Services into the application.
Azure Speech API
Setting up access to the Speech API within Azure Cognitive Services is relatively straightforward. Rather than repeating the details here, I’ll just point you to the quickstart. I’m using the free tier (“F0”), allowing for up to 5 hours of speech rendering per month.
When you’re finished creating your service, take note of the region you used (mine was australiaeast) and one of the API keys generated for you.
We don’t want to embed any secrets directly in the application, nor commit them to our git repository.
Fortunately, we can easily move those secrets out of the application, using a couple of NuGet packages:
Microsoft.Extensions.Configuration provides basic infrastructure; and
Microsoft.Extensions.Configuration.UserSecrets allows us to keep our secrets outside the project directory during development.
There are other packages available as well, allowing configuration to be stored in other places.
With those packages installed, we can use the dotnet command to store our secrets.
For the entry-point project of our application we need a one-time initialization:
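The initialization is a single command, run from the directory containing the entry-point project’s .csproj:

```shell
# One-time setup: adds a <UserSecretsId> to the project file
dotnet user-secrets init
```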
If you’re doing this for yourself, don’t make the mistake I did! Secrets configuration needs to be done for the entry-point project of the application, so I had to redo the step above against the correct project.
Once completed, you’ll find a
<UserSecretsId> element has been added to the
.csproj file of your project, in the first
<PropertyGroup>.
To store our two secrets, we again use the dotnet user-secrets command:
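Something like the following, run from the entry-point project directory; the key names Speech:SubscriptionKey and Speech:Region are illustrative, not prescribed:

```shell
dotnet user-secrets set "Speech:SubscriptionKey" "<your-api-key>"
dotnet user-secrets set "Speech:Region" "australiaeast"
```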
These secrets are stored in your user profile on your PC. Navigate to the folder
%USERPROFILE%\AppData\Roaming\Microsoft\UserSecrets to find a folder with a name matching the
<UserSecretsId> from above; inside there is
secrets.json, containing your secrets. It’s worth emphasizing that there’s no encryption here; the goal is to keep the secrets out of your git repo, not to hide them from you.
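For illustration, with the hypothetical key names Speech:SubscriptionKey and Speech:Region, the file would look something like this:

```json
{
  "Speech:SubscriptionKey": "<your-api-key>",
  "Speech:Region": "australiaeast"
}
```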
We need to declare a service interface for our application, representing the speech service in a technology-agnostic way:
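A minimal sketch of such an interface; the member name SayAsync is an assumption, not the definitive shape:

```csharp
using System.Threading.Tasks;

// Technology-agnostic contract for speaking text aloud
public interface ISpeechService
{
    Task SayAsync(string content);
}
```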
Using this interface will allow us to wire up a fake service for testing purposes, allowing us to verify correct behaviour without actually calling into the Azure implementation and incurring costs.
Our primary implementation of
ISpeechService will be
AzureSpeechService. To keep our dependencies properly isolated, this lives in a new project,
WordTutor.Azure. Anything else Azure-related we choose to add in the future will live here too.
The IConfigurationRoot parameter on the constructor is how we retrieve the user secrets we stashed away earlier. In full, the constructor looks like this:
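A sketch of the constructor, assuming the Microsoft.CognitiveServices.Speech SDK; the configuration key names are the hypothetical ones used earlier, so substitute whatever names you stored your secrets under:

```csharp
using Microsoft.CognitiveServices.Speech;
using Microsoft.Extensions.Configuration;

public class AzureSpeechService : ISpeechService
{
    private readonly SpeechConfig _speechConfig;

    public AzureSpeechService(IConfigurationRoot configuration)
    {
        // Pull the key and region back out of user secrets
        var apiKey = configuration["Speech:SubscriptionKey"];
        var region = configuration["Speech:Region"];
        _speechConfig = SpeechConfig.FromSubscription(apiKey, region);
    }

    // ... Task SayAsync(string content) implementation elided ...
}
```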
To make our ISpeechService available for consumption, we need to register it with our dependency injection container.
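The registration itself is a one-liner; this sketch assumes Microsoft.Extensions.DependencyInjection, so adapt it to whichever container your application uses:

```csharp
// Resolve ISpeechService to the Azure implementation
services.AddSingleton<ISpeechService, AzureSpeechService>();
```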
We also need a singleton registration for
IConfigurationRoot; before we can register the configuration, we first have to build it:
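Along these lines, using the ConfigurationBuilder from Microsoft.Extensions.Configuration:

```csharp
using Microsoft.Extensions.Configuration;

// Build configuration from the user secrets of the entry-point assembly
IConfigurationRoot configuration = new ConfigurationBuilder()
    .AddUserSecrets<Program>()
    .Build();

services.AddSingleton(configuration);
```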
The generic parameter
Program provided to the
AddUserSecrets() call is used to identify the entry-point assembly. The
<UserSecretsId> element from the
.csproj file turns into an assembly-level attribute, allowing the running application to find the secrets it needs.
Finally, for demo purposes, we can inject
ISpeechService into our main window, hook it up to a button, and make everything work.
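For a WPF window, the hook-up can be as small as this; the handler name and the SayAsync member are illustrative assumptions:

```csharp
// _speechService is the ISpeechService injected via the constructor
private async void SpeakButton_Click(object sender, RoutedEventArgs e)
{
    await _speechService.SayAsync("Hello World");
}
```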
Having (literally!) achieved “Hello World”, we need to make a number of improvements. Most notably, we currently have a non-trivial lag before speech begins - plus we’re re-rendering the same text as audio every time we want to speak. Some caching - and some pre-caching - seems to be in order.