Title: Speech Part 1 - How to Add "Text to Speech" (Speech Synthesis) to your Delphi Apps
Question: How can I get my application to read text?
Answer:
How to Add Speech Synthesis (a.k.a. Text to Speech) to your Delphi Apps.
On Aug 11, 2001 Microsoft released the SAPI 5.1 SDK. This is significant because SAPI 5.1 fully supports OLE Automation. That is, you can use it from any language that supports OLE Automation. The SAPI objects are not ActiveX controls, and they can be either early or late bound.
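To illustrate what late binding means here, the following is a minimal sketch rather than a definitive implementation. It assumes SAPI 5.1 is already installed (installation is covered below) and that the voice object is registered under the ProgID 'SAPI.SpVoice'. The program name and the SpeakLateBound procedure are made up for this example. The literal 0 is the numeric value of the default speak flag.

```pascal
program LateBoundSpeech;
{$APPTYPE CONSOLE}

uses
  ActiveX, ComObj;

// Hypothetical helper: creates the voice object by ProgID and speaks
// the text. Every call on the OleVariant is resolved at run time.
procedure SpeakLateBound(const Text: string);
var
  Voice: OleVariant;
begin
  Voice := CreateOleObject('SAPI.SpVoice'); // assumes SAPI 5.1 is installed
  Voice.Speak(Text, 0);                     // 0 = default flags (synchronous)
end; // Voice is released here, before COM is shut down

begin
  // A console program has to initialize COM itself.
  CoInitialize(nil);
  try
    SpeakLateBound('Hello world!');
  finally
    CoUninitialize;
  end;
end.
```

Late binding needs no imported type library, but the compiler cannot check method names or parameters. The rest of this article uses the early-bound components that Delphi generates from the type library.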
In this article I'm going to show you how to get and install the SAPI 5.1 SDK. Then I'm going to show how to use the SDK to convert text to synthesized speech in a Delphi application. The synthesized speech is played over your computer's speakers. I tested this in Delphi 5 and 6.
To get SAPI 5.1 you need to go to Microsoft's Speech.net Technologies web site at http://www.microsoft.com/speech/ and follow the link to the download. Right next to the download link is the release notes link. READ THE RELEASE NOTES! This is especially important if your development machine is using a default language other than US English.
If you are running a beta version of the XP operating system you might have some problems. This is because SAPI 5.1 is built into XP, and the most recent public beta of XP as of this writing (RC 2) includes an earlier version of SAPI 5.1. Don't try to install the release version of SAPI 5.1 into XP; it will not work.
Once you have read the release notes, follow the link to the Speech SDK 5.1 Download page. In most cases all you need to download is the link labeled "Speech SDK 5.1 (68 MB)". This contains the SDK, the documentation and the free Microsoft English text to speech and speech recognition engines. The download is very large, 68 MB, so unless you have a high speed connection to the internet you might want to order the SDK CD from Microsoft.
... Time passes while you download or wait for the postman ...
OK, now you have the SAPI 5.1 SDK. Run speechsdk51.exe to install it on your development system.
[ *** DELPHI 6 Users IMPORTANT ***
There is a bug in the type library import in Delphi 6; see article 2589. This sample will still work with the unit created by the type library import in Delphi 6, but only because none of the events for the component are used. If you want to use any of the SpVoice events you will need to read article 2589. ]
What you need to do now is make Delphi aware of the new SAPI automation objects. To do this, start up Delphi 5 or 6 (I didn't try earlier versions) and go to Project | Import Type Library. In the Import Type Library dialog highlight "Microsoft Speech Object Library (Version 5.1)". If you don't find this in the list then something's wrong with the installation of SAPI 5.1.
Delphi is going to want to put the SAPI components on your ActiveX palette page. I recommend you put these on a new palette page called "SAPI 5" since the number of components installed is large (19). You may also want to choose a "Unit dir name" other than the default. Make sure the "Generate Component Wrapper" check box is checked and press the Install button.
In the Install dialog choose the "Into new package" tab and in the "File name:" field give a package name like SAPI5.dpk. Press the browse button and make sure the dpk is created in the same directory where you created the components. Actually this isn't completely necessary; it just helps keep things together. In the Install dialog's Description field give some meaningful description like "SAPI 5 automation components". Press OK.
Press Yes in the confirm dialog and the new components will be created and installed.
If you now look in the directory you specified for the components you should find SpeechLib_TLB.pas (and dcr), which contains all the component code as well as interface, const, type and other useful information. This is your most valuable piece of documentation on the SDK. I've found it even better than the Microsoft SAPI 5.1 documentation, which is pretty good. This directory should also contain (if you followed the above instructions) SAPI5.dpk, which is your package source.
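Because the generated unit declares the interfaces as well as the component wrappers, the speech objects can also be used early bound straight from code, without dropping anything on a form. The following is only a sketch. It assumes the import produced a unit named SpeechLib_TLB and that it contains a CoSpVoice creator class returning ISpeechVoice (the importer normally generates one such class per CoClass, but check your own unit). The program name and the SpeakEarlyBound procedure are made up for this example.

```pascal
program EarlyBoundSpeech;
{$APPTYPE CONSOLE}

uses
  ActiveX,
  SpeechLib_TLB; // unit generated by Import Type Library

// Hypothetical helper: creates the voice through the generated CoClass
// creator. The compiler checks the Speak call against ISpeechVoice.
procedure SpeakEarlyBound(const Text: WideString);
var
  Voice: ISpeechVoice;
begin
  Voice := CoSpVoice.Create;       // assumed to exist in the generated unit
  Voice.Speak(Text, SVSFDefault);  // constant comes from SpeechLib_TLB
end; // the interface reference is released here

begin
  // A console program has to initialize COM itself.
  CoInitialize(nil);
  try
    SpeakEarlyBound('Hello world!');
  finally
    CoUninitialize;
  end;
end.
```

Browsing SpeechLib_TLB.pas for ISpeechVoice is a quick way to see which methods and properties are available on the voice object.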
If you go to the far right end of your component palette you should find the new SAPI 5 palette page with its 19 speech components.
Now for the fun part.
Let's make an application that can synthesize speech. In Delphi start a new application and drop a button on the form. On the SAPI 5 palette page find the SpVoice component and drop it on the form. On my machine this component is the 5th one reading from left to right.
Now create an OnClick event handler for your button that looks something like this:
procedure TForm1.Button1Click(Sender: TObject);
begin
  SpVoice1.Speak('Hello world!', SVSFDefault);
end;
Run the program and press the button. Cool, huh?
At this level it's amazingly simple. The SpVoice object's Speak method is very powerful. This power comes from the second parameter. For the above example I chose the default mode, which causes the Speak method to return only when the synthesis is complete, not to purge pending speech requests, and to respond to special XML control tags embedded in the text.
The SDK's documentation is contained in sapi.chm, which you will find in the \Program Files\Microsoft Speech SDK 5.1\Docs\Help directory.
Sapi.chm contains a lot of information. To go directly to the meat of the subject, go to the last folder on the outline's 1st level, titled Automation, then go down to SpVoice and then to the Speak method. Read what's there, and be sure to follow the link to the SpeechVoiceSpeakFlags info. You will find that Speak can do much more than just speak the text passed in. Some of the more interesting flags are:
Pass in a file name and speak the text in the file. (SVSFIsFilename)
Make the function either return immediately (asynchronously) or only after the synthesis is complete (synchronously). If you speak asynchronously there are events available that fire when the speech is done. (SVSFlagsAsync)
Embed tags in the text that can control various aspects of the synthesis like pitch, rate, emphasis, and much more (see the included White Paper titled XML TTS Tutorial). I found this feature a bit addictive as I attempted to make the synthesized voice sing. (SVSFIsXML)
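The flags can be combined with "or". The following sketch speaks XML-tagged text asynchronously and then waits for it to finish. Treat it as an illustration under assumptions, not a reference: it uses late binding through the ProgID 'SAPI.SpVoice' so that it stands alone, and it declares the two flag values locally (1 and 8, matching SVSFlagsAsync and SVSFIsXML in SpeechVoiceSpeakFlags). The Demo-prefixed constant names, the program name and the procedure name are made up for this example. In a form you would pass the same flags to SpVoice1.Speak.

```pascal
program SpeakFlagsDemo;
{$APPTYPE CONSOLE}

uses
  ActiveX, ComObj;

const
  // Local copies of two SpeechVoiceSpeakFlags values (hypothetical names).
  DemoFlagsAsync = 1; // same value as SVSFlagsAsync
  DemoFlagIsXML  = 8; // same value as SVSFIsXML

// Hypothetical helper: queues XML-tagged text and waits for completion.
procedure SpeakTaggedTextAsync;
var
  Voice: OleVariant;
begin
  Voice := CreateOleObject('SAPI.SpVoice'); // assumes SAPI 5.1 is installed
  // Speak returns right away because of the async flag; the pitch and
  // rate tags are taken from the XML TTS tag set described in the SDK.
  Voice.Speak('Hello <pitch middle="5">world</pitch>, ' +
              '<rate speed="-5">and now more slowly</rate>.',
              DemoFlagsAsync or DemoFlagIsXML);
  Writeln('Speak has returned; synthesis continues in the background.');
  // Block until the queued speech is done (-1 means no timeout).
  Voice.WaitUntilDone(-1);
end;

begin
  // A console program has to initialize COM itself.
  CoInitialize(nil);
  try
    SpeakTaggedTextAsync;
  finally
    CoUninitialize;
  end;
end.
```

Waiting with WaitUntilDone is used here instead of the completion events, which keeps the sketch clear of the Delphi 6 event import problem noted earlier.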
One interesting thing I found (but did not see documented) is that you can speak a web site's title by setting the flag to SVSFIsFilename and passing a URL. If you are connected to the internet, try replacing the Speak line in the sample with
SpVoice1.Speak('http://www.o2a.com', SVSFIsFilename);
And run it.
Even more bizarre, you can use the Speak method to play wav files. Try
SpVoice1.Speak('C:\WINNT\MEDIA\Windows Logon Sound.wav', SVSFIsFilename);
There's a lot more to SAPI than text to speech, and there's more to text to speech than what I've covered here. Hopefully this will be the first of a number of articles on SAPI, but I'll only do them if you're interested, so please be sure to comment. Also, I'm completely open to suggestions on what you'd like to see next (if anything at all).
If you want to talk privately I'm at alecb@o2a.com.