For those of us who are confused by technical jargon, here is a simplified explanation that may help you understand how the Microsoft Speech process works and what all these different files are for.
The two components that work together to make your computer talk are the "Speech Application Programming Interface" (SAPI) and the "Text To Speech Engine" (TTS Engine). The SAPI is a gatekeeper between the application (e.g., Sayz Me) and the TTS Engine. The TTS Engine is the voice (e.g., Sam).
From within the Program you select the voice to use, choose the voice settings (pitch, volume, etc.) and enter the text to be spoken. The Program then passes the request on to the SAPI. The SAPI selects the correct TTS Engine, tells the engine what settings to use and forwards the text to be converted into sound. The TTS Engine then does its magic and outputs the sound to the sound card (speakers or headphones) or to a file (e.g., an MP3).
Text In (Program) -> SAPI -> TTS Engine -> Sound Out
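For readers who find code easier to follow than a diagram, the flow above can be sketched as a toy model in Python. This is only an illustration of the gatekeeper idea and not the real Microsoft Speech API: the names ToySapi, ToyEngine and VoiceSettings are made up for this example, and the toy engine returns a text description instead of producing sound.

```python
from dataclasses import dataclass


@dataclass
class VoiceSettings:
    """Hypothetical container for the settings chosen in the Program."""
    pitch: int = 100
    volume: int = 100


class ToyEngine:
    """Stands in for a TTS Engine, that is, a voice such as Sam."""

    def __init__(self, name):
        self.name = name

    def speak(self, text, settings):
        # A real engine would produce audio; this toy returns a description.
        return f"[{self.name} pitch={settings.pitch} volume={settings.volume}] {text}"


class ToySapi:
    """Stands in for the SAPI: it routes each request to the selected engine."""

    def __init__(self):
        self._engines = {}

    def register(self, engine):
        # Installing a voice makes it available to every Program.
        self._engines[engine.name] = engine

    def speak(self, voice, text, settings):
        # Select the correct engine, then forward the settings and the text.
        if voice not in self._engines:
            raise LookupError(f"no TTS Engine installed for voice {voice!r}")
        return self._engines[voice].speak(text, settings)


# The "Program" side: pick a voice, pick settings, enter the text.
sapi = ToySapi()
sapi.register(ToyEngine("Sam"))
result = sapi.speak("Sam", "Hello", VoiceSettings(pitch=120, volume=80))
print(result)
```

The point of the sketch is that the Program never talks to the voice directly. It only ever asks the SAPI, which is why any installed voice can be used from any Program that supports the same SAPI version.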
There are two SAPI versions that are widely used: SAPI4 and SAPI5. SAPI4 is the older version, used by Windows 98, NT and 2000. SAPI5 is the current version, used by Windows XP, and has a few more features. Note that both versions of SAPI can coexist on the same machine.
Each TTS Engine (e.g., Sam) may support one or both of the SAPI versions. Likewise, each Program (e.g., Sayz Me) may support one or both of the SAPI versions. So when you download a TTS Engine (Voice) you will need to make sure that it is compatible with the SAPI version that your Program supports. Sayz Me only supports SAPI4.
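The compatibility rule above boils down to "the Program and the Voice must share at least one SAPI version". A minimal sketch in Python, assuming made-up voices (voice_a and voice_b are hypothetical and do not describe any real download):

```python
def shared_versions(program_supports, voice_supports):
    """Return the SAPI versions that both the Program and the Voice support."""
    return sorted(set(program_supports) & set(voice_supports))


# Sayz Me only supports SAPI4, as stated above.
sayz_me = {"SAPI4"}

# Hypothetical voices, used purely for illustration.
voice_a = {"SAPI4", "SAPI5"}
voice_b = {"SAPI5"}

for label, voice in (("voice_a", voice_a), ("voice_b", voice_b)):
    common = shared_versions(sayz_me, voice)
    if common:
        print(f"{label} works with Sayz Me via {', '.join(common)}")
    else:
        print(f"{label} will not work with Sayz Me")
```

An empty result means the Voice cannot be used from that Program, even if it is installed correctly and works in other applications.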
The following text-to-speech engines are licensed only for use in Microsoft Agent enabled applications and Web pages with a visibly displayed Microsoft Agent character. You may also need to install the Agent Language Component for non-English languages. Please visit the Microsoft Agent download page for end users for further details.