With Speech Synthesis Markup Language (SSML) you can enhance your TTS prompts. For example, you can include a pause within a prompt or change the speech rate, pitch, or volume.
Example
{
"prompt": "<speak>Hi! I will wait for 2 seconds. <break time='2s'/> Now I will spell the word hello: <break time='1s'/> <say-as interpret-as='characters'>hello</say-as>. <prosody rate='-50%' pitch='-25%' volume='+2dB'>I am speaking a bit loudly, slowly and with a lower pitch.</prosody></speak>"
}
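The request body above can also be assembled programmatically, which helps avoid unbalanced tags and unescaped characters. The following is a minimal Python sketch, not an official client: the helper names `pause`, `spell`, and `speak` are hypothetical, and only the `prompt` field and the SSML tags come from this page. Plain text is passed through `xml.sax.saxutils.escape` so that characters such as `&` or `<` do not break the markup.

```python
import json
from xml.sax.saxutils import escape


def pause(seconds: int) -> str:
    """Return a <break> tag for a pause of whole seconds (hypothetical helper)."""
    return f"<break time='{seconds}s'/>"


def spell(word: str) -> str:
    """Return a <say-as> tag that spells the word character by character."""
    return f"<say-as interpret-as='characters'>{escape(word)}</say-as>"


def speak(*fragments: str) -> str:
    """Join fragments with spaces and wrap them in the required <speak> root."""
    return "<speak>" + " ".join(fragments) + "</speak>"


ssml = speak(
    escape("Hi! I will wait for 2 seconds."),
    pause(2),
    escape("Now I will spell the word hello:"),
    pause(1),
    spell("hello") + ".",
)

# json.dumps takes care of quoting the SSML string inside the JSON body.
payload = json.dumps({"prompt": ssml})
print(payload)
```

Single quotes are used for SSML attribute values, as in the example above, so the markup needs no extra escaping inside the JSON string.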
Attributes
Type | Description | Parameters | Values |
---|---|---|---|
`<speak>` | The root element where the SSML starts. Required in order to use SSML. | -- | -- |
`<break>` | Adds a pause between words. | time: length of the pause in milliseconds (ms) or seconds (s) | Ns or Nms. Maximum total duration is 10 seconds per API call. |
`<say-as>` | Controls how special types of words are spoken. | interpret-as: determines how to say certain characters, words, and numbers | cardinal, characters, ordinal, fraction, unit, date, time, expletive |
`<prosody>` | Controls the volume, speaking rate, and pitch. | rate: how fast or slow the text is spoken; pitch: the pitch of the voice; volume: the volume of the text | rate: +/-N%; pitch: +/-N%; volume: +/-NdB |
`<emphasis>` | Emphasizes words. | level: the degree of emphasis | strong, moderate, reduced |
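Because the total `<break>` time is limited to 10 seconds per API call, it can be useful to add up the pauses in a prompt before sending it. The sketch below is one possible client-side check in Python. The function name `total_break_seconds` and the constant are hypothetical, the regular expression only recognizes `<break>` tags whose sole attribute is `time` in the `Ns` or `Nms` form listed in the table, and how the API reacts to a prompt over the limit is not covered here.

```python
import re

# Limit taken from the table above: total pause time per API call.
MAX_TOTAL_BREAK_SECONDS = 10.0

# Matches <break time='2s'/> or <break time="500ms"/> (simplified pattern).
_BREAK_TIME = re.compile(r"<break\s+time=(['\"])(\d+(?:\.\d+)?)(ms|s)\1\s*/>")


def total_break_seconds(ssml: str) -> float:
    """Sum the durations of all recognized <break> tags, in seconds."""
    total = 0.0
    for _quote, value, unit in _BREAK_TIME.findall(ssml):
        total += float(value) / 1000 if unit == "ms" else float(value)
    return total


prompt = "<speak>One <break time='2s'/> two <break time='500ms'/> three</speak>"
seconds = total_break_seconds(prompt)
if seconds > MAX_TOTAL_BREAK_SECONDS:
    raise ValueError(f"Pauses add up to {seconds}s, above the 10 second limit")
```

A check like this only counts the pauses it can parse, so it is a convenience for catching obvious mistakes early rather than a replacement for validation on the server side.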
More advanced information about SSML and the attributes can be found in the W3C SSML Specification.