With Speech Synthesis Markup Language (SSML), you can enhance your TTS prompts. For example, you can insert a pause within a prompt or change the speech rate or pitch.

Example

{
    "prompt": "<speak>Hi! I will wait for 2 seconds. <break time='2s'/> Now I will spell the word hello: <break time='1s'/> <say-as interpret-as='characters'>hello</say-as>. <prosody rate='-50%' pitch='-25%' volume='+2dB'>I am speaking a bit loudly, slowly and with a lower pitch.</prosody></speak>"
}
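
Because the SSML travels as a plain string inside JSON, a misplaced quote or an unclosed tag is easy to miss. The following is a minimal Python sketch of one way to guard against that, assuming the request body is built client-side. The helper name build_prompt_payload is hypothetical, and the well-formedness check is a local convenience, not a description of how the API itself validates prompts. Only the "prompt" field comes from the example above.

```python
import json
import xml.etree.ElementTree as ET


def build_prompt_payload(ssml: str) -> str:
    """Return a JSON body with the SSML string under the "prompt" key.

    Hypothetical helper: only the "prompt" field is taken from the docs.
    """
    # Parsing raises ET.ParseError on unbalanced or malformed tags.
    root = ET.fromstring(ssml)
    if root.tag != "speak":
        raise ValueError("SSML must be wrapped in a <speak> root element")
    # json.dumps takes care of escaping quotes inside the SSML string.
    return json.dumps({"prompt": ssml})


ssml = (
    "<speak>Hi! I will wait for 2 seconds. <break time='2s'/> "
    "Now I will spell the word hello: <break time='1s'/> "
    "<say-as interpret-as='characters'>hello</say-as>.</speak>"
)
payload = build_prompt_payload(ssml)
print(payload)
```

Using single quotes for SSML attribute values, as the example does, keeps the string readable inside a double-quoted JSON value.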

Attributes

Tag	Description	Parameters	Values
<speak>	The root element of the SSML document. Required in order to use SSML.	--	--
<break>	Adds a pause between words.	time: duration of the pause in milliseconds (ms) or seconds (s)	Ns or Nms. The maximum total duration is 10 seconds per API call.
<say-as>	Controls how special types of words are spoken.	interpret-as: determines how certain characters, words, and numbers are spoken	cardinal, characters, ordinal, fraction, unit, date, time, expletive
<prosody>	Controls the volume, speaking rate, and pitch.	rate: how fast or slow the text is spoken; pitch: the pitch of the voice; volume: the volume of the text	rate: +/-N%; pitch: +/-N%; volume: +/-NdB
<emphasis>	Emphasizes words.	level: the degree of emphasis	strong, moderate, reduced
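
The 10-second limit on pauses applies to the sum of all breaks in one call, so it can be useful to check a prompt before sending it. The sketch below is one possible client-side check in Python. The function name total_break_seconds and the constant are hypothetical, and accepting decimal values such as 1.5s or rejecting a break without a time attribute are assumptions of this sketch, not documented behaviour.

```python
import re
import xml.etree.ElementTree as ET

MAX_TOTAL_BREAK_SECONDS = 10.0  # limit stated in the table above


def total_break_seconds(ssml: str) -> float:
    """Sum the time attributes of all <break> elements, in seconds."""
    total = 0.0
    for node in ET.fromstring(ssml).iter("break"):
        time_value = node.get("time", "")
        # Accept "Ns" or "Nms"; decimals are allowed here as an assumption.
        match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s)", time_value)
        if match is None:
            raise ValueError(f"unsupported break time: {time_value!r}")
        value, unit = float(match.group(1)), match.group(2)
        total += value / 1000.0 if unit == "ms" else value
    return total


prompt = "<speak>One <break time='2s'/> two <break time='500ms'/> three</speak>"
pause = total_break_seconds(prompt)
if pause > MAX_TOTAL_BREAK_SECONDS:
    raise ValueError(f"breaks add up to {pause}s, above the 10s limit")
print(pause)
```

For the prompt above, the two breaks add up to 2.5 seconds, which is well under the limit.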

More advanced information about SSML and the attributes can be found in the W3C SSML Specification.