Responses

This document describes the different types of UIs that you can present to users when you build actions with audio components, visual components, or a combination of both.

Simple Responses

Audio (TTS/SSML)

Howdy, this is GeekNum. I can tell you fun facts about almost any number, my favorite is 42. What number do you have in mind?

Visual

Simple responses can appear on audio-only surfaces, screen-only surfaces, or surfaces that support both. Visually they take the form of a chat bubble; audibly they are rendered as TTS/SSML sound.

By default, the TTS text is used as the chat bubble content. If that text reads well on screen, you do not need to specify separate display text for the chat bubble.

Requirements
  • Supported on actions.capability.AUDIO_OUTPUT and actions.capability.SCREEN_OUTPUT surfaces
  • 640 character limit per chat bubble. Strings longer than the limit are truncated at the first word break (or whitespace) before 640 characters.

  • Chat bubble content must be a phonetic subset or a complete transcript of the TTS/SSML output. This helps users map out what you are saying and increases comprehension in various conditions.

  • At most 2 chat bubbles per turn

  • Chat head (logo) that you submit to Google must be 192x192 pixels and cannot be animated
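
The truncation rule above can be previewed locally before a response is sent. The following is a minimal sketch, not part of the client library: `previewChatBubble` and `CHAT_BUBBLE_LIMIT` are hypothetical names, and the function only approximates how a surface might cut text at the last whitespace before the limit.

```javascript
// Hypothetical helper for local checks; the real truncation happens on the surface.
const CHAT_BUBBLE_LIMIT = 640;

function previewChatBubble (text, limit = CHAT_BUBBLE_LIMIT) {
  if (text.length <= limit) {
    return text;
  }
  // Walk back from the limit to the nearest whitespace character.
  let cut = limit;
  while (cut > 0 && !/\s/.test(text[cut])) {
    cut--;
  }
  // No whitespace found: fall back to a hard cut at the limit.
  return cut > 0 ? text.slice(0, cut).trimEnd() : text.slice(0, limit);
}
```

If the preview differs from the input, consider shortening the display text or moving part of it to a second chat bubble.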

Sample code

Node.js
function simpleResponse (app) {
  app.ask({speech: 'Howdy! I can tell you fun facts about ' +
            'almost any number, like 42. What do you have in mind?',
            displayText: 'Howdy! I can tell you fun facts about almost any ' +
                  'number. What do you have in mind?'})
}
JSON
{
    "conversationToken": "{\"state\":null,\"data\":{}}",
    "expectUserResponse": true,
    "expectedInputs": [
        {
            "inputPrompt": {
                "richInitialPrompt": {
                    "items": [
                        {
                            "simpleResponse": {
                                "textToSpeech": "Howdy! I can tell you fun facts about almost any number, like 42. What do you have in mind?",
                                "displayText": "Howdy! I can tell you fun facts about almost any number. What do you have in mind?"
                            }
                        }
                    ],
                    "suggestions": []
                }
            },
            "possibleIntents": [
                {
                    "intent": "actions.intent.TEXT"
                }
            ]
        }
    ]
}

SSML and sounds

Using SSML and sounds in your apps gives them more polish and enhances the user experience. The following code snippet shows you how to create a response that uses SSML:

Node.js
function saySSML (app) {
  // SSML time values need a unit, for example "3s" or "500ms"
  const textToSpeech = '<speak>' +
    'Here are <say-as interpret-as="characters">SSML</say-as> samples. ' +
    'I can pause <break time="3s" />. ' +
    'I can play a sound <audio src="https://www.example.com/MY_WAVE_FILE.wav">your wave file</audio>. ' +
    'I can speak in cardinals. Your position is <say-as interpret-as="cardinal">10</say-as> in line. ' +
    'Or I can speak in ordinals. You are <say-as interpret-as="ordinal">10</say-as> in line. ' +
    'Or I can even speak in digits. Your position in line is <say-as interpret-as="digits">10</say-as>. ' +
    'I can also substitute phrases, like the <sub alias="World Wide Web Consortium">W3C</sub>. ' +
    'Finally, I can speak a paragraph with two sentences. ' +
    '<p><s>This is sentence one.</s><s>This is sentence two.</s></p>' +
    '</speak>';
  app.tell(textToSpeech);
}
JSON
"agentToAssistantJson": {
    "expectUserResponse": false,
    "finalResponse": {
        "speechResponse": {
            "ssml": "<speak>Here are <say-as interpret-as=\"characters\">SSML</say-as> samples. I can pause <break time=\"3s\"/>. I can play a sound <audio src=\"https://www.example.com/MY_WAVE_FILE.wav\">your wave file</audio>. I can speak in cardinals. Your position is <say-as interpret-as=\"cardinal\">10</say-as> in line. Or I can speak in ordinals. You are <say-as interpret-as=\"ordinal\">10</say-as> in line. Or I can even speak in digits. Your position in line is <say-as interpret-as=\"digits\">10</say-as>. I can also substitute phrases, like the <sub alias=\"World Wide Web Consortium\">W3C</sub>. Finally, I can speak a paragraph with two sentences. <p><s>This is sentence one.</s><s>This is sentence two.</s></p></speak>"
        }
    }
}

See the SSML reference documentation for more information.
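
Because SSML is XML, dynamic text such as a user's name should be escaped before it is concatenated into a `<speak>` string. The following is a minimal sketch under that assumption; `escapeXml` and `buildQueueSsml` are hypothetical helper names, not client library functions.

```javascript
// Hypothetical helpers; escape the five XML special characters.
function escapeXml (text) {
  return String(text)
    .replace(/&/g, '&amp;') // must run first to avoid double escaping
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

function buildQueueSsml (name, position) {
  return '<speak>' + escapeXml(name) + ', you are ' +
    '<say-as interpret-as="ordinal">' + Number(position) + '</say-as>' +
    ' in line.</speak>';
}
```

The resulting string can be passed to `app.tell()` or `app.ask()` in the same way as the SSML sample above.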

Sound library

We provide a variety of free, short sounds in our sound library. These sounds are hosted for you, so all you need to do is include them in your SSML.

Rich Responses

Rich responses can appear on screen-only or audio and screen experiences. They can contain the following components:

  • One or two simple responses (chat bubbles)
  • An optional basic card
  • Optional suggestion chips
  • An optional link-out chip
  • An option interface (list or carousel)
Requirements
  • Supported on actions.capability.SCREEN_OUTPUT surfaces
  • The first item in a rich response must be a simple response
  • At most two simple responses
  • At most one basic card, option interface (list or carousel), or StructuredResponse (that is, you cannot have both a basic card and an option interface at the same time)
  • At most 8 suggestion chips
  • Suggestion chips are not allowed in a FinalResponse
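
Several of these requirements can be checked before a response is sent. The following is a minimal sketch, assuming a rich response object in the JSON shape used by the samples on this page (`items` and `suggestions`); `checkRichResponse` is a hypothetical helper, not part of the client library.

```javascript
// Hypothetical pre-flight check for a rich response object.
function checkRichResponse (richResponse, isFinalResponse = false) {
  const items = richResponse.items || [];
  const suggestions = richResponse.suggestions || [];
  const problems = [];

  if (items.length === 0 || !items[0].simpleResponse) {
    problems.push('The first item must be a simple response.');
  }
  if (items.filter(item => item.simpleResponse).length > 2) {
    problems.push('At most two simple responses are allowed.');
  }
  // Option interfaces (list or carousel) sit outside `items` in the JSON
  // samples, so this sketch only counts basic cards.
  if (items.filter(item => item.basicCard).length > 1) {
    problems.push('At most one basic card is allowed.');
  }
  if (suggestions.length > 8) {
    problems.push('At most 8 suggestion chips are allowed.');
  }
  if (isFinalResponse && suggestions.length > 0) {
    problems.push('Suggestion chips are not allowed in a final response.');
  }
  return problems;
}
```

An empty array means none of the checked rules were violated; it does not guarantee that the response passes every platform check.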

The following examples show you how to build various types of rich responses.

Basic Card

A basic card displays information that can include the following:

  • Image
  • Title
  • Sub-title
  • Text body
  • Link button

Use basic cards mainly for display purposes. They are designed to be concise, to present key (or summary) information to users, and to allow users to learn more if you choose (using a weblink).

In most situations, you should add suggestion chips below the cards to continue or pivot the conversation.

Do not repeat in the chat bubble the information that is already presented in the card.

Requirements
  • Supported on actions.capability.SCREEN_OUTPUT surfaces
  • Formatted text (required if there's no image)
    • Plain text by default
    • Must not contain a link
    • 10 line limit with an image, 15 line limit without an image. This is about 500 (with image) or 750 (without image) characters. Phones with smaller screens truncate text earlier than phones with larger screens. If the text contains too many lines, it is truncated at the last word break with an ellipsis.
    • A limited subset of markdown is supported:
      • New line with a double space
      • **bold**
      • *italics*
  • Image (required if there's no formatted text)
    • All images are forced to be 192 dp tall
    • If the image's aspect ratio is different than the screen, the image is centered with gray bars on either vertical or horizontal edges.
    • Image source is a URL
    • Motion GIFs are allowed
Optional
  • Title
    • Plain text
    • Fixed font and size
    • At most one line; extra characters are truncated
    • The card height collapses if no title is specified.
  • Sub-title

    • Plain text
    • Fixed font and font size
    • At most one line; extra characters are truncated
    • The card height collapses if no subtitle is specified
  • Link button

    • Link title is required
    • At most one link
    • Links to sites outside the developer's domain are allowed.
    • Link text cannot be misleading. This is checked in the approval process.
    • A basic card has no interaction capabilities without a link. Tapping on the link sends the user to the link, while the main body of the card remains inactive.
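
The basic card rules above can be approximated with a small check. The following is a minimal sketch, assuming a card object in the JSON shape shown in the sample (`formattedText`, `image.url`, `buttons`); `checkBasicCard` is a hypothetical helper, and the character budgets are the rough figures quoted above rather than exact limits.

```javascript
// Hypothetical pre-flight check for a basic card object.
function checkBasicCard (card) {
  const text = card.formattedText || '';
  const hasImage = Boolean(card.image && card.image.url);
  const buttons = card.buttons || [];
  const problems = [];

  if (!text && !hasImage) {
    problems.push('A card needs formatted text, an image, or both.');
  }
  if (/https?:\/\//i.test(text)) {
    problems.push('Formatted text must not contain a link.');
  }
  // Roughly 10 lines with an image, 15 lines without.
  const budget = hasImage ? 500 : 750;
  if (text.length > budget) {
    problems.push(`Text is likely to be truncated (about ${budget} characters fit).`);
  }
  if (buttons.length > 1) {
    problems.push('At most one link button is allowed.');
  }
  return problems;
}
```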

Sample code

Node.js
function basicCard (app) {
  app.ask(app.buildRichResponse()
    .addSimpleResponse('Math and prime numbers it is!')
    // Create a basic card and add it to the rich response
    .addBasicCard(app.buildBasicCard(`42 is an even composite number. It 
      is composed of three distinct prime numbers multiplied together. It 
      has a total of eight divisors. 42 is an abundant number, because the 
      sum of its proper divisors 54 is greater than itself. To count from 
      1 to 42 would take you about twenty-one…`)
      .setTitle('Math & prime numbers')
      .addButton('Read more', 'https://example.google.com/mathandprimes')
      .setImage('https://example.google.com/42.png', 'Image alternate text')
    )
  );
}
JSON
{
    "conversationToken": "{\"state\":null,\"data\":{}}",
    "expectUserResponse": true,
    "expectedInputs": [
        {
            "inputPrompt": {
                "richInitialPrompt": {
                    "items": [
                        {
                            "simpleResponse": {
                                "textToSpeech": "Math and prime numbers it is!"
                            }
                        },
                        {
                            "basicCard": {
                                "title": "Math & prime numbers",
                                "formattedText": "42 is an even composite number. It \n      is composed of three distinct prime numbers multiplied together. It \n      has a total of eight divisors. 42 is an abundant number, because the \n      sum of its proper divisors 54 is greater than itself. To count from \n      1 to 42 would take you about twenty-one…",
                                "image": {
                                    "url": "https://example.google.com/42.png",
                                    "accessibilityText": "Image alternate text"
                                },
                                "buttons": [
                                    {
                                        "title": "Read more",
                                        "openUrlAction": {
                                            "url": "https://example.google.com/mathandprimes"
                                        }
                                    }
                                ]
                            }
                        }
                    ],
                    "suggestions": []
                }
            },
            "possibleIntents": [
                {
                    "intent": "actions.intent.TEXT"
                }
            ]
        }
    ]
}

List Selector

The single-select list presents the user with a vertical list of multiple items and allows the user to select a single one. Selecting an item from the list generates a user query (chat bubble) containing the title of the list item.

Requirements
  • Supported on actions.capability.SCREEN_OUTPUT surfaces
  • List Title (optional)
    • Fixed font and font size
    • Restricted to a single line. (Excess characters are truncated.)
    • Plain text; Markdown is not supported.
    • The card height collapses if no title is specified.
  • List item

    • Title
      • Fixed font and font size
      • Max length: 1 line (truncated with an ellipsis…)
      • Required to be unique (to support voice selection)
    • Body Text (optional)
      • Fixed font and font size
      • Max length: 2 lines (truncated with an ellipsis…)
    • Image (optional)
      • Size: 48x48 px
  • Pagination

    • The pagination control appears under two conditions:
      • Simple list: more than 5 items
      • Lists with body text or image: more than 3 items
    • 30 item max
  • Interaction
    • Voice/Text
      • The user can always say or type an item's Title instead of tapping it.
    • Swipe
      • If the number of items in the list is great enough to make the pagination control appear, then swiping left/right reveals different list items
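
The pagination rules above can be expressed as a small function. The following is a minimal sketch, assuming list items in the JSON shape used by the sample (`title`, optional `description`, optional `image`); `describeList` and `MAX_LIST_ITEMS` are hypothetical names.

```javascript
// Hypothetical helper that predicts whether a list will be paginated.
const MAX_LIST_ITEMS = 30;

function describeList (items) {
  if (items.length > MAX_LIST_ITEMS) {
    throw new RangeError(`A list can hold at most ${MAX_LIST_ITEMS} items.`);
  }
  // A "simple" list has titles only: no body text and no images.
  const isSimple = items.every(item => !item.description && !item.image);
  const threshold = isSimple ? 5 : 3;
  return { isSimple, paginated: items.length > threshold };
}
```

A result with `paginated: true` indicates that users will need to swipe to see all of the items.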
Guidance

Lists work well when it is important to compare options (for example, "Which Peter do you need to speak to: Peter Jons or Peter Hans?") or when the user needs to choose between options that can be scanned at a glance.

We recommend adding suggestion chips below a list to enable the user to pivot or expand the conversation. Never repeat the options presented in the list as suggestion chips. Chips in this context are used to pivot the conversation (not for choice selection).

Notice that in the example presented here, the chat bubble that accompanies the list card is a subset of the audio (TTS/SSML). The audio (TTS/SSML) here integrates the first listed item. We discourage reading all the elements from the list. It's best to mention the top item or items (for example, the most popular, the most recently purchased, or the most talked about).

Sample code

Node.js
function list (app) {
  app.askWithList(app.buildRichResponse()
    .addSimpleResponse('Alright! Here are a few things you can learn. Which sounds interesting?')
    .addSuggestions(
      ['Basic Card', 'List', 'Carousel', 'Suggestions']),
    // Build a list
    app.buildList('Things to learn about')
    // Add the first item to the list
    .addItems(app.buildOptionItem('MATH_AND_PRIME',
      ['math', 'math and prime', 'prime numbers', 'prime'])
      .setTitle('Math & prime numbers')
      .setDescription('42 is an abundant number because the sum of its ' +
        'proper divisors 54 is greater…')
      .setImage('http://example.com/math_and_prime.jpg', 'Math & prime numbers'))
    // Add the second item to the list
    .addItems(app.buildOptionItem('EGYPT',
      ['religion', 'egypt', 'ancient egyptian'])
      .setTitle('Ancient Egyptian religion')
      .setDescription('42 gods who ruled on the fate of the dead in the ' +
        'afterworld. Throughout the under…')
      .setImage('http://example.com/egypt', 'Egypt')
    )
    // Add third item to the list
    .addItems(app.buildOptionItem('RECIPES',
      ['recipes', 'recipe', '42 recipes'])
      .setTitle('42 recipes with 42 ingredients')
      .setDescription('Here\'s a beautifully simple recipe that\'s full ' +
        'of flavor! All you need is some ginger and…')
      .setImage('http://example.com/recipe', 'Recipe')
    )
  );
}
JSON
{
    "conversationToken": "{\"state\":null,\"data\":{}}",
    "expectUserResponse": true,
    "expectedInputs": [
        {
            "inputPrompt": {
                "richInitialPrompt": {
                    "items": [
                        {
                            "simpleResponse": {
                                "textToSpeech": "Alright! Here are a few things you can learn. Which sounds interesting?"
                            }
                        }
                    ],
                    "suggestions": [
                        {
                            "title": "Basic Card"
                        },
                        {
                            "title": "List"
                        },
                        {
                            "title": "Carousel"
                        },
                        {
                            "title": "Suggestions"
                        }
                    ]
                }
            },
            "possibleIntents": [
                {
                    "intent": "actions.intent.OPTION",
                    "inputValueData": {
                        "@type": "type.googleapis.com/google.actions.v2.OptionValueSpec",
                        "listSelect": {
                            "title": "Things to learn about",
                            "items": [
                                {
                                    "optionInfo": {
                                        "key": "MATH_AND_PRIME",
                                        "synonyms": [
                                            "math",
                                            "math and prime",
                                            "prime numbers",
                                            "prime"
                                        ]
                                    },
                                    "title": "Math & prime numbers",
                                    "description": "42 is an abundant number because the sum of its proper divisors 54 is greater…",
                                    "image": {
                                        "url": "http://example.com/math_and_prime.jpg",
                                        "accessibilityText": "Math & prime numbers"
                                    }
                                },
                                {
                                    "optionInfo": {
                                        "key": "EGYPT",
                                        "synonyms": [
                                            "religion",
                                            "egypt",
                                            "ancient egyptian"
                                        ]
                                    },
                                    "title": "Ancient Egyptian religion",
                                    "description": "42 gods who ruled on the fate of the dead in the afterworld. Throughout the under…",
                                    "image": {
                                        "url": "http://example.com/egypt",
                                        "accessibilityText": "Egypt"
                                    }
                                },
                                {
                                    "optionInfo": {
                                        "key": "RECIPES",
                                        "synonyms": [
                                            "recipes",
                                            "recipe",
                                            "42 recipes"
                                        ]
                                    },
                                    "title": "42 recipes with 42 ingredients",
                                    "description": "Here's a beautifully simple recipe that's full of flavor! All you need is some ginger and…",
                                    "image": {
                                        "url": "http://example.com/recipe",
                                        "accessibilityText": "Recipe"
                                    }
                                }
                            ]
                        }
                    }
                }
            ]
        }
    ]
}

Handling a selected item

When users select an item, the selected item value is passed to you as an argument. You can use the client library to read the value by calling app.getContextArgument(). In the returned value, you will get the key identifier for the selected item:

function itemSelected (app) {
  // Get the user's selection
  const param = app.getContextArgument('actions_intent_option',
    'OPTION').value;

  // Compare the user's selections to each of the item's keys
  if (!param) {
    app.ask('You did not select any item from the list or carousel');
  } else if (param === 'MATH_AND_PRIME') {
    app.ask('42 is an abundant number because the sum of its…');
  } else if (param === 'EGYPT') {
    app.ask('42 gods who ruled on the fate of the dead in the ');
  } else if (param === 'RECIPES') {
    app.ask('Here\'s a beautifully simple recipe that\'s full ');
  } else {
    app.ask('You selected an unknown item from the list or carousel');
  }
}
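
As the number of options grows, the if/else chain above can be replaced with a lookup table. The following is a minimal sketch of that design choice; `OPTION_REPLIES` and `replyForOption` are hypothetical names, and the reply strings are copied from the handler above.

```javascript
// Hypothetical lookup table keyed by the option keys used in the samples.
const OPTION_REPLIES = {
  MATH_AND_PRIME: '42 is an abundant number because the sum of its…',
  EGYPT: '42 gods who ruled on the fate of the dead in the ',
  RECIPES: 'Here\'s a beautifully simple recipe that\'s full '
};

function replyForOption (key) {
  if (!key) {
    return 'You did not select any item from the list or carousel';
  }
  // hasOwnProperty guards against inherited keys such as "toString".
  if (Object.prototype.hasOwnProperty.call(OPTION_REPLIES, key)) {
    return OPTION_REPLIES[key];
  }
  return 'You selected an unknown item from the list or carousel';
}
```

Inside the handler, the whole chain could then shrink to `app.ask(replyForOption(param))`.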

Carousel Selector

The carousel scrolls horizontally and allows the user to select one item. Compared to the list selector, it has large tiles, allowing for richer content. The tiles that make up a carousel are similar to the basic card with an image. As with the list selector, selecting an item from the carousel generates a chat bubble as the response.

Although carousels are visually compelling, their utility in a multimodal interface is limited because they are hard to interact with by voice (for that, we favor lists). See the Guidance section for carousels to learn more.

Requirements
  • Supported on actions.capability.SCREEN_OUTPUT surfaces
  • Carousel
    • Max # tiles: 10
    • Min # tiles: 2
    • Plain text; Markdown is not supported.
  • Carousel tile
    • Image (optional)
      • Image is forced to be 128 dp tall x 232 dp wide
      • If the image aspect ratio doesn't match the image bounding box, then the image is centered with bars on either side
      • If an image link is broken then a placeholder image is used instead
    • Title (required)
      • Same as the Basic Text Card
      • Titles must be unique (to support voice selection)
    • Body Text (optional)
      • Same formatting options as the Basic Text Card
      • Max 4 lines
  • Interaction
    • Swipe left/right: Slide the carousel to reveal different cards.
    • Tap card: Tapping an item simply generates a chat bubble with the same text as the element title.
    • Voice/Keyboard: Replying with the card title (if specified) functions the same as selecting that item.
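
The tile count and title rules above can be checked before the carousel is built. The following is a minimal sketch, assuming tiles in the JSON shape used by the sample (`title`); `checkCarousel` is a hypothetical helper, and comparing titles case-insensitively is an assumption made here because voice selection does not carry capitalization.

```javascript
// Hypothetical pre-flight check for carousel tiles.
function checkCarousel (tiles) {
  const problems = [];
  if (tiles.length < 2 || tiles.length > 10) {
    problems.push('A carousel needs between 2 and 10 tiles.');
  }
  const seen = new Set();
  for (const tile of tiles) {
    // Assumption: titles that differ only by case or spacing count as duplicates.
    const key = String(tile.title || '').trim().toLowerCase();
    if (!key) {
      problems.push('Every tile needs a title.');
    } else if (seen.has(key)) {
      problems.push(`Duplicate title: ${tile.title}`);
    }
    seen.add(key);
  }
  return problems;
}
```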
Guidance

Carousels work well when various options are presented to the user but a direct comparison among them is not required (unlike lists). In general, lists are preferred over carousels because lists are easier to scan visually and to interact with by voice.

We recommend adding suggestion chips below a carousel to enable the user to pivot or expand the conversation. Never repeat the options presented in the carousel as suggestion chips. Chips in this context are used to pivot the conversation (not for choice selection).

As with lists, the chat bubble that accompanies the carousel card is a subset of the audio (TTS/SSML). The audio (TTS/SSML) here integrates the first tile in the carousel, and we strongly discourage reading all the elements from the carousel. It's best to mention the first item and the reason why it's there (for example, the most popular, the most recently purchased, or the most talked about).

Sample code

Node.js
function carousel (app) {
  app.askWithCarousel(app.buildRichResponse()
    .addSimpleResponse('Alright! Here are a few things you can learn. Which sounds interesting?')
    .addSuggestions(
      ['Basic Card', 'List', 'Carousel', 'Suggestions']),
    // Build a carousel
    app.buildCarousel()
    // Add the first item to the carousel
    .addItems(app.buildOptionItem('MATH_AND_PRIME',
      ['math', 'math and prime', 'prime numbers', 'prime'])
      .setTitle('Math & prime numbers')
      .setDescription('42 is an abundant number because the sum of its ' +
        'proper divisors 54 is greater…')
      .setImage('http://example.com/math_and_prime.jpg', 'Math & prime numbers'))
    // Add the second item to the carousel
    .addItems(app.buildOptionItem('EGYPT',
      ['religion', 'egypt', 'ancient egyptian'])
      .setTitle('Ancient Egyptian religion')
      .setDescription('42 gods who ruled on the fate of the dead in the ' +
        'afterworld. Throughout the under…')
      .setImage('http://example.com/egypt', 'Egypt')
    )
    // Add third item to the carousel
    .addItems(app.buildOptionItem('RECIPES',
      ['recipes', 'recipe', '42 recipes'])
      .setTitle('42 recipes with 42 ingredients')
      .setDescription('Here\'s a beautifully simple recipe that\'s full ' +
        'of flavor! All you need is some ginger and…')
      .setImage('http://example.com/recipe', 'Recipe')
    )
  );
}
JSON
{
    "conversationToken": "{\"state\":null,\"data\":{}}",
    "expectUserResponse": true,
    "expectedInputs": [
        {
            "inputPrompt": {
                "richInitialPrompt": {
                    "items": [
                        {
                            "simpleResponse": {
                                "textToSpeech": "Alright! Here are a few things you can learn. Which sounds interesting?"
                            }
                        }
                    ],
                    "suggestions": [
                        {
                            "title": "Basic Card"
                        },
                        {
                            "title": "List"
                        },
                        {
                            "title": "Carousel"
                        },
                        {
                            "title": "Suggestions"
                        }
                    ]
                }
            },
            "possibleIntents": [
                {
                    "intent": "actions.intent.OPTION",
                    "inputValueData": {
                        "@type": "type.googleapis.com/google.actions.v2.OptionValueSpec",
                        "carouselSelect": {
                            "items": [
                                {
                                    "optionInfo": {
                                        "key": "MATH_AND_PRIME",
                                        "synonyms": [
                                            "math",
                                            "math and prime",
                                            "prime numbers",
                                            "prime"
                                        ]
                                    },
                                    "title": "Math & prime numbers",
                                    "description": "42 is an abundant number because the sum of its proper divisors 54 is greater…",
                                    "image": {
                                        "url": "http://example.com/math_and_prime.jpg",
                                        "accessibilityText": "Math & prime numbers"
                                    }
                                },
                                {
                                    "optionInfo": {
                                        "key": "EGYPT",
                                        "synonyms": [
                                            "religion",
                                            "egypt",
                                        ]
                                    },
                                    "title": "Ancient Egyptian religion",
                                    "description": "42 gods who ruled on the fate of the dead in the afterworld. Throughout the under…",
                                    "image": {
                                        "url": "http://example.com/egypt",
                                        "accessibilityText": "Egypt"
                                    }
                                },
                                {
                                    "optionInfo": {
                                        "key": "RECIPES",
                                        "synonyms": [
                                            "recipes",
                                            "recipe",
                                            "42 recipes"
                                        ]
                                    },
                                    "title": "42 recipes with 42 ingredients",
                                    "description": "Here's a beautifully simple recipe that's full of flavor! All you need is some ginger and…",
                                    "image": {
                                        "url": "http://example.com/recipe",
                                        "accessibilityText": "Recipe"
                                    }
                                }
                            ]
                        }
                    }
                }
            ]
        }
    ]
}

Handling a selected item

When users select an item, the selected item value is passed to you as an argument. You can use the client library to read the value by calling app.getContextArgument(). In the returned value, you will get the key identifier for the selected item:

function itemSelected (app) {
  // Get the user's selection
  const param = app.getContextArgument('actions_intent_option',
    'OPTION').value;

  // Compare the user's selections to each of the item's keys
  if (!param) {
    app.ask('You did not select any item from the list or carousel');
  } else if (param === 'MATH_AND_PRIME') {
    app.ask('42 is an abundant number because the sum of its…');
  } else if (param === 'EGYPT') {
    app.ask('42 gods who ruled on the fate of the dead in the ');
  } else if (param === 'RECIPES') {
    app.ask('Here\'s a beautifully simple recipe that\'s full ');
  } else {
    app.ask('You selected an unknown item from the list or carousel');
  }
}

Suggestion Chip

Requirements
  • Supported on actions.capability.SCREEN_OUTPUT surfaces
  • Max number of chips: 8
  • Max text length: 25 characters
  • Supports only plain text
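
The chip limits above can be enforced while building the `suggestions` array that appears in the JSON samples. The following is a minimal sketch; `toSuggestions` is a hypothetical helper, not a client library function.

```javascript
// Hypothetical helper: turns chip titles into the JSON `suggestions` shape.
function toSuggestions (titles) {
  if (titles.length > 8) {
    throw new RangeError('At most 8 suggestion chips are allowed.');
  }
  return titles.map(title => {
    const text = String(title);
    if (text.length > 25) {
      throw new RangeError(`Chip text exceeds 25 characters: ${text}`);
    }
    return { title: text };
  });
}
```

Throwing early makes an oversized chip visible during development instead of on the user's device.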
Guidance

Use suggestion chips to hint at responses to continue or pivot the conversation. If during the conversation there is a primary call to action, consider listing that as the first suggestion chip.

Whenever possible, you should incorporate one key suggestion as part of the chat bubble, but do so only if the response or chat conversation feels natural.

Sample code

Node.js
function suggestionChips (app) {
  app.ask(app.buildRichResponse()
    .addSimpleResponse({speech: 'Howdy! I can tell you fun facts about ' +
        'almost any number like 0, 42, or 100. What number do you have ' +
        'in mind?',
      displayText: 'Howdy! I can tell you fun facts about almost any ' +
        'number. What number do you have in mind?'})
    .addSuggestions(
      ['0', '42', '100', 'Never mind'])
    .addSuggestionLink('Suggestion Link', 'https://assistant.google.com/')
  );
}
JSON
{
    "conversationToken": "{\"state\":null,\"data\":{}}",
    "expectUserResponse": true,
    "expectedInputs": [
        {
            "inputPrompt": {
                "richInitialPrompt": {
                    "items": [
                        {
                            "simpleResponse": {
                                "textToSpeech": "Howdy! I can tell you fun facts about almost any number like 0, 42, or 100. What number do you have in mind?",
                                "displayText": "Howdy! I can tell you fun facts about almost any number. What number do you have in mind?"
                            }
                        }
                    ],
                    "suggestions": [
                        {
                            "title": "0"
                        },
                        {
                            "title": "42"
                        },
                        {
                            "title": "100"
                        },
                        {
                            "title": "Never mind"
                        }
                    ],
                    "linkOutSuggestion": {
                        "destinationName": "Suggestion Link",
                        "url": "https://assistant.google.com/"
                    }
                }
            },
            "possibleIntents": [
                {
                    "intent": "actions.intent.TEXT"
                }
            ]
        }
    ]
}

UI Checklist

The following checklist highlights common things you can do to make sure your responses appear appropriately on the surfaces where users experience your actions.

Cards and Options
Use cards and options

Cards and options let you display information in a richer, more customizable format.

  • Basic card - If you need to present a lot of text to the user, use a basic card. A card can display up to 15 lines of text, and link to a website for further reading. Unlike chat bubbles, the card supports text formatting. You can also add an image and a list or carousel to display options.
  • List - If you are asking the user to pick from a list of choices, consider using a list instead of writing out the list in a chat bubble.
  • Carousel - If you want the user to pick from a list of choices with a focus on larger images, use a carousel, which has a limit of 8 items.

Suggestion Chips
Use them after most turns

The best thing you can do to increase your app's usability on devices with screens is to add chips, so the user can quickly tap to respond in addition to using voice or the keyboard. For example, any yes/no question should have suggestion chips for **Yes** and **No**.

When there are a few choices...

When offering the user a small number of choices (8 or fewer), you can add a suggestion chip for each choice. Present the chips in the same order as in your TTS, and use the same terminology.

When there are many choices...

If you ask a question with a wide range of possible answers, present a few of the most popular answers.
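
The two cases above can be combined in one helper. The following is a minimal sketch, assuming each choice carries a hypothetical `popularity` score that your own app would have to supply; `pickChipTitles` is likewise a hypothetical name.

```javascript
// Hypothetical helper: choose which choices become suggestion chips.
function pickChipTitles (choices, maxChips = 8) {
  // Few choices: keep every one, in the same order as the TTS.
  if (choices.length <= maxChips) {
    return choices.map(choice => choice.title);
  }
  // Many choices: keep only the most popular ones.
  return choices
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.popularity - a.popularity)
    .slice(0, maxChips)
    .map(choice => choice.title);
}
```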

Chat Bubbles
Correct capitalization and punctuation

Now that your TTS strings can show up as chat bubbles, check them for correct capitalization and punctuation.

Fix phonetic spellings

If you spelled something out phonetically in your TTS to help with a pronunciation issue, then that phonetic misspelling will appear in your chat bubble. Use different display text to use correct spelling for chat bubbles on devices with screens.

Avoid truncation

Chat bubbles are limited to 640 characters and are truncated after that limit (however, we recommend around 300 characters as a general design guideline). If you have more than that, you can:

  • Use a 2nd chat bubble - Up to 2 chat bubbles are allowed per turn, so find a natural break point and create a second chat bubble.
  • Don't show everything - If you are presenting long TTS content, consider showing only a subset of the TTS content in the chat bubble, such as just an introduction. You can use shorter display text than TTS text in this case.
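
Finding a natural break point for a second chat bubble can be automated in simple cases. The following is a minimal sketch, assuming sentences end with a period, exclamation mark, or question mark followed by a space; `splitIntoBubbles` is a hypothetical helper, and the second bubble it returns may still exceed the limit.

```javascript
// Hypothetical helper: split long display text into at most two bubbles.
function splitIntoBubbles (text, limit = 640) {
  if (text.length <= limit) {
    return [text];
  }
  const head = text.slice(0, limit);
  // Prefer the last sentence end inside the limit.
  const sentenceEnd = Math.max(
    head.lastIndexOf('. '),
    head.lastIndexOf('! '),
    head.lastIndexOf('? ')
  );
  // Otherwise fall back to the last space, then to a hard cut.
  let cut = sentenceEnd >= 0 ? sentenceEnd + 1 : head.lastIndexOf(' ');
  if (cut <= 0) {
    cut = limit;
  }
  return [text.slice(0, cut).trim(), text.slice(cut).trim()];
}
```

Each returned string can be used as the display text of one simple response; if the second one is still too long, show only a subset of the content.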

Recorded Audio
Remove <audio> text from chat bubbles

If you have text inside your SSML <audio> tag, it's displayed in your corresponding chat bubble. For example, if your SSML is:

<speak>
  Here's that song.
  <audio src="...">song audio</audio>
</speak>

your chat bubble text appears as "Here's that song. song audio".

Instead, add a <desc> element inside your <audio> element. Any text inside <desc> is displayed, and any text outside <desc> (but still inside <audio>) is used as the alternate text if the audio source file cannot be loaded. For example:

<speak>
  Here's that song.
  <audio src="bad_url"><desc></desc>song audio</audio>
</speak>

results in the audio output "Here's that song. song audio" and the chat bubble text "Here's that song."

Alternatively, you can just remove the text from your <audio> tag altogether, or use the SSML <sub> tag.
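
Another option is to derive explicit display text from the SSML yourself and send it alongside the speech. The following is a minimal sketch using regular expressions rather than a full XML parser; `displayTextFromSsml` is a hypothetical helper, and it does not decode XML entities such as `&amp;`.

```javascript
// Hypothetical helper: build chat bubble text from an SSML string.
function displayTextFromSsml (ssml) {
  return ssml
    // Drop self-closing <audio/> elements first.
    .replace(/<audio\b[^>]*\/>/g, '')
    // Drop <audio>…</audio> elements together with their fallback text.
    .replace(/<audio\b[^>]*>[\s\S]*?<\/audio>/g, '')
    // Strip every remaining tag but keep its text content.
    .replace(/<[^>]+>/g, '')
    // Collapse runs of whitespace left behind by the removals.
    .replace(/\s+/g, ' ')
    .trim();
}
```

The result can be supplied as the display text of a simple response, with the original SSML as the speech.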

Eliminate empty chat bubbles

Every dialog turn is required to have at least one chat bubble. If your action has dialogs that are composed of only streaming audio (no TTS) then the chat bubble text will be missing and your app will fail. In these cases, add display text that matches the words in your recorded audio, or the introduction.