API.AI Webhook Format

Most of your agent's "speech responses" will be generated by business logic in a webhook. For agents integrating with Actions on Google, we extend the API.AI webhook protocol.

Please refer to the API.AI documentation for the generic (default) webhook protocol.

Format of request to the webhook

In each request to the webhook, the Google Assistant provides data about the user (such as user ID and location) and the conversational context.

The following example POST request highlights the Google extensions to the API.AI request format:

{
  "id": "209eefa7-adb5-4d03-a8b9-9f7ae68a0c11",
  "timestamp": "2016-10-10T07:41:40.098Z",
  "result": {
    "source": "agent",
    "resolvedQuery": "Hi, my name is Sam!",
    "action": "greetings",
    "actionIncomplete": false,
    "parameters": {
      "user_name": "Sam"
    },
    "contexts": [
      {
        "name": "greetings",
        "parameters": {
          "user_name": "Sam",
          "user_name.original": "Sam!"
        },
        "lifespan": 5
      }
    ],
    "metadata": {
      "intentId": "373a354b-c15a-4a60-ac9d-a9f2aee76cb4",
      "webhookUsed": "true",
      "intentName": "greetings"
    },
    "fulfillment": {
      "speech": "Nice to meet you, Sam!"
    },
    "score": 1
  },
  "originalRequest": {
    "data": {
       "user": {
           "user_id": "...",
           "profile": {
               "display_name": "Sam",
               "given_name": "Sam",
               "family_name": "Johnson"
           },
           "access_token": "..."
       },
       "device": {
           "location": {
               "coordinates": {
                   "latitude": 123.456,
                   "longitude": -123.456
               },
               "formatted_address": "1234 Random Road, Anytown, CA 12345, United States",
               "zip_code": "12345",
               "city": "Anytown"
           }
       }
       ...
    }
  },
  "status": {
    "code": 200,
    "errorType": "success"
  },
  "sessionId": "37151f7c-a409-48b8-9890-cd980cd2548e"
}

Note that the Content-Type of this request is "application/json".

The originalRequest.data object in the request body JSON matches the format of the Action Conversation Protocol HTTP Request.
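As a rough sketch of how a webhook might read these extensions, the following Python function pulls the user and device data out of a request body. The function name `extract_google_data` and the shape of the returned dictionary are illustrative assumptions, not part of the protocol; only the key names are taken from the example request above.

```python
import json


def extract_google_data(body: str) -> dict:
    """Return selected Google-specific fields from a webhook request body.

    Hypothetical helper: key names follow the example request in this
    document. Missing keys yield None instead of raising.
    """
    request = json.loads(body)
    # In the example request, originalRequest sits at the top level of the body.
    data = request.get("originalRequest", {}).get("data", {})
    user = data.get("user", {})
    location = data.get("device", {}).get("location", {})
    return {
        "action": request.get("result", {}).get("action"),
        "user_id": user.get("user_id"),
        "display_name": user.get("profile", {}).get("display_name"),
        "coordinates": location.get("coordinates"),
        "city": location.get("city"),
    }
```

Using `.get()` with empty-dictionary defaults keeps the sketch tolerant of requests where the user has not granted access to profile or location data, in which case the corresponding values come back as None.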

Format of response from the webhook

You can provide instructions to the Assistant in each response to leverage certain capabilities like SSML and microphone control.

Note that the value of the speech property can contain only ASCII characters.

{
  "speech": "...",  // ASCII characters only
  "displayText": "...",
  "data": {
    "google": {
      "expect_user_response": true,
      "is_ssml": true,
      "permissions_request": {
        "opt_context": "...",
        "permissions": [
          "NAME",
          "DEVICE_COARSE_LOCATION",
          "DEVICE_PRECISE_LOCATION"
        ]
      }
    }
  },
  "contextOut": [...]
}
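A response of this shape could be assembled along the following lines. This is a minimal sketch: the function name `build_response` and its parameters are hypothetical, and the ASCII check simply enforces the restriction on the speech property noted above.

```python
import json


def build_response(speech: str, display_text: str,
                   expect_user_response: bool = True,
                   is_ssml: bool = False) -> str:
    """Serialize a webhook response body (hypothetical helper).

    Raises ValueError if speech contains non-ASCII characters.
    """
    if not speech.isascii():
        raise ValueError("speech must contain only ASCII characters")
    response = {
        "speech": speech,
        "displayText": display_text,
        "data": {
            "google": {
                "expect_user_response": expect_user_response,
                "is_ssml": is_ssml,
            }
        },
    }
    return json.dumps(response)


if __name__ == "__main__":
    # End the conversation after a plain-text farewell.
    print(build_response("Goodbye, Sam!", "Goodbye, Sam!",
                         expect_user_response=False))
```

Rejecting non-ASCII speech before serializing is one possible design; a webhook could instead transliterate or strip such characters, depending on the agent's needs.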

The data.google object in the response body JSON should contain the following fields:

Field: expect_user_response
Type: Boolean
Description: If false, ends the session with the user on Google Home and closes the microphone.

Field: is_ssml
Type: Boolean
Description: Indicates whether the speech field in the webhook response contains SSML.

Field: permissions_request
Type:
{
 "opt_context": String,
 "permissions": [ "NAME", "DEVICE_PRECISE_LOCATION", "DEVICE_COARSE_LOCATION" ]
}
Description: Specify this field to request the user's permission to access profile and device information. The `opt_context` string provides the TTS that explains why the agent needs the permission.

Field: no_input_prompts
Type:
[
 {
  "text_to_speech": "I might've missed it if you said something."
 },
 {
  "ssml": "<audio src='https://example.com/sound.wav'>sound</audio>"
 }
]
Description: Specify this field to customize the re-prompt that the Assistant speaks if the user does not reply. The format matches the no_input_prompts in the Conversation Protocol.
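To illustrate how the optional fields fit together, here is a hedged sketch that builds a data object containing a permissions request and custom re-prompts. The helper name `build_google_data` is hypothetical, the permission list contains only the values named in this document (the protocol may define others), and setting expect_user_response to true while asking for permission is an assumption of this sketch.

```python
import json

# Permission names taken from this document; the protocol may define others.
KNOWN_PERMISSIONS = ("NAME", "DEVICE_COARSE_LOCATION", "DEVICE_PRECISE_LOCATION")


def build_google_data(opt_context, permissions, no_input_texts=()):
    """Build the "data" object of a webhook response (hypothetical helper)."""
    unknown = [p for p in permissions if p not in KNOWN_PERMISSIONS]
    if unknown:
        raise ValueError(f"unknown permissions: {unknown}")
    google = {
        # Assumption: keep the microphone open so the user can answer.
        "expect_user_response": True,
        "is_ssml": False,
        "permissions_request": {
            "opt_context": opt_context,
            "permissions": list(permissions),
        },
    }
    if no_input_texts:
        # Each re-prompt here is plain text; an "ssml" key could be used instead.
        google["no_input_prompts"] = [
            {"text_to_speech": text} for text in no_input_texts
        ]
    return {"google": google}


if __name__ == "__main__":
    data = build_google_data(
        "To find stores near you",
        ["DEVICE_COARSE_LOCATION"],
        no_input_texts=["I might've missed it if you said something."],
    )
    print(json.dumps(data, indent=2))
```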