Conversational Actions

Conversational Actions extend the functionality of the Google Assistant by allowing developers to create custom experiences, or conversations, for users on the Assistant. In a conversation, your Conversational Action handles requests from the Assistant and returns responses with audio and visual components. Conversational Actions can also connect to external services for added conversational or business logic before returning a response.

For example, users can invoke your Conversational Action to get a response from your external fulfillment service when they want to look up information, get a personalized recommendation, or perform transactions involving digital payments.

In a back-and-forth conversation with the Google Assistant, a user asks about and receives an answer for when a conference session is happening.
Figure 1. An example of a Conversational Action

Use cases

Conversational Actions work best for simple use cases that complement another experience. Good Conversational Actions often fall into these general categories:

  • Things people can easily answer. Actions that can be accomplished with familiar input, such as times or dates (for example, booking a flight).
  • Quick, but compellingly useful Actions. These usually give users immediate benefit for very little time spent, like finding out when their favorite sports team plays next.
  • Actions that are inherently better suited for voice. These are typically things you want to do hands-free, like receiving coaching during yoga or light exercise.

How Conversational Actions work

Unlike traditional mobile and desktop apps, which use computer-centric paradigms, Actions for the Assistant interact with users through natural-sounding, back-and-forth conversation. A conversation begins when a user invokes your Conversational Action and continues until the user chooses to exit (using predetermined phrases) or your Conversational Action signals the end of the conversation.

During a conversation, the Assistant converts the user's speech to text and packages it into JSON requests for natural language processing. These requests are sent to what's known as your conversation fulfillment.

Your conversation fulfillment parses the user's query into structured data, processes that data, and returns a webhook JSON response to the Assistant. The Assistant then processes and presents your response to the user.

Conversation fulfillment can be represented with JSON request input and webhook JSON response output.
Figure 2. Conversation fulfillment is a JSON in-JSON out system
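
As a rough illustration of the JSON-in, JSON-out idea, the following Python sketch parses a request, derives structured data, and returns a response. The field names (query, speech, expectUserResponse), the keyword check, and the hard-coded session time are simplifications assumed for this example; they are not the actual request or response format, and real fulfillment would rely on a natural language processing service instead of string matching.

```python
import json


def fulfill(request_body: str) -> str:
    """Turn one JSON request into one JSON response (hypothetical, simplified shape)."""
    request = json.loads(request_body)

    # Hypothetical field: the text transcribed from the user's speech.
    query = request.get("query", "").strip().lower()

    # Step 1: parse the query into structured data (a keyword check stands in for NLU).
    if "keynote" in query:
        structured = {"topic": "session_time", "session": "keynote"}
    else:
        structured = {"topic": "unknown"}

    # Step 2: process the structured data (placeholder for business logic).
    if structured["topic"] == "session_time":
        speech = "The keynote starts at 10 AM."  # sample data, not a real schedule
        keep_open = False
    else:
        speech = "Sorry, which session do you mean?"
        keep_open = True

    # Step 3: return a JSON response for the Assistant to present.
    return json.dumps({"speech": speech, "expectUserResponse": keep_open})
```

Keeping the handler a pure function of its input makes it easy to test without deploying a web server; an HTTP framework would only need to pass the request body in and send the returned string back.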

Building your own natural language processing service can be challenging, so we provide Dialogflow as a way to handle it for you. For developers who cannot use Dialogflow, we also provide the Actions SDK as a backup option with a separate, but related, development path.

Once you set up an agent in Dialogflow, your conversation fulfillment is augmented by Dialogflow's features, including the ability to use Dialogflow fulfillment. This approach allows you to isolate conversation fulfillment from other services you may need to provide users with their desired outcome.

Actions on Google parses a user utterance and sends a request to Dialogflow. Dialogflow matches the intent and extracts parameters to send to its corresponding Dialogflow fulfillment. The fulfillment then sends a response back to Actions on Google, which renders the response on an Assistant surface.
Figure 3. Conversation fulfillment when using Dialogflow

Building a Conversational Action

Building a Conversational Action mostly consists of designing the conversation and building your conversation fulfillment. Think of the conversation as the user interface for your Conversational Action. You need to think about how users invoke your Actions project, the valid things that they can say in a conversation, and how your Actions project responds to them.

In your Actions project, you provide metadata for publishing the project and specify a method of conversation fulfillment. Developers using Dialogflow associate their Dialogflow agent with the project, then build fulfillment through Dialogflow. For developers using the Actions SDK, building conversation fulfillment involves coding and deploying in the Conversation Webhook format.

When designing your conversation, we recommend using our processes and design principles. Conversational interfaces are still a relatively new technology, and learning about best practices can save you time in the future.

Fulfillment using Dialogflow

When you integrate with a Dialogflow agent, the agent handles natural language understanding (NLU) for user queries in your Conversational Action. Your Dialogflow agent does the following for you:

  1. Parses each incoming request from the Assistant based on training phrases you provide and conversational context.
  2. Matches each request to a Dialogflow intent (also known as an event).
  3. Extracts parameters into Dialogflow entities.

Your Dialogflow agent can then call on its own fulfillment (deployed as a webhook) to carry out some logic like calling a REST API or other backend service that generates a response to return to the Assistant. This webhook is also known as your Dialogflow fulfillment.

Dialogflow accepts a user utterance for intent matching and provides extracted parameters to Dialogflow fulfillment. The fulfillment returns a response to the user.
Figure 4. A Dialogflow agent parses a user query into structured data for Dialogflow fulfillment
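
The following Python sketch shows roughly what a Dialogflow fulfillment webhook does with the structured data it receives. It assumes the Dialogflow v2 webhook format, in which the matched intent arrives in queryResult.intent.displayName, the extracted parameters in queryResult.parameters, and the reply is returned in fulfillmentText. The intent name session.time, the session parameter, and the SCHEDULE table are hypothetical and exist only for this example. In practice, the client libraries wrap this parsing for you.

```python
# Hypothetical lookup table standing in for a REST API or other backend service.
SCHEDULE = {"keynote": "10:00"}


def handle_dialogflow_request(body: dict) -> dict:
    """Build a webhook response from an already-parsed Dialogflow webhook request."""
    query_result = body.get("queryResult", {})
    intent_name = query_result.get("intent", {}).get("displayName", "")
    parameters = query_result.get("parameters", {})

    if intent_name == "session.time":  # hypothetical intent name
        session = parameters.get("session", "")  # hypothetical parameter name
        start = SCHEDULE.get(session)
        if start:
            text = f"The {session} starts at {start}."
        else:
            text = "Which session would you like to know about?"
    else:
        text = "Sorry, I can't help with that yet."

    return {"fulfillmentText": text}
```

Because Dialogflow has already matched the intent and extracted the parameters, the webhook contains no language parsing at all; it only maps structured input to a response.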

Building conversation fulfillment when using Dialogflow primarily consists of developing your Dialogflow fulfillment webhook. In the Actions on Google documentation, you'll find resources to help you design, build, and test your Dialogflow fulfillment webhook. Most notably, those resources include the Node.js client library and the Java client library.

As you build with Dialogflow, you'll use the Dialogflow Console to create Dialogflow intents, entities, and training phrases.

For more general information about Dialogflow, you can read about the Actions on Google integration in the Dialogflow documentation.

Fulfillment using Actions SDK

Building conversation fulfillment with the Actions SDK primarily consists of creating and deploying your Action package. Action packages are created in the ActionPackage format and use the Actions on Google Conversation HTTP/JSON Webhook API. An Action package contains all Actions for a given Actions project.
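
To give a sense of what an Action package contains, the Python sketch below builds a minimal one as a dictionary and checks that every Action points at a declared conversation. The overall layout (actions, conversations, locale) follows the ActionPackage format as commonly shown in starter templates, but treat it as an assumption and confirm it against the reference. The conversation name, the URL, and the undeclared_conversations helper are hypothetical.

```python
import json

CONVERSATION_NAME = "conference_helper"  # hypothetical conversation name

action_package = {
    "actions": [
        {
            "name": "MAIN",
            "description": "Default welcome intent",
            "intent": {"name": "actions.intent.MAIN"},
            # Links this Action to an entry in "conversations" below.
            "fulfillment": {"conversationName": CONVERSATION_NAME},
        }
    ],
    "conversations": {
        CONVERSATION_NAME: {
            "name": CONVERSATION_NAME,
            "url": "https://example.com/fulfillment",  # placeholder webhook URL
        }
    },
    "locale": "en",
}


def undeclared_conversations(package: dict) -> list:
    """Return conversation names that Actions reference but the package never declares."""
    declared = set(package.get("conversations", {}))
    referenced = [
        action.get("fulfillment", {}).get("conversationName")
        for action in package.get("actions", [])
    ]
    return [name for name in referenced if name not in declared]


if __name__ == "__main__":
    # Serialize the package as the JSON file you would deploy.
    print(json.dumps(action_package, indent=2))
```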

The Assistant provides user queries to your conversation fulfillment using Actions on Google intents. For each intent, your fulfillment webhook must parse the intent, process it, and return a JSON response to the Assistant for the user.
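
The parse, process, and respond cycle described above can be sketched in Python as follows. The sketch assumes the version 2 Conversation Webhook format, where the intent and raw query arrive under inputs, and the response either asks for more input (expectUserResponse with expectedInputs) or ends the conversation (finalResponse). The prompts, the rule that "no" ends the conversation, and the ask and tell helper names are hypothetical.

```python
def _simple(speech: str) -> dict:
    """Wrap speech text in a rich response containing one simple response."""
    return {"items": [{"simpleResponse": {"textToSpeech": speech}}]}


def ask(speech: str) -> dict:
    """Respond and keep the conversation open for more user input."""
    return {
        "expectUserResponse": True,
        "expectedInputs": [
            {
                "inputPrompt": {"richInitialPrompt": _simple(speech)},
                "possibleIntents": [{"intent": "actions.intent.TEXT"}],
            }
        ],
    }


def tell(speech: str) -> dict:
    """Respond and end the conversation."""
    return {
        "expectUserResponse": False,
        "finalResponse": {"richResponse": _simple(speech)},
    }


def handle_conversation_request(body: dict) -> dict:
    """Parse the intent, process it, and return a JSON-serializable response."""
    first_input = (body.get("inputs") or [{}])[0]
    intent = first_input.get("intent", "")
    raw_inputs = first_input.get("rawInputs") or [{}]
    query = raw_inputs[0].get("query", "")

    if intent == "actions.intent.MAIN":
        return ask("Welcome! Which session are you interested in?")
    if intent == "actions.intent.TEXT":
        if query.strip().lower() == "no":  # hypothetical rule for ending the dialog
            return tell("Okay, see you at the conference.")
        return ask(f"You asked about {query}. Anything else?")
    return tell("Sorry, I can't handle that request.")
```

Without Dialogflow in front of it, this handler receives the raw query text, so any understanding of what the user meant has to happen inside your own code.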

Responses

When you build an Action for the Assistant, you design your conversations for a variety of surfaces, such as a voice-centric conversation for voice-activated speakers or a visual conversation on an Assistant surface with a screen. This approach lets users get things done quickly through either voice or visual affordances.

As you build your fulfillment, you can select from a variety of engaging response types for the Assistant to present to users. These range from chat bubbles containing simple text to media responses, carousels, and even HTML using Interactive Canvas.
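
One way to serve both voice-only and visual surfaces is to always include a simple spoken response and add a visual element only when the surface reports a screen. The Python sketch below assumes the Conversation Webhook request lists surface capabilities under surface.capabilities and that a rich response is a list of items beginning with a simple response; the build_rich_response helper and the card contents are hypothetical.

```python
SCREEN_OUTPUT = "actions.capability.SCREEN_OUTPUT"


def has_screen(body: dict) -> bool:
    """Report whether the requesting surface declares screen output."""
    capabilities = body.get("surface", {}).get("capabilities", [])
    return any(cap.get("name") == SCREEN_OUTPUT for cap in capabilities)


def build_rich_response(body: dict, speech: str, card_title: str, card_text: str) -> dict:
    """Return a voice response, plus a basic card when a screen is available."""
    # The spoken response comes first so that voice-only devices still get an answer.
    items = [{"simpleResponse": {"textToSpeech": speech}}]
    if has_screen(body):
        items.append({"basicCard": {"title": card_title, "formattedText": card_text}})
    return {"richResponse": {"items": items}}
```

The same pattern extends to other response types: decide on the spoken content first, then layer visual components on top for surfaces that can show them.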

Next steps

Follow the Build Actions for the Google Assistant (Level 1) codelab for detailed step-by-step instructions to begin building your first Conversational Action.

Then, you can continue on to our guides for building your own conversation fulfillment with Dialogflow or with the Actions SDK. You can also explore these additional resources for building Conversational Actions: