Follow the instructions in each section below to integrate the Google Assistant into your project.
gRPC bindings
The Google Assistant Service is built on top of gRPC, a high performance, open-source RPC framework. This framework is well-suited for bidirectional audio streaming.
Python
If you're using Python, get started using this guide.
C++
Take a look at our C++ sample on GitHub.
Node.js
Take a look at our Node.js sample on GitHub.
Android Things
Interested in embedded devices? Check out the Assistant SDK sample for Android Things.
Other languages
- Clone the googleapis repository to get the protocol buffer interface definitions for the Google Assistant Service API.
- Follow the gRPC documentation to generate gRPC bindings for your language of choice.
- Follow the steps in the sections below.
Authorize and authenticate your Google account to work with the Assistant
The next step is to authorize your device to talk with the Google Assistant using your Google account.
Obtain OAuth tokens with the Assistant SDK scope
The Assistant SDK uses OAuth 2.0 access tokens to authorize your device to connect with the Assistant.
When prototyping, you can use the authorization tool to generate OAuth 2.0
credentials from the client_secret_<client-id>.json file generated when
registering your device model.
Do the following to generate the credentials:
- Use a Python virtual environment to isolate the authorization tool and its dependencies from the system Python packages.
      sudo apt-get update
      sudo apt-get install python3-dev python3-venv # Use python3.4-venv if the package cannot be found.
      python3 -m venv env
      env/bin/python -m pip install --upgrade pip setuptools wheel
      source env/bin/activate
- Install the authorization tool:
      python -m pip install --upgrade google-auth-oauthlib[tool]
- Run the tool. Remove the --headless flag if you are running this from a terminal on the device (not an SSH session):
      google-oauthlib-tool --client-secrets /path/to/client_secret_client-id.json --scope https://www.googleapis.com/auth/assistant-sdk-prototype --save --headless
When you are ready to integrate the authorization as part of the provisioning mechanism of your device, read our guides for Using OAuth 2.0 to Access Google APIs to understand how to obtain, persist and use OAuth access tokens to allow your device to talk with the Assistant API.
Use the following when working through these guides:
- OAuth scope: https://www.googleapis.com/auth/assistant-sdk-prototype
- Supported OAuth flows:
  - (Recommended) Installed apps
  - Web server applications
Check out the best practices on privacy and security for recommendations on how to secure your device.
Authenticate your gRPC connection with OAuth tokens
Finally, put all the pieces together by reading how to use token-based authentication with Google to authenticate the gRPC connection to the Assistant API.
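As a rough illustration, the sketch below loads the credentials saved by the authorization tool and opens an authorized gRPC channel with the google-auth library. It assumes the google-auth, requests, and grpcio packages are installed. The set of JSON keys it checks and the endpoint name embeddedassistant.googleapis.com are assumptions that do not come from this page, so verify them against your own credentials file and the API definition.

```python
import json

# Assumption: endpoint name taken from the published API definition, not this page.
ASSISTANT_API_ENDPOINT = "embeddedassistant.googleapis.com"

# Assumption: keys expected in the credentials file saved by the authorization tool.
REQUIRED_KEYS = ("refresh_token", "token_uri", "client_id", "client_secret")


def load_credential_fields(path):
    """Read the saved credentials JSON and check it has the fields used below."""
    with open(path, "r") as f:
        fields = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in fields]
    if missing:
        raise ValueError("credentials file is missing: " + ", ".join(missing))
    return fields


def create_authorized_channel(credentials_path):
    """Return a gRPC channel that attaches OAuth access tokens to each call."""
    # Imported here so the helper above stays usable without these packages.
    import google.auth.transport.grpc
    import google.auth.transport.requests
    import google.oauth2.credentials

    fields = load_credential_fields(credentials_path)
    credentials = google.oauth2.credentials.Credentials(
        token=None,
        refresh_token=fields["refresh_token"],
        token_uri=fields["token_uri"],
        client_id=fields["client_id"],
        client_secret=fields["client_secret"],
        scopes=fields.get("scopes"),
    )
    http_request = google.auth.transport.requests.Request()
    # Exchange the refresh token for a short-lived access token.
    credentials.refresh(http_request)
    return google.auth.transport.grpc.secure_authorized_channel(
        credentials, http_request, ASSISTANT_API_ENDPOINT
    )
```

The returned channel can then be passed to the client stub generated from the Google Assistant Service protocol buffer definitions.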
Register your device
Register your device model and instance either manually or with the registration tool (available in Python).
Implement a basic conversation dialog with the Assistant
- Implement a bidirectional streaming gRPC client for the Google Assistant Service API.
- Wait for the user to trigger a new request (e.g., wait for a GPIO interrupt from a button press).
- Send an AssistRequest message with the config field set (see AssistConfig). Make sure the config field contains the following:
  - The audio_in_config field, which specifies how to process the audio_in data that will be provided in subsequent requests (see AudioInConfig).
  - The audio_out_config field, which specifies the desired format for the server to use when it returns audio_out messages (see AudioOutConfig).
  - The device_config field, which identifies the registered device to the Assistant (see DeviceConfig).
  - The dialog_state_in field, which contains the language_code associated with the request (see DialogStateIn).
- Start recording. 
- Send multiple outgoing AssistRequest messages with audio data from the spoken query in the audio_in field.
- Handle incoming AssistResponse messages.
- Extract conversation metadata from the AssistResponse message. For example, from dialog_state_out, get the conversation_state and volume_percentage (see DialogStateOut).
- Stop recording when receiving an AssistResponse with an event_type of END_OF_UTTERANCE.
- Play back audio from the Assistant answer with audio data coming from the audio_out field.
- Take the conversation_state you extracted earlier and copy it into the DialogStateIn message in the AssistConfig for the next AssistRequest.
With this, you should be ready to make your first requests to the Google Assistant through your device.
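The flow of one conversation turn can be sketched as follows. To keep the example runnable without generated bindings, plain Python dicts stand in for the AssistRequest, AssistConfig, and AssistResponse messages; in a real client you would build the generated protocol buffer classes instead. The helper names, the LINEAR16 encoding, the 16000 Hz sample rate, and the en-US language code are illustrative assumptions, not requirements stated on this page.

```python
END_OF_UTTERANCE = "END_OF_UTTERANCE"


def build_config(device_id, device_model_id, conversation_state=b"",
                 language_code="en-US"):
    """Build a dict mirroring the AssistConfig fields listed above."""
    return {
        # Assumption: 16 kHz linear PCM in both directions.
        "audio_in_config": {"encoding": "LINEAR16", "sample_rate_hertz": 16000},
        "audio_out_config": {
            "encoding": "LINEAR16",
            "sample_rate_hertz": 16000,
            "volume_percentage": 100,
        },
        "device_config": {
            "device_id": device_id,
            "device_model_id": device_model_id,
        },
        "dialog_state_in": {
            "language_code": language_code,
            "conversation_state": conversation_state,
        },
    }


def generate_requests(config, audio_chunks):
    """Yield the config-only request first, then one request per audio chunk."""
    yield {"config": config}
    for chunk in audio_chunks:
        yield {"audio_in": chunk}


def handle_responses(responses, stop_recording):
    """Consume responses; return (audio to play, next conversation_state)."""
    audio_out = bytearray()
    conversation_state = b""
    for response in responses:
        if response.get("event_type") == END_OF_UTTERANCE:
            stop_recording()  # The user has finished speaking.
        if "audio_out" in response:
            audio_out.extend(response["audio_out"]["audio_data"])
        dialog_state_out = response.get("dialog_state_out")
        if dialog_state_out and dialog_state_out.get("conversation_state"):
            conversation_state = dialog_state_out["conversation_state"]
    return bytes(audio_out), conversation_state
```

The conversation_state returned by handle_responses is passed back into build_config for the next turn, which is what carries context from one request to the next.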
Extend a conversation dialog with Device Actions
Extend the basic conversation dialog above to trigger the unique hardware capabilities of your particular device:
- In the incoming AssistResponse messages, extract the device_action field (see DeviceAction).
- Parse the JSON payload of the device_request_json field. Refer to the Device Traits page for the list of supported traits. Each trait schema page shows a sample EXECUTE request with the device command(s) and parameters that are returned in the JSON payload.
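A minimal sketch of the parsing step is shown below. The sample payload is hypothetical and modeled on an EXECUTE request for the OnOff trait; the exact shape of the JSON your device receives should be checked against the trait schema pages. The function name extract_commands is illustrative.

```python
import json


def extract_commands(device_request_json):
    """Return (command, params) pairs from EXECUTE inputs in the JSON payload."""
    request = json.loads(device_request_json)
    commands = []
    for device_input in request.get("inputs", []):
        if device_input.get("intent") != "action.devices.EXECUTE":
            continue
        for command in device_input.get("payload", {}).get("commands", []):
            for execution in command.get("execution", []):
                commands.append(
                    (execution["command"], execution.get("params", {}))
                )
    return commands


# Hypothetical payload, modeled on an OnOff trait EXECUTE request.
SAMPLE_DEVICE_REQUEST = """
{
  "requestId": "example-request-id",
  "inputs": [{
    "intent": "action.devices.EXECUTE",
    "payload": {
      "commands": [{
        "devices": [{"id": "example-device-id"}],
        "execution": [{
          "command": "action.devices.commands.OnOff",
          "params": {"on": true}
        }]
      }]
    }
  }]
}
"""

for command, params in extract_commands(SAMPLE_DEVICE_REQUEST):
    print(command, params)
```

Each (command, params) pair can then be dispatched to whatever code drives the corresponding hardware capability.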
Get the transcript of the user request
If you have a display attached to the device, you might want to use it to
show the user request. To get this transcript, parse the speech_results field
in the AssistResponse
messages. When the speech recognition completes, this list will contain one item
with a stability set to 1.0.
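One way to handle these results is sketched below. Simple objects with transcript and stability attributes stand in for the entries of speech_results; the helper names are illustrative, and the final-result check simply encodes the rule described above.

```python
from types import SimpleNamespace


def current_transcript(speech_results):
    """Join the partial transcripts into the text to show on the display."""
    return " ".join(result.transcript for result in speech_results)


def is_final(speech_results):
    """True once recognition has completed: one item with stability 1.0."""
    return len(speech_results) == 1 and speech_results[0].stability == 1.0


# Stand-in data: an interim result, then the completed recognition.
interim = [
    SimpleNamespace(transcript="what time", stability=0.9),
    SimpleNamespace(transcript="is", stability=0.01),
]
final = [SimpleNamespace(transcript="what time is it", stability=1.0)]

print(current_transcript(interim), is_final(interim))
print(current_transcript(final), is_final(final))
```

Updating the display on every response and treating the text as settled only when is_final returns True avoids showing a transcript that later changes.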
Get the text and/or visual rendering of the Assistant's response
If you have a display attached to the device, you might want to use it to
show the Assistant's plain text response to the user's request. This text is located
in the DialogStateOut.supplemental_display_text
field.
The Assistant supports visual responses via HTML5 for certain queries (What
is the weather in Mountain View? or What time is it?). To enable this, set
the screen_out_config field in AssistConfig.
The ScreenOutConfig
message has a screen_mode field, which should be set to PLAYING.
The AssistResponse
messages will then have the screen_out field set. You can extract the HTML5 data (if present) from the
data field.
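As a sketch of how a client might consume the visual response, the example below writes the HTML5 data to a file that a browser or embedded web view could then render. Dicts stand in for the ScreenOutConfig and AssistResponse messages, and the helper names and file handling are assumptions made for illustration.

```python
def build_screen_out_config():
    """Dict stand-in for ScreenOutConfig with screen_mode set to PLAYING."""
    return {"screen_mode": "PLAYING"}


def save_html_response(response, path):
    """Write screen_out data to path if present; return True if written."""
    screen_out = response.get("screen_out")
    if not screen_out or not screen_out.get("data"):
        return False  # Not every query produces a visual response.
    with open(path, "wb") as f:
        f.write(screen_out["data"])
    return True
```

Because only certain queries return HTML5 data, the caller should check the return value and fall back to the plain text response when nothing was written.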
Submitting queries via text input
If you have a text interface (for example, a keyboard) attached to the device,
set the text_query field in the config field (see AssistConfig).
Do not set the audio_in_config field.
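A text request can be sketched as below, again with dicts standing in for the AssistConfig and AssistRequest messages. The helper names, the audio output settings, and the en-US language code are illustrative assumptions. The sketch also assumes that a single config-only request is enough for a text query, since there is no audio to stream afterwards.

```python
def build_text_config(text_query, device_id, device_model_id,
                      conversation_state=b"", language_code="en-US"):
    """Dict stand-in for an AssistConfig that carries a text query."""
    return {
        "text_query": text_query,
        # Note: audio_in_config is deliberately left out for text input.
        "audio_out_config": {
            "encoding": "LINEAR16",
            "sample_rate_hertz": 16000,
            "volume_percentage": 0,
        },
        "device_config": {
            "device_id": device_id,
            "device_model_id": device_model_id,
        },
        "dialog_state_in": {
            "language_code": language_code,
            "conversation_state": conversation_state,
        },
    }


def generate_text_requests(config):
    """Yield the single config-only request used for a text query."""
    yield {"config": config}
```

Responses are handled the same way as for spoken queries, including copying conversation_state into the next request.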
Troubleshooting
See the Troubleshooting page if you run into issues.
