A voice interaction is a special kind of Android activity that is triggered by the user's voice and that also lets the user complete an action by voice. In contrast, normal activity intents are designed to start an action but complete it through touch interaction.
For example, the Android DIAL intent is implemented by a dialer application, but the intent only preloads the phone number into the dialer; the user still has to touch a button to initiate the call.
Designating a voice interaction
Applications designate a voice interaction activity in their Android manifest file through the android.intent.category.VOICE category.
The following example shows how you could specify a voice interaction activity:
<activity android:name="org.example.MyVoiceActivity">
    <intent-filter>
        <action android:name="org.example.MY_ACTION_INTENT" />
        <category android:name="android.intent.category.DEFAULT" />
        <category android:name="android.intent.category.VOICE" />
    </intent-filter>
</activity>
The next example shows how you might specify the corresponding touch interaction activity:
<activity android:name="org.example.MyTouchActivity">
    <intent-filter>
        <action android:name="org.example.MY_ACTION_INTENT" />
        <category android:name="android.intent.category.DEFAULT" />
    </intent-filter>
</activity>
Handling the voice interaction
Once your application receives the intent, it should determine the appropriate interaction. What is appropriate depends on the nature of your app: how much interaction you want to ask of the user, how safe it is to simply complete the action, and so on.
Approval using voice interaction
Often, users need to verify the action that will occur as a result of their request. For example, a user's request to book a taxi may include all the details of the request (time, place, and destination), but additional details (like the cost and arrival time) should be verified with the user.
This example shows how to handle a voice interaction with voice confirmation:
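import android.app.Activity;
import android.app.VoiceInteractor;
import android.os.Bundle;

class MyVoiceActivity extends Activity {
    // Asks the user to confirm; the answer arrives asynchronously.
    class Confirm extends VoiceInteractor.ConfirmationRequest {
        public Confirm(String ttsPrompt, String visualPrompt) {
            // super(...) must be the first statement, so the prompt is built inline.
            super(new VoiceInteractor.Prompt(
                    new String[] {ttsPrompt}, visualPrompt), null);
        }

        @Override
        public void onConfirmationResult(
                boolean confirmed, Bundle result) {
            if (confirmed) {
                doAction();
            }
            finish();
        }
    }

    @Override
    public void onResume() {
        super.onResume();
        if (isVoiceInteraction()) {
            // getConfirmationTts(), getConfirmationDisplayText() and doAction()
            // are app-defined helpers that are not shown here.
            String ttsPrompt = getConfirmationTts();
            String visualPrompt = getConfirmationDisplayText();
            getVoiceInteractor().sendRequest(new Confirm(ttsPrompt, visualPrompt));
        } else {
            finish();
        }
    }
}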
In this example the activity requests confirmation when started as a voice interaction. It first checks that it is a voice activity by calling isVoiceInteraction(). The subclass of VoiceInteractor.ConfirmationRequest handles the asynchronous confirmation from the user (either accepted or rejected) to determine if the action should be completed.
In this case, the user will receive the spoken prompt and be asked to verbally accept the confirmation (e.g. with a yes or no). The activity will receive a callback indicating the result.
Approving without confirmation
In some cases, it is sufficient to know that the intent originated from the user's voice, implying that no further information is needed to confirm the intent. For example, placing a phone call or sending an email may be reasonable to complete without confirmation.
The isVoiceInteractionRoot() call ensures that the intent originated from the user's voice. While isVoiceInteraction() returns true even if another app's voice interaction activity launches your voice activity without any user intent, isVoiceInteractionRoot() returns true only if the activity was started directly by a user's voice action in the Google Search App.
The next example shows how you might handle a voice interaction without confirmation:
import android.app.Activity;

class MyVoiceActivity extends Activity {
    @Override
    public void onResume() {
        super.onResume();
        if (isVoiceInteractionRoot()) {
            // Interaction started by the user's voice
            doAction();
        }
        finish();
    }
}
If the activity is not a voice activity, it should be completed by touch. Voice interactions should always "fall back" to touch communication when appropriate.
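The three cases described so far (confirming by voice, completing without confirmation, and falling back to touch) can be summarized as a single decision. The following is a minimal plain-Java sketch of that decision, not Android API: the class name InteractionPolicy, the Mode enum, and the safeToSkipConfirm flag are hypothetical names introduced for illustration. The two other parameters stand for the values returned by isVoiceInteraction() and isVoiceInteractionRoot().

```java
// Hypothetical model of the decision described above; none of these
// names are Android APIs.
final class InteractionPolicy {

    /** How the activity should proceed with the requested action. */
    enum Mode { CONFIRM_BY_VOICE, COMPLETE_WITHOUT_CONFIRMATION, FALL_BACK_TO_TOUCH }

    /**
     * @param voiceInteraction     value returned by isVoiceInteraction()
     * @param voiceInteractionRoot value returned by isVoiceInteractionRoot()
     * @param safeToSkipConfirm    app-specific judgment (an assumption of this sketch)
     */
    static Mode choose(boolean voiceInteraction,
                       boolean voiceInteractionRoot,
                       boolean safeToSkipConfirm) {
        if (!voiceInteraction) {
            // Not started as a voice interaction: complete by touch.
            return Mode.FALL_BACK_TO_TOUCH;
        }
        if (voiceInteractionRoot && safeToSkipConfirm) {
            // Started directly by the user's voice and safe to complete.
            return Mode.COMPLETE_WITHOUT_CONFIRMATION;
        }
        // Any other voice interaction asks the user to confirm first.
        return Mode.CONFIRM_BY_VOICE;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true, true));
        System.out.println(choose(true, false, true));
        System.out.println(choose(false, false, true));
    }
}
```

In an activity, the result of such a function would select between sending a ConfirmationRequest, calling doAction() directly, or showing the touch interface. Treating a non-root voice interaction as requiring confirmation is one conservative choice; an app may decide differently.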
Finishing the voice interaction
After your app finishes asking the user questions, your app should call finish() to close the Dialog Plate. Otherwise, the empty Dialog Plate continues to be displayed at the bottom of the screen, which takes up screen space and confuses users.
If you'd like to keep the user in your Activity, you can start a new normal Activity and finish the voice interaction activity. Start your Activity with the Intent.FLAG_ACTIVITY_NEW_TASK flag to exit the voice interaction task.
import android.app.Activity;
import android.content.Intent;

class MyVoiceActivity extends Activity {
    @Override
    public void onResume() {
        super.onResume();
        if (isVoiceInteractionRoot()) {
            doAction();
        }
        // Start my main non-voice Activity in a new task
        Intent intent = new Intent(this, MyMainActivity.class);
        intent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
        startActivity(intent);
        // Close the voice interaction activity
        finish();
    }
}
Sending completion status
Communicating the completion status of a voice interaction is a vital part of making individual actions into a valuable assistive experience for the user. For example:
- The result of some actions can be made available to the user for future use. For example, the booking details of a restaurant reservation may be automatically made available as a reminder card in Google Now or from the [my reservations] query.
- Activities can choose to return a context URI (e.g. a content provider URI on the device or a Web URL deeplink) indicating a reference to the action result. This allows apps to recognize when an action is a "follow on" to a previous action (e.g. cancel the alarm that was just set, or share the photo that was just taken).
- Activities should indicate when an action could not be completed. A critical part of building user trust in the system is handling errors or the limitations of voice interaction responsibly. For example, going through a login flow or entering a credit card may not be feasible through voice. This is particularly important when dealing with eyes-free interaction in an automotive environment.
The following example shows how to report completion status of an interaction:
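import android.app.Activity;
import android.app.VoiceInteractor;
import android.os.Bundle;

class MyVoiceActivity extends Activity {
    @Override
    public void onResume() {
        super.onResume();
        if (isVoiceInteraction()) {
            Bundle status = new Bundle();
            VoiceInteractor.Request request = null;
            // doAction(Bundle) is an app-defined helper that fills in the
            // status and returns true on success.
            if (doAction(status)) {
                request = new VoiceInteractor.CompleteVoiceRequest(
                        "Success", status);
            } else {
                request = new VoiceInteractor.AbortVoiceRequest(
                        "Too Complex", status);
            }
            getVoiceInteractor().sendRequest(request);
        }
        finish();
    }
}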
Multi-modal interaction
Users often combine voice and touch interaction when completing an action, for reasons such as speech recognition issues or the need to pick from a long list of options, so it's important to support both interaction modalities.
The voice activity should handle displaying a minimal touch interface when interacting by voice. Some things to keep in mind:
- Only show controls needed to complete the current interaction. The portion of the touch display available to the activity's view may be reduced to show the voice interaction bar. Applications should not show the App bar.
- If the user interacts by touch while a voice interaction is currently pending (e.g. the user confirms by pressing a button rather than through voice), call VoiceInteractor.Request.cancel() to stop the voice interaction.
- Once the user has switched to touch interaction, they should continue with touch interaction.
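The last two points can be sketched as a small piece of bookkeeping. The following is a minimal plain-Java model under stated assumptions, not Android API: MultiModalSession, PendingRequest, track(), onTouchInput(), and mayPromptByVoice() are hypothetical names. In a real activity, the pending request would be a VoiceInteractor.Request and the cancellation would be a call to its cancel() method.

```java
// Hypothetical model of switching from voice to touch; PendingRequest is a
// stand-in for VoiceInteractor.Request and is not an Android type.
final class MultiModalSession {

    /** Minimal stand-in for a voice request that can be cancelled. */
    interface PendingRequest {
        void cancel();
    }

    private PendingRequest pending;  // voice request awaiting an answer, if any
    private boolean touchMode;       // true once the user has interacted by touch

    /** Remembers a voice request that was sent and is awaiting a reply. */
    void track(PendingRequest request) {
        pending = request;
    }

    /** Call from any touch handler, e.g. a confirm button's click listener. */
    void onTouchInput() {
        touchMode = true;
        if (pending != null) {
            pending.cancel();  // stop the pending voice interaction
            pending = null;    // make sure it is cancelled only once
        }
    }

    /** False after the first touch, so the interaction continues by touch. */
    boolean mayPromptByVoice() {
        return !touchMode;
    }
}
```

An activity following this sketch would check mayPromptByVoice() before sending any further voice request, so that a user who has switched to touch is not pulled back into a spoken dialog.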