November 6, 2019 update:
  • There's a new open source Cardboard SDK for iOS and Android NDK that offers a streamlined API, improved device compatibility, and built-in viewer profile QR code scanning. A corresponding Unity package (SDK) is planned for a future release. We recommend that all developers actively building for Google Cardboard migrate (iOS, Android NDK) to the new Cardboard SDK.
October 15, 2019 update:
  • The Daydream View VR headset is no longer available for purchase. However, you can continue to use the existing Google VR SDK to update and distribute your apps to the Google Play Store, and make them available to users in the Daydream app.

Google VR NDK Spatial Audio

Google VR contains a powerful spatial audio rendering engine which is optimized for mobile VR. It allows the user to spatialize sound sources in 3D space, including distance and elevation cues. Specifically, the API is capable of playing back spatial sound in two ways:

  • Sound object rendering: This allows the user to create a virtual sound source in 3D space. These sources, while spatialized, are fed with mono audio data.

  • Ambisonic soundfields: Ambisonic recordings are multi-channel audio files which are spatialized all around the listener in 360 degrees. These can be thought of as recorded or pre-baked soundfields. They can be of great use for background effects which sound perfectly spatial. Examples include rain noise, crowd noise or even the sound of the ocean off to one side.

Using the AudioApi

The main entry point for the audio API is the AudioApi class.

Create an AudioApi instance and initialize it at the start of your app. During initialization, a rendering configuration must be specified:

  • GVR_AUDIO_RENDERING_STEREO_PANNING: Stereo panning of all sound objects. This disables HRTF-based rendering.

  • GVR_AUDIO_RENDERING_BINAURAL_LOW_QUALITY: This renders sound objects over a virtual array of 8 loudspeakers arranged in a cube configuration around the listener’s head. HRTF-based rendering is enabled.

  • GVR_AUDIO_RENDERING_BINAURAL_HIGH_QUALITY: This renders sound objects over a virtual array of 16 loudspeakers arranged in an approximate equidistribution about the listener’s head. HRTF-based rendering is enabled.

For most modern mobile phones, the high-quality mode offers a good balance between performance and audio quality. On Android, initialization also requires a pointer to the JNI environment, the Android application context (note: not the Activity context), and the app's main class loader.

Audio playback on the default audio device can be paused and resumed by calling the Pause() and Resume() methods.

Please note that the Update() method must also be called from the main thread at a regular rate. It is used to execute background operations outside of the audio thread.

// Create AudioApi instance.
std::unique_ptr<gvr::AudioApi> audio_api(new gvr::AudioApi);

void Initialize() {
   // Initialize it and start audio engine.
   if (!audio_api->Init(jni_env, android_context, class_loader,
                        GVR_AUDIO_RENDERING_BINAURAL_HIGH_QUALITY)) {
     // Handle failure. Do not proceed in case of failure (calling other
     // audio_api methods without a successful Init() will crash with an
     // assert failure).
     return;
   }
}

void OnPause() {
  // Pause audio engine.
   audio_api->Pause();
}

void OnResume() {
  // Resume audio engine.
   audio_api->Resume();
}

void DrawFrame() {
  // Regular call to audio_api from the main thread.
  audio_api->Update();
}

Update listener head position and orientation

To ensure that the audio in your application reacts to user head movement, it is important to update the user’s head orientation in the graphics callback using the head orientation matrix. To obtain the head pose in start space call GvrApi::GetHeadPoseInStartSpace and pass its rotation to GvrAudioApi::SetHeadRotation.

void DrawFrame() {
   gvr::HeadPose head_pose = gvr_api_->GetHeadPoseInStartSpace(target_time);
   gvr_audio_api_->SetHeadRotation(head_pose.rotation);
}

Spatialization of sound objects

GVR Audio allows the user to create virtual sound objects which can be placed anywhere in space around the listener. To create a new sound object from a preloaded audio sample, use CreateSoundObject(). It returns a handle that can be used to set properties such as the position and the volume of the sound object via SetSoundObjectPosition() and SetSoundVolume(). The spatialized playback of the sound can be triggered with PlaySound() and stopped with StopSound(). IsSoundPlaying() can be used to check if a sound object is currently active.

Note that the sound object handle destroys itself at the moment the sound playback has stopped. This way, no cleanup of sound object handles is needed.

// Preload sound file, create sound handle and start playback.
static const std::string kSoundFile = "sound.wav";
gvr_audio_api_->PreloadSoundfile(kSoundFile);
AudioSourceId source_id = gvr_audio_api_->CreateSoundObject(kSoundFile);
gvr_audio_api_->SetSoundObjectPosition(source_id, position_x, position_y, position_z);
gvr_audio_api_->PlaySound(source_id, true /* looped playback */);

Rendering of ambisonic soundfields

The GVR Audio System also provides the user with the ability to play back ambisonic soundfields. Ambisonic soundfields are captured or pre-rendered 360 degree recordings. It is best to think of them as equivalent to 360 degree video. While they envelop and surround the listener, they only react to the listener's rotational movement. That is, one cannot walk towards features in the soundfield. Soundfields are ideal for accompanying 360 degree video playback, for introducing background and environmental effects such as rain or crowd noise, or even for pre-baking 3D audio to reduce rendering costs. The GVR Audio System supports full 3D First Order Ambisonic recordings using ACN channel ordering and SN3D normalization.

For more information, please see our Spatial Audio specification on GitHub.

To obtain a soundfield handle, call CreateSoundfield(). It returns a handle that allows the user to begin playback of the soundfield, alter the soundfield's volume, or stop soundfield playback, which also destroys the object.
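The steps above can be sketched as follows. The file name is a placeholder for your own 4-channel first-order Ambisonic asset (ACN/SN3D); the handle is otherwise used with the same playback calls as a sound object:

```cpp
// Preload a first-order Ambisonic recording and obtain a soundfield handle.
static const std::string kAmbisonicFile = "ambience.wav";  // placeholder asset
gvr_audio_api_->PreloadSoundfile(kAmbisonicFile);
AudioSourceId soundfield_id = gvr_audio_api_->CreateSoundfield(kAmbisonicFile);

// Adjust volume and start looped playback.
gvr_audio_api_->SetSoundVolume(soundfield_id, 0.8f);
gvr_audio_api_->PlaySound(soundfield_id, true /* looped playback */);

// Later: stop playback, which also destroys the soundfield object.
gvr_audio_api_->StopSound(soundfield_id);
```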

Sound Files and Preloading

Both mono sound files for use with Sound Objects and multi-channel Ambisonic sound files can be preloaded into memory before playback with the PreloadSoundfile() method, or alternatively streamed during playback. Preloading can be useful to reduce CPU usage, especially if the same audio clip is likely to be played back many times; in this case playback latency is also reduced. Unused sound files can be unloaded with UnloadSoundfile() to free memory.
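A minimal sketch of this lifecycle, assuming a hypothetical clip name that is reused many times:

```cpp
// Preload once (e.g. at scene load) so repeated playback avoids disk I/O
// and decoding cost, and starts with lower latency.
static const std::string kClip = "explosion.wav";  // placeholder asset
gvr_audio_api_->PreloadSoundfile(kClip);

// Sound objects created from the preloaded file play from memory.
AudioSourceId source_id = gvr_audio_api_->CreateSoundObject(kClip);
gvr_audio_api_->PlaySound(source_id, false /* no looping */);

// When the clip is no longer needed, unload it to free memory.
gvr_audio_api_->UnloadSoundfile(kClip);
```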

Sound playback

Start the playback of a sound.

gvr_audio_api_->PlaySound(source_id, true /* looped playback */);

Check if a sound is playing.

bool is_playing = gvr_audio_api_->IsSoundPlaying(source_id);

Pause the playback of a sound.

gvr_audio_api_->PauseSound(source_id);

Resume the playback of a sound.

gvr_audio_api_->ResumeSound(source_id);

Stop the playback of a sound and destroy the corresponding Sound Object or soundfield.

gvr_audio_api_->StopSound(source_id);

Check that a source ID corresponds to a valid source which exists and is in a playable state.

bool is_valid = gvr_audio_api_->IsSourceIdValid(source_id);

Room effects

GVR Audio provides a powerful reverb engine which can be used to create customized room effects by specifying the size of a room and a material for each surface of the room from the enum AudioMaterialName. Each of these surface materials has unique absorption properties which differ with frequency. The room created will be centered around the listener. Note that the Google VR Audio System uses meters as the unit of distance throughout.

EnableRoom() enables or disables room effects with smooth transitions. SetRoomProperties() allows the user to describe the room based on its dimensions and its surface properties. For example, one can expect very large rooms to be more reverberant than smaller rooms and a room with hard surface materials such as brick to be more reverberant than one with soft absorbent materials such as heavy curtains on every surface.

Note that when a sound source is located outside of the room the listener is in, it will sound different to sources located within the room due to attenuation of both the direct sound and the reverb on that source. Sources located far outside the room the listener is in will not be audible to the listener.
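As a sketch of the calls described above, the following configures a small, fairly reverberant room. The SetRoomProperties() argument order (dimensions in meters, then wall, ceiling, and floor materials) and the specific material constants are assumptions here; check gvr_audio.h in your SDK version for the exact enum values:

```cpp
// Enable room effects with a smooth transition.
gvr_audio_api_->EnableRoom(true);

// Describe a 5 m x 3 m x 5 m room centered on the listener, with hard
// brick walls and a concrete ceiling (more reverberant) but a heavy
// curtain on the floor surface (absorbent). Material names are assumed.
gvr_audio_api_->SetRoomProperties(
    5.0f, 3.0f, 5.0f,
    GVR_AUDIO_MATERIAL_BRICK_BARE,             // walls
    GVR_AUDIO_MATERIAL_CONCRETE_BLOCK_COARSE,  // ceiling
    GVR_AUDIO_MATERIAL_CURTAIN_HEAVY);         // floor

// Disable room effects again when the listener leaves the room.
gvr_audio_api_->EnableRoom(false);
```

Swapping the hard materials for absorbent ones (or shrinking the dimensions) audibly reduces the reverberation, matching the behavior described above.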