About see-through mode

See-through mode is a feature of the Google VR SDKs for Android and for Unity. It lets you add augmented reality (AR) experiences to your apps.

See-through mode supports scenes that can be characterized as predominantly virtual scenes or predominantly augmented scenes:

  • Predominantly virtual scenes. These are scenes where the background environment is rendered by the app, potentially with some small holes left to show the real world. Examples of this would be a virtual room with windows showing the real world, or an existing VR app with a newly added portal that sees through into the real world.

  • Predominantly augmented scenes. These are scenes where the background environment consists predominantly of unaltered see-through mode and the app is rendering virtual objects intended to appear alongside real objects. Examples of this would be a virtual monitor designed to float above your real desk, or a furniture arranging app that lets you place virtual chairs in your real home.

Spatial and temporal offsets with see-through mode

In predominantly virtual scenes, the goal is to render the image based on the position of the user's eyes at the time the frame is composited onto the screen. When an app calls gvr_get_head_space_from_start_space_transform to get the head pose, it passes a timestamp estimating when the frame rendered with that pose will be composited. The app then uses gvr_get_eye_from_head_matrix to get each eye's position from the head's position.
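
The sketch below shows this pose query using the GVR NDK C API. The 50 ms prediction offset and the MatMul helper are illustrative assumptions, not values or helpers defined by the SDK.

    #include "vr/gvr/capi/include/gvr.h"

    // Multiplies two 4x4 GVR matrices (row-major float m[4][4]).
    static gvr_mat4f MatMul(gvr_mat4f a, gvr_mat4f b) {
      gvr_mat4f r;
      for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
          r.m[i][j] = 0.0f;
          for (int k = 0; k < 4; ++k) r.m[i][j] += a.m[i][k] * b.m[k][j];
        }
      }
      return r;
    }

    // Returns the eye-from-start transform for one eye (GVR_LEFT_EYE or
    // GVR_RIGHT_EYE), predicted for the time the frame is expected to be
    // composited onto the screen.
    static gvr_mat4f GetEyeFromStart(gvr_context* gvr, int32_t eye) {
      // Estimate the composite time: "now" plus an assumed ~50 ms of
      // render-to-composite latency.
      gvr_clock_time_point composite_time = gvr_get_time_point_now();
      composite_time.monotonic_system_time_nanos += 50000000;  // 50 ms in ns.

      // Head pose predicted for the composite time.
      gvr_mat4f head_from_start =
          gvr_get_head_space_from_start_space_transform(gvr, composite_time);

      // Per-eye offset from the head.
      gvr_mat4f eye_from_head = gvr_get_eye_from_head_matrix(gvr, eye);

      // eye_from_start = eye_from_head * head_from_start.
      return MatMul(eye_from_head, head_from_start);
    }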

Ideally, see-through mode would behave the same, but there are practical considerations. Each see-through mode image is "rendered" at the position of the physical tracking camera rather than at the position of the eyes. This means that each see-through mode image has some spatial offset.

Also, each see-through mode image arrives on screen with some latency. This means that it is rendered not at the head position from when it is composited, but at the head position from some time before that. The image's rotation is reprojected to correct for this, but any translational differences cannot be reprojected. This means that each see-through mode image has some temporal offset.

These two offsets create noticeable visual artifacts. When a user moves their head, the real world viewed through see-through mode moves too much, because the effective viewpoint (the tracking camera) moves farther than the user's eyes would. Real objects will appear closer than they actually are.

How do we correct for this?

In predominantly virtual scenes, no adjustments are needed. Your app should continue to render virtual objects using existing best practices. Although see-through mode images covering small areas of the field of view will not behave quite right, this will not cause serious problems because users anchor themselves primarily on the virtual environment.

In predominantly augmented scenes, however, a user's eyes will anchor primarily on the see-through mode image and adjust to its spatial and temporal offsets. Even though the error is in the see-through mode image, the virtual objects will appear to swim from the perspective of the user. It's a better experience to have the virtual and real objects align in a slightly incorrect position than for only the virtual objects to be rendered at their correct physical positions. This means that the spatial offset and temporal offset need to be added to virtual objects for predominantly augmented scenes.

When gvr_beta_see_through_config_set_scene_type sets the scene type to GVR_BETA_SEE_THROUGH_SCENE_TYPE_AUGMENTED_SCENE, gvr_get_head_space_from_start_space_transform automatically uses an earlier timestamp to align with the see-through mode images, and gvr_get_eye_from_head_matrix returns a transformation to the position of the camera rather than the position of the eyes.
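
A minimal configuration sketch follows, assuming the beta see-through API declared in gvr_beta.h (gvr_beta_see_through_config_create/destroy, gvr_beta_see_through_config_set_camera_mode, gvr_beta_set_see_through_config) and the GVR_BETA_SEE_THROUGH_CAMERA_MODE_TONEMAPPED camera mode; these names are assumptions based on that beta header rather than details given in this section.

    #include "vr/gvr/capi/include/gvr.h"
    #include "vr/gvr/capi/include/gvr_beta.h"

    // Enables see-through mode and marks the scene as predominantly augmented,
    // so head and eye poses are automatically aligned with the camera images.
    static void EnableAugmentedSeeThrough(gvr_context* gvr) {
      gvr_beta_see_through_config* config =
          gvr_beta_see_through_config_create(gvr);

      // Turn on the camera feed (assumed tonemapped mode; use
      // GVR_BETA_SEE_THROUGH_CAMERA_MODE_DISABLED to turn it off again).
      gvr_beta_see_through_config_set_camera_mode(
          config, GVR_BETA_SEE_THROUGH_CAMERA_MODE_TONEMAPPED);

      // Predominantly augmented scene: poses are adjusted for the spatial and
      // temporal offsets described above. Use
      // GVR_BETA_SEE_THROUGH_SCENE_TYPE_VIRTUAL_SCENE for predominantly
      // virtual scenes.
      gvr_beta_see_through_config_set_scene_type(
          config, GVR_BETA_SEE_THROUGH_SCENE_TYPE_AUGMENTED_SCENE);

      gvr_beta_set_see_through_config(gvr, config);
      gvr_beta_see_through_config_destroy(&config);
    }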