
Understand the user's environment on Android NDK (C)

Learn how to use the Scene Semantics API in your own apps.

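The Scene Semantics API lets developers understand the scene surrounding the user by providing real-time semantic information from an ML model. Given an image of an outdoor scene, the API returns a label for each pixel across a set of useful semantic classes, such as sky, building, tree, road, sidewalk, vehicle, and person. In addition to pixel labels, the Scene Semantics API provides a confidence value for each pixel label, along with an easy way to query how prevalent a given label is in an outdoor scene.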

From left to right: an example input image, the semantic image of pixel labels, and the corresponding confidence image.

Prerequisites

Before proceeding, make sure that you understand fundamental AR concepts and how to configure an ARCore session.

Enable Scene Semantics

In a new ARCore session, check whether the user's device supports the Scene Semantics API. Because of processing power constraints, not all ARCore-compatible devices support the Scene Semantics API.

To save resources, Scene Semantics is disabled by default in ARCore. Enable semantic mode so that your app can use the Scene Semantics API.

// Check whether the user's device supports the Scene Semantics API.
int32_t is_scene_semantics_supported = 0;
ArSession_isSemanticModeSupported(ar_session, AR_SEMANTIC_MODE_ENABLED, &is_scene_semantics_supported);

// Configure the session for AR_SEMANTIC_MODE_ENABLED.
ArConfig* ar_config = NULL;
ArConfig_create(ar_session, &ar_config);
if (is_scene_semantics_supported) {
  ArConfig_setSemanticMode(ar_session, ar_config, AR_SEMANTIC_MODE_ENABLED);
}
CHECK(ArSession_configure(ar_session, ar_config) == AR_SUCCESS);
ArConfig_destroy(ar_config);

Obtain the semantic image

Once Scene Semantics is enabled, you can retrieve the semantic image. The semantic image is an AR_IMAGE_FORMAT_Y8 image in which each pixel corresponds to a semantic label defined by ArSemanticLabel.

Use ArFrame_acquireSemanticImage() to acquire the semantic image:

// Retrieve the semantic image for the current frame, if available.
ArImage* semantic_image = NULL;
if (ArFrame_acquireSemanticImage(ar_session, ar_frame, &semantic_image) != AR_SUCCESS) {
  // No semantic image retrieved for this frame.
  // The output image may be missing for the first couple frames before the model has had a chance to run yet.
  return;
}
// If a semantic image is available, use it here, then release it with ArImage_release().

Depending on the device, the output semantic image should become available about 1 to 3 frames after the start of the session.
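Because the semantic image stores one label byte per pixel, reading a label comes down to indexing into the image plane. The following is a minimal sketch that works on a raw buffer rather than on an ArImage, so it can be tried without ARCore. The helper names `semantic_label_at` and `count_semantic_labels` are hypothetical and not part of the ARCore API. In an app, the buffer, dimensions, and row stride would presumably come from ArImage_getPlaneData(), ArImage_getWidth(), ArImage_getHeight(), and ArImage_getPlaneRowStride() for plane 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not an ARCore function): returns the label byte stored
// at pixel (x, y) of a single-plane 8-bit image. row_stride is the number of
// bytes per row, which may exceed width when rows are padded.
static uint8_t semantic_label_at(const uint8_t* plane, int32_t width,
                                 int32_t height, int32_t row_stride,
                                 int32_t x, int32_t y) {
  assert(plane != NULL);
  assert(row_stride >= width);
  assert(x >= 0 && x < width && y >= 0 && y < height);
  return plane[(size_t)y * (size_t)row_stride + (size_t)x];
}

// Hypothetical helper: counts how many pixels carry each label value.
// counts must have room for 256 entries, one per possible byte value.
// Padding bytes beyond width in each row are skipped.
static void count_semantic_labels(const uint8_t* plane, int32_t width,
                                  int32_t height, int32_t row_stride,
                                  uint32_t counts[256]) {
  assert(plane != NULL);
  assert(row_stride >= width);
  for (int i = 0; i < 256; ++i) {
    counts[i] = 0;
  }
  for (int32_t y = 0; y < height; ++y) {
    const uint8_t* row = plane + (size_t)y * (size_t)row_stride;
    for (int32_t x = 0; x < width; ++x) {
      counts[row[x]]++;
    }
  }
}
```

The returned byte can be compared against the ArSemanticLabel values, for example AR_SEMANTIC_LABEL_SKY. Indexing with the row stride instead of the width avoids reading shifted rows on devices that pad each row.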

Obtain the confidence image

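In addition to the semantic image, which provides a label for each pixel, the API provides a confidence image of the corresponding per-pixel confidence values. The confidence image is an AR_IMAGE_FORMAT_Y8 image in which each pixel holds a value in the range [0, 255] that corresponds to the probability associated with that pixel's semantic label.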

Use ArFrame_acquireSemanticConfidenceImage() to acquire the semantic confidence image:

// Retrieve the semantic confidence image for the current frame, if available.
ArImage* semantic_confidence_image = NULL;
if (ArFrame_acquireSemanticConfidenceImage(ar_session, ar_frame, &semantic_confidence_image) != AR_SUCCESS) {
  // No semantic confidence image retrieved for this frame.
  // The output image may be missing for the first couple frames before the model has had a chance to run yet.
  return;
}
// If a semantic confidence image is available, use it here, then release it with ArImage_release().

Depending on the device, the output confidence image should become available about 1 to 3 frames after the start of the session.
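One way to use the confidence image is to ignore labels whose confidence falls below a threshold. The sketch below again works on raw buffers and makes two assumptions that the text above does not state: that a confidence byte maps linearly onto [0.0, 1.0], and that the semantic image and the confidence image have the same width and height. The helper names `confidence_to_unit_range` and `count_confident_pixels` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Maps a confidence byte in [0, 255] to a float in [0.0, 1.0].
// Assumption: the mapping is linear, with 255 meaning full confidence.
static float confidence_to_unit_range(uint8_t confidence) {
  return (float)confidence / 255.0f;
}

// Hypothetical helper: counts pixels that carry `label` in the label plane
// and have a confidence byte of at least `min_confidence` in the confidence
// plane. Assumption: both planes share the same width and height. Each plane
// has its own row stride because padding may differ.
static uint32_t count_confident_pixels(const uint8_t* labels,
                                       int32_t label_row_stride,
                                       const uint8_t* confidences,
                                       int32_t confidence_row_stride,
                                       int32_t width, int32_t height,
                                       uint8_t label, uint8_t min_confidence) {
  assert(labels != NULL && confidences != NULL);
  assert(label_row_stride >= width && confidence_row_stride >= width);
  uint32_t count = 0;
  for (int32_t y = 0; y < height; ++y) {
    const uint8_t* label_row = labels + (size_t)y * (size_t)label_row_stride;
    const uint8_t* conf_row =
        confidences + (size_t)y * (size_t)confidence_row_stride;
    for (int32_t x = 0; x < width; ++x) {
      if (label_row[x] == label && conf_row[x] >= min_confidence) {
        ++count;
      }
    }
  }
  return count;
}
```

Comparing bytes directly keeps the inner loop in integer arithmetic. The float conversion is only needed when a probability-like value is shown to the user or combined with other scores.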

Query the fraction of pixels for a semantic label

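You can also query the fraction of pixels in the current frame that belong to a particular class, such as sky. This query is more efficient than retrieving the semantic image and searching it pixel by pixel for a specific label. The returned fraction is a float value in the range [0.0, 1.0].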

Use ArFrame_getSemanticLabelFraction() to get the fraction for a given label:

// Retrieve the fraction of pixels for the semantic label sky in the current frame.
float out_fraction = 0.0f;
if (ArFrame_getSemanticLabelFraction(ar_session, ar_frame, AR_SEMANTIC_LABEL_SKY, &out_fraction) != AR_SUCCESS) {
  // No fraction of semantic labels was retrieved for this frame.
}
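To make the meaning of the fraction concrete, the sketch below computes the same kind of value by hand from a label buffer: the number of pixels with a given label divided by the total number of pixels. This is the pixel-by-pixel search that the built-in query is described as being more efficient than. It is an illustration only and not a statement about how ARCore computes the value internally. The helper name `semantic_label_fraction` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: returns the fraction of pixels in a single-plane 8-bit
// label image that carry `label`, as a float in [0.0, 1.0]. row_stride is the
// number of bytes per row; padding bytes beyond width are not counted.
static float semantic_label_fraction(const uint8_t* plane, int32_t width,
                                     int32_t height, int32_t row_stride,
                                     uint8_t label) {
  assert(plane != NULL);
  assert(width > 0 && height > 0);
  assert(row_stride >= width);
  uint64_t matches = 0;
  for (int32_t y = 0; y < height; ++y) {
    const uint8_t* row = plane + (size_t)y * (size_t)row_stride;
    for (int32_t x = 0; x < width; ++x) {
      if (row[x] == label) {
        ++matches;
      }
    }
  }
  const uint64_t total = (uint64_t)width * (uint64_t)height;
  return (float)matches / (float)total;
}
```

For example, an app could treat a sky fraction above some threshold of its own choosing as a hint that the user is outdoors. The threshold is an app-level decision and is not defined by the API.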
