Report on experiments

There are two main ways to report on experiments:

Direct experiment reporting: Query the experiment resource for metrics. This option returns metrics for the control and treatment arms in a single response, along with statistical comparison data such as lift and p-values. This is the only way to report on intra-campaign experiments.
Campaign reporting: Query the campaign resource for metrics, using campaign.experiment_type to distinguish between base and experiment campaigns. This option applies only to experiments that use separate control and treatment campaigns, such as system-managed experiments.

This guide focuses primarily on direct experiment reporting, which is compatible with every experiment type that supports reporting.

Direct experiment reporting

You can query the experiment resource directly to retrieve performance metrics and statistical comparisons between the control and treatment arms.

Metrics and statistical significance

For key metrics such as clicks, impressions, cost, conversions, and conversion value, the experiment resource provides both the treatment arm metric (for example, metrics.clicks) and the control arm metric (for example, metrics.control_clicks) in the same row.

The resource also provides fields to help you assess the statistical significance of any difference between the arms:

metrics.*_p_value: The probability of seeing the observed result if the experiment had no real effect on the metric. The lower the p-value, the stronger the statistical significance.
metrics.*_point_estimate: The estimated percentage lift (positive or negative) in the given metric for the treatment arm compared to the control arm. Together with margin_of_error, this field describes a confidence interval, at a specified confidence level, for the difference being estimated. The quantity being estimated is (treatment / control - 1). The point estimate is the center of the confidence interval.
metrics.*_margin_of_error: The radius of the confidence interval, which is centered at point_estimate. This value is calculated for a specified confidence level that depends on the experiment type.
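
To make the relationship between these fields concrete, the following minimal sketch derives the confidence interval from a point estimate and a margin of error. The helper names and the sample values are illustrative assumptions and are not part of the API.

```python
from typing import Tuple


def confidence_interval(
    point_estimate: float, margin_of_error: float
) -> Tuple[float, float]:
    """Returns the (lower, upper) bounds of an interval centered at point_estimate."""
    return point_estimate - margin_of_error, point_estimate + margin_of_error


def excludes_zero(point_estimate: float, margin_of_error: float) -> bool:
    """Returns True when the whole interval lies on one side of zero."""
    lower, upper = confidence_interval(point_estimate, margin_of_error)
    return lower > 0 or upper < 0


# Hypothetical values: an estimated lift of 0.12 (12%) with a margin of 0.05.
lower, upper = confidence_interval(0.12, 0.05)
print(f"Estimated lift lies between {lower:.0%} and {upper:.0%}")
print("Interval excludes zero:", excludes_zero(0.12, 0.05))
```

An interval that excludes zero suggests a real difference at the confidence level the API used for that experiment type; an interval that straddles zero does not.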

The following key metrics are supported on the experiment resource, each with a treatment value, a control value, and the statistical fields listed earlier:

clicks
impressions
cost_micros
conversions
cost_per_conversion
conversion_value
conversion_value_per_cost

For conversions specifically, the statistical fields are exposed through the following absolute_change fields rather than as relative values:

metrics.conversions_absolute_change_p_value: The p-value for the null hypothesis that the experiment has no effect on the absolute change in conversions. Ranges from 0 to 1.
metrics.conversions_absolute_change_point_estimate: The point estimate of the experiment's effect on the absolute change in conversions.
metrics.conversions_absolute_change_margin_of_error: The margin of error for the estimate of the experiment's effect on the absolute change in conversions.

For help building valid queries for the experiment resource, use the Google Ads Query Builder tool.

Example query

The following GAQL query retrieves key metrics for an experiment:

SELECT
  experiment.experiment_id,
  experiment.name,
  experiment.type,
  metrics.clicks,
  metrics.control_clicks,
  metrics.clicks_point_estimate,
  metrics.clicks_margin_of_error,
  metrics.clicks_p_value,
  metrics.conversions,
  metrics.control_conversions,
  metrics.conversions_absolute_change_point_estimate,
  metrics.conversions_absolute_change_margin_of_error,
  metrics.conversions_absolute_change_p_value
FROM experiment
WHERE experiment.experiment_id = EXPERIMENT_ID
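
As a rough sketch of how such a query might be issued from the Python client library, the helper below streams the results and collects the rows. It assumes `client` is an already initialized GoogleAdsClient; the function name, the query constant, and the reduced field list are hypothetical and only serve the illustration.

```python
EXPERIMENT_QUERY_TEMPLATE = """
    SELECT
      experiment.experiment_id,
      experiment.name,
      metrics.clicks,
      metrics.control_clicks,
      metrics.clicks_point_estimate,
      metrics.clicks_margin_of_error,
      metrics.clicks_p_value
    FROM experiment
    WHERE experiment.experiment_id = {experiment_id}"""


def fetch_experiment_rows(client, customer_id: str, experiment_id: int) -> list:
    """Runs the experiment query and returns all result rows as a list.

    `client` is assumed to be an initialized GoogleAdsClient instance.
    """
    ga_service = client.get_service("GoogleAdsService")
    # Casting to int keeps arbitrary text out of the query string.
    query = EXPERIMENT_QUERY_TEMPLATE.format(experiment_id=int(experiment_id))
    rows = []
    # search_stream yields batches; each batch exposes its rows as `results`.
    for batch in ga_service.search_stream(customer_id=customer_id, query=query):
        rows.extend(batch.results)
    return rows
```

Each returned row carries the treatment metrics, control metrics, and statistical fields side by side, so it can be handed directly to an evaluation routine.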

Interpret results

You can use the p-value, point estimate, and margin of error fields to determine whether your experiment produced a statistically significant result. For example, if conversions_absolute_change_p_value is below your chosen threshold (for example, 0.05 for 95% confidence) and conversions_absolute_change_point_estimate - conversions_absolute_change_margin_of_error is greater than 0, the treatment arm is performing significantly better than the control arm on conversions.

The following code snippets demonstrate how to evaluate results based on p-values and lift estimates:

Java

private void evaluateExperiment(
    GoogleAdsClient googleAdsClient, long customerId, GoogleAdsRow row) {
  Metrics metrics = row.getMetrics();
  String experimentResourceName = row.getExperiment().getResourceName();

  // 1. Evaluate conversion success as a primary success signal if available.
  // - Point Estimate: Represents the estimated average lift or difference in conversions.
  // - Margin of Error: Outlines the confidence interval bounds. Note that the margin_of_error
  //   provided by the API is calculated for a preset confidence level which is set based on the
  //   experiment type.
  // - Lower Bound: (Point Estimate - Margin of Error). If this value is above 0,
  //   we have statistical significance that performance has improved.
  double convPValue = metrics.getConversionsAbsoluteChangePValue();
  double convLift = metrics.getConversionsAbsoluteChangePointEstimate();
  double convError = metrics.getConversionsAbsoluteChangeMarginOfError();
  double convLowerBound = convLift - convError;

  if (convPValue <= P_VALUE_THRESHOLD) {
    if (convLowerBound > 0) {
      System.out.printf(
          "Significant Success: Conversions increased. Even at the lower bound, the lift is %.2f."
              + " Promoting changes.%n",
          convLowerBound);
      promoteExperiment(googleAdsClient, customerId, experimentResourceName);
      return;
    } else if ((convLift + convError) < 0) {
      System.out.printf(
          "Significant Decline: Even the upper bound (%.2f) is below zero. Ending experiment.%n",
          convLift + convError);
      endExperiment(googleAdsClient, customerId, experimentResourceName);
      return;
    }
  }

  // 2. Fall back to evaluating click metrics if conversions are inconclusive.
  double clickPValue = metrics.getClicksPValue();
  double clickLift = metrics.getClicksPointEstimate();
  double clickError = metrics.getClicksMarginOfError();
  double clickLowerBound = clickLift - clickError;

  if (clickPValue <= P_VALUE_THRESHOLD && clickLowerBound > 0) {
    System.out.printf("Click volume is significantly up (+%.1f%%).%n", clickLift * 100);

    // Graduation is only supported for separate campaign experiments, not
    // intra-campaign experiments where there is no separate treatment campaign.
    ExperimentType experimentType = row.getExperiment().getType();
    if (experimentType != ExperimentType.ADOPT_BROAD_MATCH_KEYWORDS
        && experimentType != ExperimentType.ADOPT_AI_MAX) {
      System.out.println("Graduating treatment campaign for further manual analysis.");
      graduateExperiment(googleAdsClient, customerId, experimentResourceName);
    } else {
      System.out.println(
          "Intra-campaign trial detected: graduation is not supported. Continuing to run the"
              + " experiment to gather more conversion data.");
    }
  } else {
    // 3. Print status if no action was taken.
    System.out.printf(
        "Inconclusive: No significant lift in Conversions (p=%.2f) or Clicks (p=%.2f). Current"
            + " estimated lift: %.2f +/- %.2f. Allowing the experiment to continue running.%n",
        convPValue, clickPValue, convLift, convError);
  }
}
EvaluateAndUpdateExperiment.java

C#

private static void EvaluateExperiment(GoogleAdsClient client, long customerId, GoogleAdsRow row)
{
    // This function evaluates performance metrics and immediately takes action
    // to update the experiment's status (promote, end, or graduate) if
    // statistical significance thresholds are met.
    var metrics = row.Metrics;
    string experimentResourceName = row.Experiment.ResourceName;

    bool hasConvMetrics = metrics.HasConversionsAbsoluteChangePValue
        && metrics.HasConversionsAbsoluteChangePointEstimate
        && metrics.HasConversionsAbsoluteChangeMarginOfError;

    bool hasClickMetrics = metrics.HasClicksPValue
        && metrics.HasClicksPointEstimate
        && metrics.HasClicksMarginOfError;

    // 1. Evaluate conversion success as a primary success signal if available.
    // - Point Estimate: Represents the estimated average lift or difference in conversions.
    // - Margin of Error: Outlines the confidence interval bounds. Note that the margin_of_error
    //   provided by the API is calculated for a preset confidence level which is set based on
    //   the experiment type.
    // - Lower Bound: (Point Estimate - Margin of Error). If this value is above 0,
    //   we have statistical significance that performance has improved.
    if (hasConvMetrics)
    {
        double convPValue = metrics.ConversionsAbsoluteChangePValue;
        double convLift = metrics.ConversionsAbsoluteChangePointEstimate;
        double convError = metrics.ConversionsAbsoluteChangeMarginOfError;
        double convLowerBound = convLift - convError;

        if (convPValue <= P_VALUE_THRESHOLD)
        {
            if (convLowerBound > 0)
            {
                Console.WriteLine(
                    $"Significant Success: Conversions increased. Even at the lower" +
                    $" bound, the lift is {convLowerBound:F2}. Promoting changes.");
                PromoteExperiment(client, customerId, experimentResourceName);
                return;
            }
            else if ((convLift + convError) < 0)
            {
                Console.WriteLine(
                    $"Significant Decline: Even the upper bound ({convLift + convError:F2}) " +
                    $"is below zero. Ending experiment.");
                EndExperiment(client, customerId, experimentResourceName);
                return;
            }
        }
    }

    // 2. Evaluate click volume as a secondary signal.
    // This is helpful as an early indicator or for lower-volume accounts.
    if (hasClickMetrics)
    {
        double clickPValue = metrics.ClicksPValue;
        double clickLift = metrics.ClicksPointEstimate;
        double clickError = metrics.ClicksMarginOfError;
        double clickLowerBound = clickLift - clickError;

        if (clickPValue <= P_VALUE_THRESHOLD && clickLowerBound > 0)
        {
            // We have a directional winner: high confidence in more traffic,
            // but not enough data to confirm conversion impact yet.
            Console.WriteLine(
                $"Click volume is significantly up (+{clickLift * 100:F1}%).");

            // Graduation is only supported for separate campaign experiments, not
            // intra-campaign experiments where there is no separate treatment campaign.
            if (row.Experiment.Type != ExperimentType.AdoptBroadMatchKeywords
                && row.Experiment.Type != ExperimentType.AdoptAiMax)
            {
                Console.WriteLine("Graduating treatment campaign for further manual analysis.");
                GraduateExperiment(client, customerId, experimentResourceName);
            }
            else
            {
                Console.WriteLine(
                    "Intra-campaign trial detected: graduation is not supported. " +
                    "Continuing to run the experiment to gather more conversion data.");
            }
            return;
        }
    }

    // 3. Print status if no action was taken.
    if (hasConvMetrics || hasClickMetrics)
    {
        string convStatus = hasConvMetrics
            ? $"Conversions (p={metrics.ConversionsAbsoluteChangePValue:F2}, " +
              $"lift={metrics.ConversionsAbsoluteChangePointEstimate:F2} +/- " +
              $"{metrics.ConversionsAbsoluteChangeMarginOfError:F2})"
            : "Conversions (not populated)";

        string clickStatus = hasClickMetrics
            ? $"Clicks (p={metrics.ClicksPValue:F2}, " +
              $"lift={metrics.ClicksPointEstimate:F2} +/- " +
              $"{metrics.ClicksMarginOfError:F2})"
            : "Clicks (not populated)";

        Console.WriteLine(
            $"Inconclusive: No significant action taken. {convStatus}, {clickStatus}. " +
            "Allowing the experiment to continue running.");
    }
    else
    {
        Console.WriteLine(
            "Conversion and click performance metrics are not yet populated. " +
            "Allowing the experiment to continue running.");
    }
}
EvaluateAndUpdateExperiment.cs

PHP

This example is not yet available in PHP; you can take a look at the other languages.

Python

def evaluate_experiment(
    client: GoogleAdsClient, customer_id: str, row: GoogleAdsRow
) -> None:
    """Evaluates the performance of the experiment and updates it accordingly
    (for example, promotes, ends, or graduates).

    Checks conversion and click metrics against statistical significance thresholds
    to determine the appropriate action to take on the experiment.

    Args:
        client: an initialized GoogleAdsClient instance.
        customer_id: a client customer ID.
        row: a GoogleAdsRow containing the experiment and metrics.
    """
    # This function evaluates performance metrics and immediately takes action
    # to update the experiment's status (promote, end, or graduate) if
    # statistical significance thresholds are met.
    metrics = row.metrics
    experiment_resource_name = row.experiment.resource_name

    has_conv_metrics = (
        "conversions_absolute_change_p_value" in metrics
        and "conversions_absolute_change_point_estimate" in metrics
        and "conversions_absolute_change_margin_of_error" in metrics
    )
    has_click_metrics = (
        "clicks_p_value" in metrics
        and "clicks_point_estimate" in metrics
        and "clicks_margin_of_error" in metrics
    )

    # 1. Evaluate conversion success as a primary success signal if available.
    # - Point Estimate: Represents the estimated average lift or difference in conversions.
    # - Margin of Error: Outlines the confidence interval bounds. Note that the margin_of_error
    #   provided by the API is calculated for a preset confidence level which is set based on
    #   the experiment type.
    # - Lower Bound: (Point Estimate - Margin of Error). If this value is above 0,
    #   we have statistical significance that performance has improved.
    if has_conv_metrics:
        conv_p_value = metrics.conversions_absolute_change_p_value
        conv_lift = metrics.conversions_absolute_change_point_estimate
        conv_error = metrics.conversions_absolute_change_margin_of_error
        conv_lower_bound = conv_lift - conv_error

        if conv_p_value <= P_VALUE_THRESHOLD:
            if conv_lower_bound > 0:
                print(
                    "Significant Success: Conversions increased. Even at the lower"
                    f" bound, the lift is {conv_lower_bound:.2f}. Promoting"
                    " changes."
                )
                promote_experiment(
                    client, customer_id, experiment_resource_name
                )
                return
            elif (conv_lift + conv_error) < 0:
                print(
                    "Significant Decline: Even the upper bound"
                    f" ({conv_lift + conv_error:.2f}) is below zero. Ending"
                    " experiment."
                )
                end_experiment(client, customer_id, experiment_resource_name)
                return

    # 2. Evaluate click volume as a secondary signal.
    # This is helpful as an early indicator or for lower-volume accounts.
    if has_click_metrics:
        click_p_value = metrics.clicks_p_value
        click_lift = metrics.clicks_point_estimate
        click_error = metrics.clicks_margin_of_error
        click_lower_bound = click_lift - click_error

        if click_p_value <= P_VALUE_THRESHOLD and click_lower_bound > 0:
            # We have a directional winner: high confidence in more traffic,
            # but not enough data to confirm conversion impact yet.
            print(f"Click volume is significantly up (+{click_lift*100:.1f}%).")

            # Graduation is only supported for separate campaign experiments, not
            # intra-campaign experiments where there is no separate treatment campaign.
            experiment_type_name = row.experiment.type_.name
            if (
                experiment_type_name != "ADOPT_BROAD_MATCH_KEYWORDS"
                and experiment_type_name != "ADOPT_AI_MAX"
            ):
                print(
                    "Graduating treatment campaign for further manual analysis."
                )
                graduate_experiment(
                    client, customer_id, experiment_resource_name
                )
            else:
                print(
                    "Intra-campaign trial detected: graduation is not supported. "
                    "Continuing to run the experiment to gather more conversion data."
                )
            return

    # 3. Print status if no action was taken.
    if has_conv_metrics or has_click_metrics:
        conv_status = (
            f"Conversions (p={metrics.conversions_absolute_change_p_value:.2f}, "
            f"lift={metrics.conversions_absolute_change_point_estimate:.2f} +/- "
            f"{metrics.conversions_absolute_change_margin_of_error:.2f})"
            if has_conv_metrics
            else "Conversions (not populated)"
        )
        click_status = (
            f"Clicks (p={metrics.clicks_p_value:.2f}, "
            f"lift={metrics.clicks_point_estimate:.2f} +/- "
            f"{metrics.clicks_margin_of_error:.2f})"
            if has_click_metrics
            else "Clicks (not populated)"
        )
        print(
            f"Inconclusive: No significant action taken. {conv_status}, {click_status}."
            " Allowing the experiment to continue running."
        )
    else:
        print(
            "Conversion and click performance metrics are not yet populated. "
            "Allowing the experiment to continue running."
        )
evaluate_and_update_experiment.py

Ruby

This example is not yet available in Ruby; you can take a look at the other languages.

Perl

This example is not yet available in Perl; you can take a look at the other languages.

Benefits over campaign reporting

Direct experiment reporting offers several advantages over querying campaign reports separately:

Centralized metrics: Retrieve metrics for the control and treatment arms in a single row.
Statistical confidence data: Get precomputed p-values, point estimates, and margins of error.
Efficiency: No need to manually join or compare results from multiple reports.
Intra-campaign support: This is the only way to compare the control and treatment arms for intra-campaign experiments, where traffic is split within a single campaign.

Campaign reporting

For experiments that create a separate treatment campaign (for example, SEARCH_CUSTOM), you can query the campaign resource and use campaign.experiment_type to identify the BASE (control) and EXPERIMENT (treatment) campaigns. This approach is useful if you need to segment metrics at a finer granularity (for example, by ad group or keyword) or view campaign metadata that isn't available on the experiment resource. However, you must compare performance and compute statistics manually.

You can't use campaign-level reporting to compare arms for intra-campaign experiments, because the traffic split happens internally within a single campaign. Querying campaign for an intra-campaign experiment returns only aggregated totals.
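
The manual comparison that campaign reporting requires might look like the following sketch. The query is an assumption about how you could select base and experiment campaigns (in practice you would likely also filter on the campaign IDs and date range of your experiment). The statistics use a textbook pooled two-proportion z-test on conversion rate, which is a stand-in and not necessarily the method the API uses for its own p-values. The helpers assume a 50/50 traffic split and whole-number conversion counts.

```python
import math

# Hypothetical query: selects every base and experiment campaign in the account.
CAMPAIGN_QUERY = """
    SELECT
      campaign.id,
      campaign.name,
      campaign.experiment_type,
      metrics.clicks,
      metrics.conversions
    FROM campaign
    WHERE campaign.experiment_type IN ('BASE', 'EXPERIMENT')"""


def relative_lift(treatment: float, control: float) -> float:
    """Computes (treatment / control - 1) for a single metric."""
    if control == 0:
        raise ValueError("control must be non-zero")
    return treatment / control - 1


def conversion_rate_p_value(
    t_conv: float, t_clicks: float, c_conv: float, c_clicks: float
) -> float:
    """Two-sided p-value from a pooled two-proportion z-test on conversion rate."""
    pooled = (t_conv + c_conv) / (t_clicks + c_clicks)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / t_clicks + 1 / c_clicks))
    if std_err == 0:
        return 1.0  # No variation at all, so no evidence of a difference.
    z = (t_conv / t_clicks - c_conv / c_clicks) / std_err
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical totals for the EXPERIMENT and BASE campaigns.
print(f"Click lift: {relative_lift(1200, 1000):+.1%}")
print(f"Conversion rate p-value: {conversion_rate_p_value(80, 1000, 50, 1000):.4f}")
```

This is the work that direct experiment reporting does for you: the experiment resource returns the lift estimate, margin of error, and p-value already computed.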

Best practices

Choose the right confidence level: Setting a lower confidence level (that is, a higher p-value threshold) can help you get directional guidance quickly, especially with lower budgets or conversion volumes. 95% confidence (p-value <= 0.05) is considered the academic standard and may be better suited to producing more accurate results over a longer period.
Run experiments long enough: Run experiments for at least 4 weeks to account for weekly performance cycles, conversion lag, and machine learning periods.
Allow for ramp-up time: For campaigns that use automated bidding strategies or test new features, ignore the first 1 to 2 weeks of data to give bidding models and traffic levels time to adjust to the split.
Use a 50/50 split: A 50/50 traffic split is usually the fastest way to reach statistically significant results.
Schedule in advance: Set the experiment start date 3 to 7 days in the future to allow time for ad review and approval.
You can run only one experiment at a time per campaign.
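
The timing recommendations above can be sketched as a small date calculation. The function, its parameter names, and its defaults are hypothetical. The sketch also assumes that the ramp-up period counts toward the 4-week minimum run time, which the guidance doesn't state explicitly.

```python
from datetime import date, timedelta
from typing import NamedTuple


class ExperimentPlan(NamedTuple):
    start: date           # First day the experiment serves traffic.
    analysis_start: date  # First day of data to include, after ramp-up.
    earliest_end: date    # Earliest day to end, after the minimum run time.


def plan_experiment(
    today: date,
    lead_days: int = 5,
    ramp_up_weeks: int = 2,
    min_run_weeks: int = 4,
) -> ExperimentPlan:
    """Derives key dates from the lead-time, ramp-up, and run-time guidance."""
    if not 3 <= lead_days <= 7:
        raise ValueError("lead_days should be between 3 and 7")
    if min_run_weeks < 4:
        raise ValueError("min_run_weeks should be at least 4")
    start = today + timedelta(days=lead_days)
    return ExperimentPlan(
        start=start,
        analysis_start=start + timedelta(weeks=ramp_up_weeks),
        earliest_end=start + timedelta(weeks=min_run_weeks),
    )


plan = plan_experiment(date(2025, 1, 1))
print(plan.start.isoformat(), plan.analysis_start.isoformat(), plan.earliest_end.isoformat())
```

Keeping these dates in one place makes it easier to exclude ramp-up data consistently when you later evaluate the experiment.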
