Stay organized with collections
Save and categorize content based on your preferences.
The advanced queries in this page apply to the BigQuery event export data for
Google Analytics. See BigQuery cookbook for Universal Analytics if you are
looking for the same resource for Universal Analytics. Try the basic queries
first before trying out the advanced ones.
Products purchased by customers who purchased a certain product
The following query shows what other products were purchased by customers who
purchased a specific product. This example does not assume that the products
were purchased in the same order.
The optimized example relies on BigQuery scripting features to define a variable
that declares which items to filter on. While this does not improve performance,
this is a more readable approach for defining variables compared creating a
single value table using a WITH clause. The simplified query uses the latter
approach using the WITH clause.
The simplified query creats a separate list of "Product A buyers" and does a
join with that data. The optimized query, instead, creates a list of all items a
user has purchased across orders using the ARRAY_AGG function. Then using the
outer WHERE clause, purchase lists across all users are filtered for the
target_item and only relevant items are shown.
Simplified
-- Example: Products purchased by customers who purchased a specific product.---- `Params` is used to hold the value of the selected product and is referenced-- throughout the query.WITHParamsAS(-- Replace with selected item_name or item_id.SELECT'Google Navy Speckled Tee'ASselected_product),PurchaseEventsAS(SELECTuser_pseudo_id,itemsFROM-- Replace table name.`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`WHERE-- Replace date range._TABLE_SUFFIXBETWEEN'20201101'AND'20210131'ANDevent_name='purchase'),ProductABuyersAS(SELECTDISTINCTuser_pseudo_idFROMParams,PurchaseEvents,UNNEST(items)ASitemsWHERE-- item.item_id can be used instead of items.item_name.items.item_name=selected_product)SELECTitems.item_nameASitem_name,SUM(items.quantity)ASitem_quantityFROMParams,PurchaseEvents,UNNEST(items)ASitemsWHEREuser_pseudo_idIN(SELECTuser_pseudo_idFROMProductABuyers)-- item.item_id can be used instead of items.item_nameANDitems.item_name!=selected_productGROUPBY1ORDERBYitem_quantityDESC;
Optimized
-- Optimized Example: Products purchased by customers who purchased a specific product.-- Replace item nameDECLAREtarget_itemSTRINGDEFAULT'Google Navy Speckled Tee';SELECTIL.item_nameASitem_name,SUM(IL.quantity)ASquantityFROM(SELECTuser_pseudo_id,ARRAY_AGG(STRUCT(item_name,quantity))ASitem_listFROM-- Replace table`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`,UNNEST(items)WHERE-- Replace date range_TABLE_SUFFIXBETWEEN'20201201'AND'20201210'ANDevent_name='purchase'GROUPBY1),UNNEST(item_list)ASILWHEREtarget_itemIN(SELECTitem_nameFROMUNNEST(item_list))-- Remove the following line if you want the target_item to appear in the resultsANDtarget_item!=IL.item_nameGROUPBYitem_nameORDERBYquantityDESC;
Average amount of money spent per purchase session by user
The following query shows the average amount of money spent per session by each
user. This takes into account only the sessions where the user made a purchase.
-- Example: Average amount of money spent per purchase session by user.WITHeventsAS(SELECTsession.value.int_valueASsession_id,COALESCE(spend.value.int_value,spend.value.float_value,spend.value.double_value,0.0)ASspend_value,event.*-- Replace table nameFROM`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`ASeventLEFTJOINUNNEST(event.event_params)ASsessionONsession.key='ga_session_id'LEFTJOINUNNEST(event.event_params)ASspendONspend.key='value'-- Replace date rangeWHERE_TABLE_SUFFIXBETWEEN'20201101'AND'20210131')SELECTuser_pseudo_id,COUNT(DISTINCTsession_id)ASsession_count,SUM(spend_value)/COUNT(DISTINCTsession_id)ASavg_spend_per_session_by_userFROMeventsWHEREevent_name='purchase'andsession_idISNOTNULLGROUPBYuser_pseudo_id
Latest Session Id and Session Number for users
The following query provides the list of the latest ga_session_id and
ga_session_number from last 4 days for a list of users. You can provide either a
user_pseudo_id list or a user_id list.
user_pseudo_id
-- Get the latest ga_session_id and ga_session_number for specific users during last 4 days.-- Replace timezone. List at https://en.wikipedia.org/wiki/List_of_tz_database_time_zones.DECLAREREPORTING_TIMEZONESTRINGDEFAULT'America/Los_Angeles';-- Replace list of user_pseudo_id's with ones you want to query.DECLAREUSER_PSEUDO_ID_LISTARRAY<STRING>DEFAULT['1005355938.1632145814','979622592.1632496588','1101478530.1632831095'];CREATETEMPFUNCTIONGetParamValue(paramsANYTYPE,target_keySTRING)AS((SELECT`value`FROMUNNEST(params)WHEREkey=target_keyLIMIT1));CREATETEMPFUNCTIONGetDateSuffix(date_shiftINT64,timezoneSTRING)AS((SELECTFORMAT_DATE('%Y%m%d',DATE_ADD(CURRENT_DATE(timezone),INTERVALdate_shiftDAY))));SELECTDISTINCTuser_pseudo_id,FIRST_VALUE(GetParamValue(event_params,'ga_session_id').int_value)OVER(UserWindow)ASga_session_id,FIRST_VALUE(GetParamValue(event_params,'ga_session_number').int_value)OVER(UserWindow)ASga_session_numberFROM-- Replace table name.`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`WHEREuser_pseudo_idINUNNEST(USER_PSEUDO_ID_LIST)ANDRIGHT(_TABLE_SUFFIX,8)BETWEENGetDateSuffix(-3,REPORTING_TIMEZONE)ANDGetDateSuffix(0,REPORTING_TIMEZONE)WINDOWUserWindowAS(PARTITIONBYuser_pseudo_idORDERBYevent_timestampDESC);
user_id
-- Get the latest ga_session_id and ga_session_number for specific users during last 4 days.-- Replace timezone. List at https://en.wikipedia.org/wiki/List_of_tz_database_time_zones.DECLAREREPORTING_TIMEZONESTRINGDEFAULT'America/Los_Angeles';-- Replace list of user_id's with ones you want to query.DECLAREUSER_ID_LISTARRAY<STRING>DEFAULT['<user_id_1>','<user_id_2>','<user_id_n>'];CREATETEMPFUNCTIONGetParamValue(paramsANYTYPE,target_keySTRING)AS((SELECT`value`FROMUNNEST(params)WHEREkey=target_keyLIMIT1));CREATETEMPFUNCTIONGetDateSuffix(date_shiftINT64,timezoneSTRING)AS((SELECTFORMAT_DATE('%Y%m%d',DATE_ADD(CURRENT_DATE(timezone),INTERVALdate_shiftDAY))));SELECTDISTINCTuser_pseudo_id,FIRST_VALUE(GetParamValue(event_params,'ga_session_id').int_value)OVER(UserWindow)ASga_session_id,FIRST_VALUE(GetParamValue(event_params,'ga_session_number').int_value)OVER(UserWindow)ASga_session_numberFROM-- Replace table name.`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`WHEREuser_idINUNNEST(USER_ID_LIST)ANDRIGHT(_TABLE_SUFFIX,8)BETWEENGetDateSuffix(-3,REPORTING_TIMEZONE)ANDGetDateSuffix(0,REPORTING_TIMEZONE)WINDOWUserWindowAS(PARTITIONBYuser_pseudo_idORDERBYevent_timestampDESC);
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2024-10-09 UTC."],[[["\u003cp\u003eThis page provides advanced BigQuery queries for analyzing Google Analytics 4 event export data, going beyond basic queries.\u003c/p\u003e\n"],["\u003cp\u003eIt includes queries to identify products frequently purchased together, calculate average spending per purchase session, and retrieve the latest session information for specific users.\u003c/p\u003e\n"],["\u003cp\u003eThe queries are demonstrated with examples and explanations, including simplified and optimized versions where applicable.\u003c/p\u003e\n"],["\u003cp\u003eBefore using these advanced queries, it's recommended to familiarize yourself with the basic BigQuery queries for Google Analytics 4.\u003c/p\u003e\n"],["\u003cp\u003eUsers of Universal Analytics can find similar resources in the BigQuery cookbook for Universal Analytics linked on the page.\u003c/p\u003e\n"]]],["This document provides advanced BigQuery queries for Google Analytics event data. It details how to identify other products purchased by customers who bought a specific item, offering both simplified and optimized query examples that filter purchase lists. Another query calculates the average amount spent per purchase session per user. Lastly, it outlines how to retrieve the latest session ID and number for users, with examples for both `user_pseudo_id` and `user_id` lists.\n"],null,["The advanced queries in this page apply to the BigQuery event export data for\nGoogle Analytics. See [BigQuery cookbook for Universal Analytics](https://support.google.com/analytics/answer/4419694) if you are\nlooking for the same resource for Universal Analytics. Try the [basic queries](/analytics/bigquery/basic-queries)\nfirst before trying out the advanced ones.\n\nProducts purchased by customers who purchased a certain product\n\nThe following query shows what other products were purchased by customers who\npurchased a specific product. This example does not assume that the products\nwere purchased in the same order.\n\nThe optimized example relies on BigQuery scripting features to define a variable\nthat declares which items to filter on. While this does not improve performance,\nthis is a more readable approach for defining variables compared creating a\nsingle value table using a `WITH` clause. The simplified query uses the latter\napproach using the `WITH` clause.\n\nThe simplified query creats a separate list of \"Product A buyers\" and does a\njoin with that data. The optimized query, instead, creates a list of all items a\nuser has purchased across orders using the `ARRAY_AGG` function. Then using the\nouter `WHERE` clause, purchase lists across all users are filtered for the\n`target_item` and only relevant items are shown. \n\nSimplified \n\n -- Example: Products purchased by customers who purchased a specific product.\n --\n -- `Params` is used to hold the value of the selected product and is referenced\n -- throughout the query.\n\n WITH\n Params AS (\n -- Replace with selected item_name or item_id.\n SELECT 'Google Navy Speckled Tee' AS selected_product\n ),\n PurchaseEvents AS (\n SELECT\n user_pseudo_id,\n items\n FROM\n -- Replace table name.\n `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`\n WHERE\n -- Replace date range.\n _TABLE_SUFFIX BETWEEN '20201101' AND '20210131'\n AND event_name = 'purchase'\n ),\n ProductABuyers AS (\n SELECT DISTINCT\n user_pseudo_id\n FROM\n Params,\n PurchaseEvents,\n UNNEST(items) AS items\n WHERE\n -- item.item_id can be used instead of items.item_name.\n items.item_name = selected_product\n )\n SELECT\n items.item_name AS item_name,\n SUM(items.quantity) AS item_quantity\n FROM\n Params,\n PurchaseEvents,\n UNNEST(items) AS items\n WHERE\n user_pseudo_id IN (SELECT user_pseudo_id FROM ProductABuyers)\n -- item.item_id can be used instead of items.item_name\n AND items.item_name != selected_product\n GROUP BY 1\n ORDER BY item_quantity DESC;\n\nOptimized \n\n -- Optimized Example: Products purchased by customers who purchased a specific product.\n\n -- Replace item name\n DECLARE target_item STRING DEFAULT 'Google Navy Speckled Tee';\n\n SELECT\n IL.item_name AS item_name,\n SUM(IL.quantity) AS quantity\n FROM\n (\n SELECT\n user_pseudo_id,\n ARRAY_AGG(STRUCT(item_name, quantity)) AS item_list\n FROM\n -- Replace table\n `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`, UNNEST(items)\n WHERE\n -- Replace date range\n _TABLE_SUFFIX BETWEEN '20201201' AND '20201210'\n AND event_name = 'purchase'\n GROUP BY\n 1\n ),\n UNNEST(item_list) AS IL\n WHERE\n target_item IN (SELECT item_name FROM UNNEST(item_list))\n -- Remove the following line if you want the target_item to appear in the results\n AND target_item != IL.item_name\n GROUP BY\n item_name\n ORDER BY\n quantity DESC;\n\nAverage amount of money spent per purchase session by user\n\nThe following query shows the average amount of money spent per session by each\nuser. This takes into account only the sessions where the user made a purchase. \n\n -- Example: Average amount of money spent per purchase session by user.\n\n WITH\n events AS (\n SELECT\n session.value.int_value AS session_id,\n COALESCE(spend.value.int_value, spend.value.float_value, spend.value.double_value, 0.0)\n AS spend_value,\n event.*\n\n -- Replace table name\n FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*` AS event\n LEFT JOIN UNNEST(event.event_params) AS session\n ON session.key = 'ga_session_id'\n LEFT JOIN UNNEST(event.event_params) AS spend\n ON spend.key = 'value'\n\n -- Replace date range\n WHERE _TABLE_SUFFIX BETWEEN '20201101' AND '20210131'\n )\n SELECT\n user_pseudo_id,\n COUNT(DISTINCT session_id) AS session_count,\n SUM(spend_value) / COUNT(DISTINCT session_id) AS avg_spend_per_session_by_user\n FROM events\n WHERE event_name = 'purchase' and session_id IS NOT NULL\n GROUP BY user_pseudo_id\n\nLatest Session Id and Session Number for users\n\nThe following query provides the list of the latest ga_session_id and\nga_session_number from last 4 days for a list of users. You can provide either a\n`user_pseudo_id` list or a `user_id` list. \n\nuser_pseudo_id \n\n -- Get the latest ga_session_id and ga_session_number for specific users during last 4 days.\n\n -- Replace timezone. List at https://en.wikipedia.org/wiki/List_of_tz_database_time_zones.\n DECLARE REPORTING_TIMEZONE STRING DEFAULT 'America/Los_Angeles';\n\n -- Replace list of user_pseudo_id's with ones you want to query.\n DECLARE USER_PSEUDO_ID_LIST ARRAY\u003cSTRING\u003e DEFAULT\n [\n '1005355938.1632145814', '979622592.1632496588', '1101478530.1632831095'];\n\n CREATE TEMP FUNCTION GetParamValue(params ANY TYPE, target_key STRING)\n AS (\n (SELECT `value` FROM UNNEST(params) WHERE key = target_key LIMIT 1)\n );\n\n CREATE TEMP FUNCTION GetDateSuffix(date_shift INT64, timezone STRING)\n AS (\n (SELECT FORMAT_DATE('%Y%m%d', DATE_ADD(CURRENT_DATE(timezone), INTERVAL date_shift DAY)))\n );\n\n SELECT DISTINCT\n user_pseudo_id,\n FIRST_VALUE(GetParamValue(event_params, 'ga_session_id').int_value)\n OVER (UserWindow) AS ga_session_id,\n FIRST_VALUE(GetParamValue(event_params, 'ga_session_number').int_value)\n OVER (UserWindow) AS ga_session_number\n FROM\n -- Replace table name.\n `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`\n WHERE\n user_pseudo_id IN UNNEST(USER_PSEUDO_ID_LIST)\n AND RIGHT(_TABLE_SUFFIX, 8)\n BETWEEN GetDateSuffix(-3, REPORTING_TIMEZONE)\n AND GetDateSuffix(0, REPORTING_TIMEZONE)\n WINDOW UserWindow AS (PARTITION BY user_pseudo_id ORDER BY event_timestamp DESC);\n\nuser_id \n\n -- Get the latest ga_session_id and ga_session_number for specific users during last 4 days.\n\n -- Replace timezone. List at https://en.wikipedia.org/wiki/List_of_tz_database_time_zones.\n DECLARE REPORTING_TIMEZONE STRING DEFAULT 'America/Los_Angeles';\n\n -- Replace list of user_id's with ones you want to query.\n DECLARE USER_ID_LIST ARRAY\u003cSTRING\u003e DEFAULT ['\u003cuser_id_1\u003e', '\u003cuser_id_2\u003e', '\u003cuser_id_n\u003e'];\n\n CREATE TEMP FUNCTION GetParamValue(params ANY TYPE, target_key STRING)\n AS (\n (SELECT `value` FROM UNNEST(params) WHERE key = target_key LIMIT 1)\n );\n\n CREATE TEMP FUNCTION GetDateSuffix(date_shift INT64, timezone STRING)\n AS (\n (SELECT FORMAT_DATE('%Y%m%d', DATE_ADD(CURRENT_DATE(timezone), INTERVAL date_shift DAY)))\n );\n\n SELECT DISTINCT\n user_pseudo_id,\n FIRST_VALUE(GetParamValue(event_params, 'ga_session_id').int_value)\n OVER (UserWindow) AS ga_session_id,\n FIRST_VALUE(GetParamValue(event_params, 'ga_session_number').int_value)\n OVER (UserWindow) AS ga_session_number\n FROM\n -- Replace table name.\n `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`\n WHERE\n user_id IN UNNEST(USER_ID_LIST)\n AND RIGHT(_TABLE_SUFFIX, 8)\n BETWEEN GetDateSuffix(-3, REPORTING_TIMEZONE)\n AND GetDateSuffix(0, REPORTING_TIMEZONE)\n WINDOW UserWindow AS (PARTITION BY user_pseudo_id ORDER BY event_timestamp DESC);"]]