如需访问 Places Insights 数据,您需要在 BigQuery 中编写 SQL 查询,以返回有关地点的汇总分析洞见。结果会根据查询中指定的搜索条件从数据集中返回。
查询要求
查询中的 SELECT
语句必须包含 WITH AGGREGATION_THRESHOLD
并指定数据集。例如:
SELECT WITH AGGREGATION_THRESHOLD COUNT(*) AS count FROM `places_insights___us___sample.places_sample`
在此示例中,您使用 FROM
指定美国所用的 places_insights___us___sample.places_sample
数据集。
指定位置限制
请注意,上一个查询未指定任何位置限制,这意味着数据汇总应用于整个数据集。通常,您会指定如下所示的位置限制:
SELECT WITH AGGREGATION_THRESHOLD COUNT(*) AS count FROM `places_insights___us___sample.places_sample` WHERE ST_DWITHIN(ST_GEOGPOINT(-73.9857, 40.7484), point, 1000)
在此示例中,您的查询指定了以纽约市帝国大厦为中心、半径为 1,000 米的目标限制。
您可以使用多边形指定搜索区域。使用多边形时,多边形的点必须定义一个闭环,其中多边形的第一个点与最后一个点相同:
SELECT WITH AGGREGATION_THRESHOLD COUNT(*) AS count FROM `places_insights___us___sample.places_sample` WHERE ST_CONTAINS(ST_GEOGFROMTEXT("""POLYGON((-73.985708 40.75773,-73.993324 40.750298, -73.9857 40.7484,-73.9785 40.7575, -73.985708 40.75773))"""), point)
在下一个示例中,您将使用一条由若干个相连的点组成的线来定义搜索区域。此线条类似于 Routes API 计算出的出行路线。路线可能适用于车辆、自行车或步行者:
DECLARE route GEOGRAPHY; SET route = ST_GEOGFROMTEXT("""LINESTRING(-73.98903537033028 40.73655649223003, -73.93580216278471 40.80955538843361)"""); SELECT WITH AGGREGATION_THRESHOLD COUNT(*) AS count FROM `places_insights___us___sample.places_sample` WHERE ST_DWITHIN(route, point, 100)
在此示例中,您将搜索半径设置为线路周围 100 米。
按地点数据集字段过滤
根据数据集架构定义的字段优化搜索。根据数据集字段(例如地点 regular_opening_hours
、price_level
和客户 rating
)过滤结果。
引用由您感兴趣的国家/地区的数据集架构定义的任何字段。每个国家/地区的数据集架构都包含两部分:
- 所有国家/地区的数据集共有的核心架构。
- 一种特定于国家/地区的架构,用于定义特定于相应国家/地区的架构组件。
例如,您的查询可以包含一个 WHERE
子句,用于定义查询的过滤条件。在以下示例中,您将返回类型为 restaurant
、business_status
为 OPERATIONAL
、rating
大于或等于 4.0 且 allows_dogs
设置为 true
的地点的汇总数据:
SELECT WITH AGGREGATION_THRESHOLD COUNT(*) AS count FROM `places_insights___us___sample.places_sample` WHERE ST_DWITHIN(ST_GEOGPOINT(-73.9857, 40.7484), point, 1000) AND 'restaurant' IN UNNEST(types) AND business_status = "OPERATIONAL" AND rating >= 4.0 AND allows_dogs = true
以下查询会返回至少有 8 个电动车辆充电站的地点:
SELECT WITH AGGREGATION_THRESHOLD COUNT(*) AS count FROM `places_insights___us___sample.places_sample` WHERE ev_charge_options.connector_count > 8;
按地点主要类型和地点类型进行过滤
数据集中的每个地点都可以有:
与表 A 定义的类型相关联的单个主要类型。例如,主要类型可能是
mexican_restaurant
或steak_house
。在查询中使用primary_type
可按地点的主要类型过滤结果。与 表 A 定义的类型关联的多个类型值。例如,某餐厅可能具有以下类型:
seafood_restaurant
、restaurant
、food
、point_of_interest
、establishment
。在查询中使用types
可根据与地点相关联的类型列表过滤结果。
以下查询会返回主要类型为 bar
但也可用作 restaurant
的所有地点的结果:
SELECT WITH AGGREGATION_THRESHOLD COUNT(*) AS count FROM `places_insights___us___sample.places_sample` WHERE 'restaurant' IN UNNEST(types) AND 'bar' = primary_type
按预定义的数据值过滤
许多数据集字段都具有预定义的值。例如
price_level
字段支持以下预定义值:PRICE_LEVEL_FREE
PRICE_LEVEL_INEXPENSIVE
PRICE_LEVEL_MODERATE
PRICE_LEVEL_EXPENSIVE
PRICE_LEVEL_VERY_EXPENSIVE
business_status
字段支持以下预定义值:OPERATIONAL
CLOSED_TEMPORARILY
CLOSED_PERMANENTLY
在此示例中,查询会返回纽约市帝国大厦方圆 1, 000 米内 business_status
为 OPERATIONAL
的所有餐厅的数量:
SELECT WITH AGGREGATION_THRESHOLD COUNT(*) AS count FROM `places_insights___us___sample.places_sample` WHERE ST_DWITHIN(ST_GEOGPOINT(-73.9857, 40.7484), point, 1000) AND business_status = "OPERATIONAL" AND 'restaurant' IN UNNEST(types)
按营业时间过滤
在此示例中,返回地理区域内所有提供周五欢乐时光的场所的数量:
SELECT WITH AGGREGATION_THRESHOLD COUNT(*) AS count FROM `places_insights___us___sample.places_sample`, UNNEST(regular_opening_hours_happy_hour.friday) AS friday_hours WHERE '17:00:00' BETWEEN friday_hours.start_time AND friday_hours.end_time AND ST_DWITHIN(ST_GEOGPOINT(-73.9857, 40.7484), point, 1000);
使用聚合函数
以下示例展示了受支持的 BigQuery 聚合函数。 此查询会汇总纽约市帝国大厦方圆 1,000 米内的所有地点的评分,以生成评分统计信息:
SELECT WITH AGGREGATION_THRESHOLD COUNT(id) AS place_count, APPROX_COUNT_DISTINCT(rating) as distinct_ratings, COUNTIF(rating > 4.0) as good_rating_count, LOGICAL_AND(rating <= 5) as all_ratings_equal_or_below_five, LOGICAL_OR(rating = 5) as any_rating_exactly_five, AVG(rating) as avg_rating, SUM(user_rating_count) as rating_count, COVAR_POP(rating, user_rating_count) as rating_covar_pop, COVAR_SAMP(rating, user_rating_count) as rating_covar_samp, STDDEV_POP(rating) as rating_stddev_pop, STDDEV_SAMP(rating) as rating_stddev_samp, VAR_POP(rating) as rating_var_pop, VAR_SAMP(rating) as rating_var_samp, FROM `places_insights___us___sample.places_sample` WHERE ST_DWITHIN(ST_GEOGPOINT(-73.9857, 40.7484), point, 1000) AND business_status = "OPERATIONAL"
返回结果组
到目前为止,我们展示的查询会在结果中返回一行,其中包含查询的汇总计数。您还可以使用 GROUP BY
运算符根据分组条件在响应中返回多行。
例如,以下查询返回的结果按搜索区域内每个地点的主要类型分组:
SELECT WITH AGGREGATION_THRESHOLD primary_type, COUNT(*) AS count FROM `places_insights___us___sample.places_sample` WHERE ST_DWITHIN(ST_GEOGPOINT(-73.99992071622756, 40.71818785986936), point, 1000) GROUP BY primary_type
下图显示了此查询的示例输出:
在此示例中,您定义了一个位置表。然后,针对每个位置,计算附近餐厅的数量(即 1, 000 米范围内的餐厅):
WITH my_locations AS ( SELECT 'Location 1' AS name, ST_GEOGPOINT(-74.00776440888504, 40.70932825380786) AS location UNION ALL SELECT 'Location 2' AS name, ST_GEOGPOINT(-73.98257192833559, 40.750738934863215) AS location UNION ALL SELECT 'Location 3' AS name, ST_GEOGPOINT(-73.94701794263223, 40.80792954838445) AS location ) SELECT WITH AGGREGATION_THRESHOLD l.name, COUNT(*) as count FROM `places_insights___us___sample.places_sample` p JOIN my_locations l ON ST_DWITHIN(l.location, p.point, 1000) WHERE primary_type = "restaurant" AND business_status = "OPERATIONAL" GROUP BY l.name
下图显示了此查询的示例输出: