拼多多大数据面试SQL-限时秒杀活动转化率
⚠️ 待修正
一、题目背景
这道题来自拼多多的数据分析岗面试。拼多多是社交裂变电商,数据分析师需要从海量业务数据中挖掘洞见。
业务场景:用户消费行为相关需求在日常工作中频繁出现,这类型的SQL题是面试高频考点。
一、题目
拼多多限时秒杀活动中,需要计算每个秒杀场次的用户转化漏斗:浏览→点击→下单→支付各环节的人数及转化率,找出转化率最低的环节。
假设有三张表:
flash_view:用户浏览秒杀会场次记录flash_click:用户点击商品记录flash_order:用户下单及支付记录
-- flash_view 浏览表
+---------+----------+-------------------+
| user_id | flash_id | view_time |
+---------+----------+-------------------+
| u01 | f001 | 2025-06-01 10:00 |
| u02 | f001 | 2025-06-01 10:05 |
| u03 | f001 | 2025-06-01 10:06 |
| u01 | f002 | 2025-06-01 12:00 |
| u02 | f002 | 2025-06-01 12:01 |
+---------+----------+-------------------+
-- flash_click 点击表
+---------+----------+-------------------+
| user_id | flash_id | click_time |
+---------+----------+-------------------+
| u01 | f001 | 2025-06-01 10:01 |
| u02 | f001 | 2025-06-01 10:06 |
| u03 | f001 | 2025-06-01 10:06 |
| u01 | f002 | 2025-06-01 12:01 |
+---------+----------+-------------------+
-- flash_order 下单支付表
+---------+----------+--------+-------------------+
| user_id | flash_id | status | order_time |
+---------+----------+--------+-------------------+
| u01 | f001 | pay | 2025-06-01 10:03 |
| u03 | f001 | order | 2025-06-01 10:07 |
+---------+----------+--------+-------------------+
三、思路分析
- 经典漏斗分析:分别统计浏览、点击、下单(含已支付)、支付四个环节的去重用户数;
- 使用
LEFT JOIN串联各环节数据,避免因某环节缺失导致漏斗断裂; - 站在每个闪购场次(flash_id)维度,计算相邻环节的转化率 = 后一环节人数 / 前一环节人数。
| 维度 | 评分 |
|---|---|
| 题目难度 | ⭐️⭐️⭐️ |
| 题目清晰度 | ⭐️⭐️⭐️⭐️⭐️ |
| 业务常见度 | ⭐️⭐️⭐️⭐️⭐️ |
四、逐步推导
1.统计每个场次各环节的去重用户数
执行SQL
select v.flash_id,
count(distinct v.user_id) as view_uv,
count(distinct c.user_id) as click_uv,
count(distinct o.user_id) as order_uv,
count(distinct case when o.status = 'pay' then o.user_id end) as pay_uv
from flash_view v
left join flash_click c
on v.flash_id = c.flash_id and v.user_id = c.user_id
left join flash_order o
on v.flash_id = o.flash_id and v.user_id = o.user_id
group by v.flash_id
查询结果
+----------+---------+----------+----------+--------+
| flash_id | view_uv | click_uv | order_uv | pay_uv |
+----------+---------+----------+----------+--------+
| f001 | 3 | 3 | 2 | 1 |
| f002 | 2 | 1 | 0 | 0 |
+----------+---------+----------+----------+--------+
2.计算各环节转化率
执行SQL
select flash_id,
view_uv,
click_uv,
round(click_uv / view_uv, 4) as view_to_click_rate,
order_uv,
round(order_uv / click_uv, 4) as click_to_order_rate,
pay_uv,
round(pay_uv / order_uv, 4) as order_to_pay_rate
from (
select v.flash_id,
count(distinct v.user_id) as view_uv,
count(distinct c.user_id) as click_uv,
count(distinct o.user_id) as order_uv,
count(distinct case when o.status = 'pay' then o.user_id end) as pay_uv
from flash_view v
left join flash_click c
on v.flash_id = c.flash_id and v.user_id = c.user_id
left join flash_order o
on v.flash_id = o.flash_id and v.user_id = o.user_id
group by v.flash_id
) t
查询结果
+----------+---------+----------+-------------------+----------+-------------------+--------+------------------+
| flash_id | view_uv | click_uv | view_to_click_rate | order_uv | click_to_order_rate | pay_uv | order_to_pay_rate |
+----------+---------+----------+-------------------+----------+--------------------+--------+------------------+
| f001 | 3 | 3 | 1.0000 | 2 | 0.6667 | 1 | 0.5000 |
| f002 | 2 | 1 | 0.5000 | 0 | 0.0000 | 0 | NULL |
+----------+---------+----------+-------------------+----------+--------------------+--------+------------------+
五、常见坑点
坑1:各步骤COUNT DISTINCT口径需一致 — 如果时间窗口不同,转化率可能 >100%,出现诡异数据。
坑2:LEFT JOIN链过长导致性能爆炸 — 3-4层嵌套在大数据量下严重膨胀,可用 UNION ALL + CASE WHEN 替代。
六、举一反三
-
按骑手/站点/城市维度下钻:定位问题来源,是某个骑手还是某个区域的延迟率偏高
-
按延迟程度分级:CASE WHEN 分为轻微/中等/严重三级,关注严重延迟的占比
-
时段分析:按小时统计,识别午晚高峰时段的配送压力变化
七、知识点总结
| 考点 | 说明 |
|---|---|
| CASE WHEN 条件聚合 | 灵活实现分段统计、条件计数、标签化处理 |
| 多表JOIN | LEFT JOIN保留主表所有数据,COALESCE处理NULL |
| COUNT DISTINCT 去重 | 统计唯一用户/事件数,避免重复记录干扰聚合结果 |
| GROUP BY + 聚合函数 | 分组聚合是数据分析的基础,配合HAVING筛选分组结果 |
八、建表语句和数据插入
点击展开 DDL & DML
--建表语句
CREATE TABLE flash_view (
user_id string COMMENT '用户ID',
flash_id string COMMENT '秒杀场次ID',
view_time string COMMENT '浏览时间'
) COMMENT '秒杀浏览记录表';
CREATE TABLE flash_click (
user_id string COMMENT '用户ID',
flash_id string COMMENT '秒杀场次ID',
click_time string COMMENT '点击时间'
) COMMENT '秒杀点击记录表';
CREATE TABLE flash_order (
user_id string COMMENT '用户ID',
flash_id string COMMENT '秒杀场次ID',
status string COMMENT '订单状态:order-下单,pay-已支付',
order_time string COMMENT '下单/支付时间'
) COMMENT '秒杀订单记录表';
-- 插入数据
insert into flash_view(user_id, flash_id, view_time) values
('u01','f001','2025-06-01 10:00'),
('u02','f001','2025-06-01 10:05'),
('u03','f001','2025-06-01 10:06'),
('u01','f002','2025-06-01 12:00'),
('u02','f002','2025-06-01 12:01');
insert into flash_click(user_id, flash_id, click_time) values
('u01','f001','2025-06-01 10:01'),
('u02','f001','2025-06-01 10:06'),
('u03','f001','2025-06-01 10:06'),
('u01','f002','2025-06-01 12:01');
insert into flash_order(user_id, flash_id, status, order_time) values
('u01','f001','pay','2025-06-01 10:03'),
('u03','f001','order','2025-06-01 10:07');
📱关注公众号
「数据仓库技术」文章同步更新,不错过每一篇干货

💬加群交流
备注「数据仓库技术」加入社群,每日一道大厂SQL真题
