跳到主要内容

拼多多大数据面试SQL-限时秒杀活动转化率

⚠️ 待修正

一、题目背景

这道题来自拼多多的数据分析岗面试。拼多多是社交裂变电商,数据分析师需要从海量业务数据中挖掘洞见。

业务场景:用户消费行为相关需求在日常工作中频繁出现,这类型的SQL题是面试高频考点。

一、题目

拼多多限时秒杀活动中,需要计算每个秒杀场次的用户转化漏斗:浏览→点击→下单→支付各环节的人数及转化率,找出转化率最低的环节。

假设有三张表:

  • flash_view:用户浏览秒杀会场次记录
  • flash_click:用户点击商品记录
  • flash_order:用户下单及支付记录
-- flash_view 浏览表
+---------+----------+-------------------+
| user_id | flash_id | view_time |
+---------+----------+-------------------+
| u01 | f001 | 2025-06-01 10:00 |
| u02 | f001 | 2025-06-01 10:05 |
| u03 | f001 | 2025-06-01 10:06 |
| u01 | f002 | 2025-06-01 12:00 |
| u02 | f002 | 2025-06-01 12:01 |
+---------+----------+-------------------+

-- flash_click 点击表
+---------+----------+-------------------+
| user_id | flash_id | click_time |
+---------+----------+-------------------+
| u01 | f001 | 2025-06-01 10:01 |
| u02 | f001 | 2025-06-01 10:06 |
| u03 | f001 | 2025-06-01 10:06 |
| u01 | f002 | 2025-06-01 12:01 |
+---------+----------+-------------------+

-- flash_order 下单支付表
+---------+----------+--------+-------------------+
| user_id | flash_id | status | order_time |
+---------+----------+--------+-------------------+
| u01 | f001 | pay | 2025-06-01 10:03 |
| u03 | f001 | order | 2025-06-01 10:07 |
+---------+----------+--------+-------------------+

三、思路分析

  1. 经典漏斗分析:分别统计浏览、点击、下单(含已支付)、支付四个环节的去重用户数;
  2. 使用 LEFT JOIN 串联各环节数据,避免因某环节缺失导致漏斗断裂;
  3. 站在每个闪购场次(flash_id)维度,计算相邻环节的转化率 = 后一环节人数 / 前一环节人数。
维度评分
题目难度⭐️⭐️⭐️
题目清晰度⭐️⭐️⭐️⭐️⭐️
业务常见度⭐️⭐️⭐️⭐️⭐️

四、逐步推导

1.统计每个场次各环节的去重用户数

执行SQL

select v.flash_id,
count(distinct v.user_id) as view_uv,
count(distinct c.user_id) as click_uv,
count(distinct o.user_id) as order_uv,
count(distinct case when o.status = 'pay' then o.user_id end) as pay_uv
from flash_view v
left join flash_click c
on v.flash_id = c.flash_id and v.user_id = c.user_id
left join flash_order o
on v.flash_id = o.flash_id and v.user_id = o.user_id
group by v.flash_id

查询结果

+----------+---------+----------+----------+--------+
| flash_id | view_uv | click_uv | order_uv | pay_uv |
+----------+---------+----------+----------+--------+
| f001 | 3 | 3 | 2 | 1 |
| f002 | 2 | 1 | 0 | 0 |
+----------+---------+----------+----------+--------+

2.计算各环节转化率

执行SQL

select flash_id,
view_uv,
click_uv,
round(click_uv / view_uv, 4) as view_to_click_rate,
order_uv,
round(order_uv / click_uv, 4) as click_to_order_rate,
pay_uv,
round(pay_uv / order_uv, 4) as order_to_pay_rate
from (
select v.flash_id,
count(distinct v.user_id) as view_uv,
count(distinct c.user_id) as click_uv,
count(distinct o.user_id) as order_uv,
count(distinct case when o.status = 'pay' then o.user_id end) as pay_uv
from flash_view v
left join flash_click c
on v.flash_id = c.flash_id and v.user_id = c.user_id
left join flash_order o
on v.flash_id = o.flash_id and v.user_id = o.user_id
group by v.flash_id
) t

查询结果

+----------+---------+----------+-------------------+----------+-------------------+--------+------------------+
| flash_id | view_uv | click_uv | view_to_click_rate | order_uv | click_to_order_rate | pay_uv | order_to_pay_rate |
+----------+---------+----------+-------------------+----------+--------------------+--------+------------------+
| f001 | 3 | 3 | 1.0000 | 2 | 0.6667 | 1 | 0.5000 |
| f002 | 2 | 1 | 0.5000 | 0 | 0.0000 | 0 | NULL |
+----------+---------+----------+-------------------+----------+--------------------+--------+------------------+

五、常见坑点

坑1:各步骤COUNT DISTINCT口径需一致 — 如果时间窗口不同,转化率可能 >100%,出现诡异数据。

坑2:LEFT JOIN链过长导致性能爆炸 — 3-4层嵌套在大数据量下严重膨胀,可用 UNION ALL + CASE WHEN 替代。

六、举一反三

  1. 按骑手/站点/城市维度下钻:定位问题来源,是某个骑手还是某个区域的延迟率偏高

  2. 按延迟程度分级:CASE WHEN 分为轻微/中等/严重三级,关注严重延迟的占比

  3. 时段分析:按小时统计,识别午晚高峰时段的配送压力变化

七、知识点总结

考点说明
CASE WHEN 条件聚合灵活实现分段统计、条件计数、标签化处理
多表JOINLEFT JOIN保留主表所有数据,COALESCE处理NULL
COUNT DISTINCT 去重统计唯一用户/事件数,避免重复记录干扰聚合结果
GROUP BY + 聚合函数分组聚合是数据分析的基础,配合HAVING筛选分组结果

八、建表语句和数据插入

点击展开 DDL & DML
--建表语句
CREATE TABLE flash_view (
user_id string COMMENT '用户ID',
flash_id string COMMENT '秒杀场次ID',
view_time string COMMENT '浏览时间'
) COMMENT '秒杀浏览记录表';

CREATE TABLE flash_click (
user_id string COMMENT '用户ID',
flash_id string COMMENT '秒杀场次ID',
click_time string COMMENT '点击时间'
) COMMENT '秒杀点击记录表';

CREATE TABLE flash_order (
user_id string COMMENT '用户ID',
flash_id string COMMENT '秒杀场次ID',
status string COMMENT '订单状态:order-下单,pay-已支付',
order_time string COMMENT '下单/支付时间'
) COMMENT '秒杀订单记录表';

-- 插入数据
insert into flash_view(user_id, flash_id, view_time) values
('u01','f001','2025-06-01 10:00'),
('u02','f001','2025-06-01 10:05'),
('u03','f001','2025-06-01 10:06'),
('u01','f002','2025-06-01 12:00'),
('u02','f002','2025-06-01 12:01');

insert into flash_click(user_id, flash_id, click_time) values
('u01','f001','2025-06-01 10:01'),
('u02','f001','2025-06-01 10:06'),
('u03','f001','2025-06-01 10:06'),
('u01','f002','2025-06-01 12:01');

insert into flash_order(user_id, flash_id, status, order_time) values
('u01','f001','pay','2025-06-01 10:03'),
('u03','f001','order','2025-06-01 10:07');
📱关注公众号

「数据仓库技术」文章同步更新,不错过每一篇干货

微信公众号二维码
💬加群交流

备注「数据仓库技术」加入社群,每日一道大厂SQL真题

交流微信二维码

你可能还想看