腾讯大数据面试SQL-向用户推荐好友喜欢的音乐
一、题目
现有三张表分别为:
用户关注表t1_follow(user_id,follower_id)记录用户ID及其关注的人ID,请给用户1推荐他关注的用户喜欢的音乐名称
+----------+--------------+
| user_id  | follower_id  |
+----------+--------------+
| 1        | 2            |
| 1        | 4            |
| 1        | 5            |
+----------+--------------+用户喜欢的音乐t1_music_likes(user_id,music_id)
+----------+-----------+
| user_id  | music_id  |
+----------+-----------+
| 1        | 10        |
| 2        | 20        |
| 2        | 30        |
| 3        | 20        |
| 3        | 30        |
| 4        | 40        |
| 4        | 50        |
+----------+-----------+音乐名字表t1_music(music_id,music_name)
+-----------+-------------+
| music_id  | music_name  |
+-----------+-------------+
| 10        | a           |
| 20        | b           |
| 30        | c           |
| 40        | d           |
| 50        | e           |
+-----------+-------------+二、分析
本题要给用户1推荐其关注的用户喜欢的音乐名称,主要是考察表之间的关联,并考察行转列及去重相关操作;
1.根据用户关注表和用户喜欢的音乐表进行关联,查询出每个用户喜欢的音乐ID;
2.再关联音乐名字表,关联出对应的音乐名称;
3.行转列并对重复的音乐名称去重,得到最终结果
| 维度 | 评分 | 
|---|---|
| 题目难度 | ⭐️⭐️⭐️ | 
| 题目清晰度 | ⭐️⭐️⭐️⭐️⭐️ | 
| 业务常见度 | ⭐️⭐️⭐️⭐️⭐️ | 
三、SQL
1.根据用户关注表和用户喜欢的音乐表进行关联,查询出每个用户关注用户喜欢的音乐ID;
执行SQL
select t1.user_id,
       t1.follower_id,
       t2.music_id
from (select user_id,
             follower_id
      from t1_follow
      where user_id = 1) t1
         left join
     (select user_id,
             music_id
      from t1_music_likes) t2
     on t1.follower_id = t2.user_id执行结果
+-------------+-----------------+--------------+
| t1.user_id  | t1.follower_id  | t2.music_id  |
+-------------+-----------------+--------------+
| 1           | 2               | 20           |
| 1           | 2               | 30           |
| 1           | 4               | 40           |
| 1           | 4               | 50           |
| 1           | 5               | NULL         |
+-------------+-----------------+--------------+2.关联音乐名字表,关联出对应的音乐名称;
执行SQL
select t1.user_id,
       t1.follower_id,
       t2.music_id,
       t3.music_name
from (select user_id,
             follower_id
      from t1_follow
      where user_id = 1) t1
         left join
     (select user_id,
             music_id
      from t1_music_likes) t2
     on t1.follower_id = t2.user_id
         left join
     (select music_id,
             music_name
      from t1_music) t3
     on t2.music_id = t3.music_id执行结果
+-------------+-----------------+--------------+----------------+
| t1.user_id  | t1.follower_id  | t2.music_id  | t3.music_name  |
+-------------+-----------------+--------------+----------------+
| 1           | 2               | 20           | b              |
| 1           | 2               | 30           | c              |
| 1           | 4               | 40           | d              |
| 1           | 4               | 50           | e              |
| 1           | 5               | NULL         | NULL           |
+-------------+-----------------+--------------+----------------+3.行转列并对重复的音乐名称去重,得到最终结果
行转列并对重复的音乐名称去重,得到最终结果。行转列使用聚合函数collect_set()函数,然后使用concat_ws转成字符串
执行SQL
select t1.user_id,
       concat_ws(',', collect_set(t3.music_name)) as push_music
from (select user_id,
             follower_id
      from t1_follow
      where user_id = 1) t1
         left join
     (select user_id,
             music_id
      from t1_music_likes) t2
     on t1.follower_id = t2.user_id
         left join
     (select music_id,
             music_name
      from t1_music) t3
     on t2.music_id = t3.music_id
group by t1.user_id执行结果
+-------------+-------------+
| t1.user_id  | push_music  |
+-------------+-------------+
| 1           | b,c,d,e     |
+-------------+-------------+四、建表语句和数据插入
--建表语句
CREATE TABLE t1_follow (
user_id bigint COMMENT '用户ID',
follower_id bigint COMMENT '关注用户ID'
) COMMENT '用户关注表';
-- 插入数据
insert into t1_follow(user_id,follower_id)
values
(1,2),
(1,4),
(1,5)
;
-- 建表语句
CREATE TABLE t1_music_likes (
user_id bigint COMMENT '用户ID',
music_id bigint COMMENT '音乐ID'
) COMMENT '用户喜欢音乐ID';
--插入语句
insert into t1_music_likes(user_id,music_id)
values
(1,10),
(2,20),
(2,30),
(3,20),
(3,30),
(4,40),
(4,50)
;
--建表语句
CREATE TABLE t1_music (
music_id bigint COMMENT '音乐ID',
music_name string COMMENT '音乐名称'
) COMMENT '音乐名字表';
-- 插入语句
insert into t1_music(music_id,music_name)
values
(10,'a'),
(20,'b'),
(30,'c'),
(40,'d'),
(50,'e')
;本文同步在微信公众号”数据仓库技术“和个人博客”数据仓库技术 (opens in a new tab)“发表;