归因分析 (Attribution Modeling) 指南


快速选型

归因模型权重分配适合场景复杂度
首次触达 (First-Touch)首个触点 100%品牌认知、拉新评估⭐⭐
末次触达 (Last-Touch)末个触点 100%效果广告、转化优化⭐⭐
线性 (Linear)所有触点平分公平评估、初步分析⭐⭐
时间衰减 (Time-Decay)越近转化权重越高短周期决策、促销驱动⭐⭐⭐
位置 (U-Shape)首 40% + 末 40% + 中间平分全链路评估、标准报告⭐⭐⭐

测试数据

本文所有示例基于以下用户触点路径:

-- 用户触点表 CREATE TABLE marketing_touchpoints ( user_id BIGINT, event_time TIMESTAMP_LTZ, channel VARCHAR, -- organic_search, paid_ads, social_media, email, direct event_type VARCHAR -- view, click, purchase ); -- 转化金额表(实际生产中来自订单系统) CREATE TABLE conversions ( user_id BIGINT, conversion_value DECIMAL(10,2) ); -- 时间衰减归因额外需要转化时间(可从订单表获取) -- Time-Decay 模型需要在 conversions 中增加 conversion_time 列, -- 或在查询中通过 JOIN 订单表获取

模拟 4 个用户的触点路径:

用户触点路径转化金额
1organic_search → social_media → paid_ads → purchase$100
2paid_ads → email → purchase$200
3social_media → organic_search → social_media → paid_ads → purchase$150
4email → purchase$50

1. 首次触达归因 (First-Touch)

规则

转化金额 100% 归因给用户旅程中的 第一个触点

SQL 实现

WITH touchpoints_ranked AS ( SELECT user_id, channel, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_time) AS touch_order FROM marketing_touchpoints WHERE event_type != 'purchase' ), first_channel AS ( SELECT user_id, channel AS first_touch_channel FROM touchpoints_ranked WHERE touch_order = 1 ) SELECT fc.first_touch_channel, COUNT(*) AS conversions, SUM(c.conversion_value) AS total_value, ROUND(SUM(c.conversion_value) * 100.0 / (SELECT SUM(conversion_value) FROM conversions), 1) AS value_share_pct FROM first_channel fc JOIN conversions c ON fc.user_id = c.user_id GROUP BY fc.first_touch_channel ORDER BY total_value DESC;

输出示例:

first_touch_channelconversionstotal_valuevalue_share_pct
paid_ads120040.0
social_media115030.0
organic_search110020.0
email15010.0

2. 末次触达归因 (Last-Touch)

规则

转化金额 100% 归因给转化前的 最后一个触点

SQL 实现

WITH touchpoints_ranked AS ( SELECT user_id, channel, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_time DESC) AS touch_order_desc FROM marketing_touchpoints WHERE event_type != 'purchase' ), last_channel AS ( SELECT user_id, channel AS last_touch_channel FROM touchpoints_ranked WHERE touch_order_desc = 1 ) SELECT lc.last_touch_channel, COUNT(*) AS conversions, SUM(c.conversion_value) AS total_value, ROUND(SUM(c.conversion_value) * 100.0 / (SELECT SUM(conversion_value) FROM conversions), 1) AS value_share_pct FROM last_channel lc JOIN conversions c ON lc.user_id = c.user_id GROUP BY lc.last_touch_channel ORDER BY total_value DESC;

输出示例:

last_touch_channelconversionstotal_valuevalue_share_pct
paid_ads225050.0
email225050.0

3. 线性归因 (Linear)

规则

转化金额 平均分配 给所有触点。

SQL 实现

WITH touch_counts AS ( SELECT user_id, COUNT(*) AS touch_count FROM marketing_touchpoints WHERE event_type != 'purchase' GROUP BY user_id ), attribution AS ( SELECT t.user_id, t.channel, c.conversion_value / tc.touch_count AS attributed_value FROM marketing_touchpoints t JOIN conversions c ON t.user_id = c.user_id JOIN touch_counts tc ON t.user_id = tc.user_id WHERE t.event_type != 'purchase' ) SELECT channel, COUNT(*) AS touch_count, ROUND(SUM(attributed_value), 2) AS total_attributed_value, ROUND(SUM(attributed_value) * 100.0 / (SELECT SUM(conversion_value) FROM conversions), 1) AS value_share_pct FROM attribution GROUP BY channel ORDER BY total_attributed_value DESC;

输出示例:

channeltouch_counttotal_attributed_valuevalue_share_pct
paid_ads3170.8334.2
email2150.0030.0
social_media3108.3321.7
organic_search270.8314.2

4. 时间衰减归因 (Time-Decay)

规则

越接近转化的触点获得越高权重,使用 指数衰减

weight = 2(-hours_to_conversion / half_life_hours)

半衰期默认 24 小时(距离转化 1 天的触点权重减半)。

SQL 实现

WITH conversions AS ( -- 实际生产中从订单表获取转化时间 SELECT 1 AS user_id, '2026-03-02 10:30:00' AS conversion_time, 100.0 AS conversion_value UNION ALL SELECT 2, '2026-03-02 09:00:00', 200.0 UNION ALL SELECT 3, '2026-03-02 14:30:00', 150.0 UNION ALL SELECT 4, '2026-03-01 12:30:00', 50.0 ), touch_with_decay AS ( SELECT t.user_id, t.channel, c.conversion_value, TIMESTAMPDIFF(HOUR, CAST(t.event_time AS TIMESTAMP_LTZ), CAST(c.conversion_time AS TIMESTAMP_LTZ)) AS hours_to_conversion, -- 指数衰减权重: 2(-hours/24),半衰期 24 小时 POW(2, -TIMESTAMPDIFF(HOUR, CAST(t.event_time AS TIMESTAMP_LTZ), CAST(c.conversion_time AS TIMESTAMP_LTZ)) / 24.0) AS decay_weight FROM marketing_touchpoints t JOIN conversions c ON t.user_id = c.user_id WHERE t.event_type != 'purchase' ), normalized AS ( SELECT user_id, channel, conversion_value, decay_weight, decay_weight / SUM(decay_weight) OVER (PARTITION BY user_id) AS normalized_weight FROM touch_with_decay ) SELECT channel, COUNT(*) AS touch_count, ROUND(SUM(conversion_value * normalized_weight), 2) AS total_attributed_value, ROUND(SUM(conversion_value * normalized_weight) * 100.0 / (SELECT SUM(conversion_value) FROM conversions), 1) AS value_share_pct FROM normalized GROUP BY channel ORDER BY total_attributed_value DESC;

输出示例:

channeltouch_counttotal_attributed_valuevalue_share_pct
paid_ads3190.9038.2
email2161.5032.3
social_media396.3019.3
organic_search251.2910.3

半衰期选择

业务类型推荐半衰期原因
电商促销6-12 小时决策周期短
SaaS 试用24-48 小时中等决策周期
B2B 销售72-168 小时长决策周期

5. 位置归因 (U-Shape)

规则

触点位置权重
首个触点40%
末个触点40%
中间触点平分剩余 20%

特殊情况:

  • 单触点: 100%
  • 2 触点: 各 50%

SQL 实现

WITH touchpoints_ranked AS ( SELECT t.user_id, t.channel, c.conversion_value, ROW_NUMBER() OVER (PARTITION BY t.user_id ORDER BY t.event_time) AS touch_order, COUNT(*) OVER (PARTITION BY t.user_id) AS total_touches FROM marketing_touchpoints t JOIN conversions c ON t.user_id = c.user_id WHERE t.event_type != 'purchase' ), attribution AS ( SELECT user_id, channel, conversion_value, CASE WHEN total_touches = 1 THEN 1.0 WHEN total_touches = 2 THEN 0.5 WHEN touch_order = 1 THEN 0.4 WHEN touch_order = total_touches THEN 0.4 ELSE 0.2 / (total_touches - 2) END AS weight FROM touchpoints_ranked ) SELECT channel, COUNT(*) AS touch_count, ROUND(SUM(conversion_value * weight), 2) AS total_attributed_value, ROUND(SUM(conversion_value * weight) * 100.0 / (SELECT SUM(conversion_value) FROM conversions), 1) AS value_share_pct FROM attribution GROUP BY channel ORDER BY total_attributed_value DESC;

输出示例:

channeltouch_counttotal_attributed_valuevalue_share_pct
paid_ads3200.0040.0
email2150.0030.0
social_media395.0019.0
organic_search255.0011.0

6. 多模型对比

一键对比所有模型

-- 将各模型结果合并对比 WITH first_touch AS ( -- 首次触达归因结果 SELECT channel, ROUND(SUM(conversion_value), 2) AS first_touch_value FROM ( SELECT user_id, channel, conversion_value FROM touchpoints_ranked tr JOIN conversions c ON tr.user_id = c.user_id WHERE touch_order = 1 ) GROUP BY channel ), last_touch AS ( -- 末次触达归因结果 SELECT channel, ROUND(SUM(conversion_value), 2) AS last_touch_value FROM ( SELECT user_id, channel, conversion_value FROM touchpoints_ranked tr JOIN conversions c ON tr.user_id = c.user_id WHERE touch_order = total_touches ) GROUP BY channel ) -- 合并对比...


常见问题

1. 排除转化事件本身

-- 错误: 将 purchase 事件也计入触点 WHERE event_type IN ('view', 'click', 'purchase') -- 正确: 排除转化事件 WHERE event_type != 'purchase'

2. 2 触点 U-Shape 除零错误

-- 错误: total_touches = 2 时,0.2 / (2 - 2) 导致除零 ELSE 0.2 / (total_touches - 2) -- 正确: 单独处理 2 触点情况 WHEN total_touches = 2 THEN 0.5

3. 时间衰减的
POW
POW
函数

-- 正确: POW(base, exponent) POW(2, -hours / 24.0) -- 注意: 24.0 必须是浮点数,否则整数除法会截断为 0 POW(2, -hours / 24) -- 错误: hours/24 可能为 0

4. 多触点用户 vs 单触点用户

-- 建议在分析中区分 SELECT CASE WHEN total_touches = 1 THEN 'single_touch' WHEN total_touches <= 3 THEN 'multi_touch' ELSE 'high_touch' END AS journey_type, COUNT(*) AS user_count FROM touchpoints_ranked GROUP BY 1;

5. 归因窗口期

-- 限制归因窗口(如 30 天内触点) WHERE TIMESTAMPDIFF(DAY, CAST(t.event_time AS TIMESTAMP_LTZ), CAST(c.conversion_time AS TIMESTAMP_LTZ)) <= 30


模型选择建议

业务目标推荐模型原因
评估拉新渠道效果首次触达反映渠道获取新用户能力
优化转化漏斗末次触达反映临门一脚的渠道
公平分配预算线性所有触点平等对待
短周期促销时间衰减近期触点影响更大
全链路 ROI 报告U-Shape兼顾首末和中间触点

性能优化建议

场景优化策略
大数据量归因按转化日期分区,限制归因窗口
多模型并行计算将触点排序结果物化为 Dynamic Table
高频查询预计算归因结果,按天刷新

-- 推荐: 将触点排序物化为 Dynamic Table CREATE DYNAMIC TABLE dws_touchpoints_ranked REFRESH INTERVAL 1 HOUR AS SELECT t.user_id, t.channel, t.event_time, t.event_type, ROW_NUMBER() OVER (PARTITION BY t.user_id ORDER BY t.event_time) AS touch_order, COUNT(*) OVER (PARTITION BY t.user_id) AS total_touches FROM marketing_touchpoints t WHERE event_type != 'purchase';

联系我们
预约咨询
微信咨询
电话咨询