企业内部数据产品化(Data Mesh)最佳实践

将各业务域的数据以"数据产品"形式对外发布,每个域自主管理 Schema、定义数据合同、控制消费权限,跨域分析通过统一的数据产品层进行联邦查询,而不直接访问源表。本文以 Sales、HR、Finance 三个业务域为例,端到端演示 Domain Schema → Data Contract → Semantic View → RBAC → Cross-Domain Analytics 的完整落地路径。


概述

传统集中式数仓面临的核心问题,以及云器 Lakehouse 在 Data Mesh 架构下的解决方案:

问题云器解决方案
各业务团队争抢同一 Schema,互相干扰每个域独立 Schema(如
best_practice_data_mesh_sales
best_practice_data_mesh_sales
),完全自主
消费方直接查询源表,口径不统一Semantic View 封装业务逻辑,作为数据产品的稳定接口
谁能看哪些数据无法精细控制GRANT/REVOKE 到具体视图,Role 粒度授权
跨域 JOIN 需要数据搬移或人工对齐Dynamic Table 直接跨域 Schema 联邦查询,无需复制数据
数据产品被谁用了多少次不透明
sys.information_schema.job_history
sys.information_schema.job_history
追踪查询频率与 SLA

涉及的 SQL 命令

命令 / 函数用途说明
CREATE SCHEMA
CREATE SCHEMA
为每个业务域创建独立 Schema域间隔离的基础
CREATE TABLE
CREATE TABLE
在域 Schema 下建源数据表,列注释作为数据合同
COMMENT
COMMENT
字段即数据合同
CREATE VIEW
CREATE VIEW
在数据产品层创建 Semantic View封装业务逻辑,消费方查此视图
CREATE DYNAMIC TABLE
CREATE DYNAMIC TABLE
跨域联合聚合,自动增量刷新跨 Schema 引用无需数据复制
GRANT SELECT ON VIEW
GRANT SELECT ON VIEW
将视图访问权限授予角色数据产品所有者控制谁能消费
REVOKE SELECT ON VIEW
REVOKE SELECT ON VIEW
撤销访问权限数据产品下线或访客变更时使用
SHOW GRANTS ON VIEW
SHOW GRANTS ON VIEW
查看当前视图的权限列表审计数据产品消费方
REFRESH DYNAMIC TABLE
REFRESH DYNAMIC TABLE
手动触发跨域聚合刷新初次构建或调试时使用

前置准备

本文在以下四个 Schema 下运行,各域 Schema 完全独立,数据产品层独占一个 Schema:

-- 数据产品层(统一接口层) CREATE SCHEMA IF NOT EXISTS best_practice_data_mesh; -- 各业务域 Schema CREATE SCHEMA IF NOT EXISTS best_practice_data_mesh_sales; CREATE SCHEMA IF NOT EXISTS best_practice_data_mesh_hr; CREATE SCHEMA IF NOT EXISTS best_practice_data_mesh_finance;


Sales 域:建表与数据合同

建源数据表

Sales 域自主管理两张表。列注释(

COMMENT
COMMENT
)是数据合同的核心表达方式,消费方通过读注释理解字段语义,而不需要口头约定。

CREATE TABLE IF NOT EXISTS best_practice_data_mesh_sales.doc_sales_orders ( order_id STRING COMMENT 'Unique order identifier', customer_id STRING COMMENT 'Customer identifier', sales_rep_id STRING COMMENT 'Sales representative identifier', region STRING COMMENT 'Sales region', order_date DATE COMMENT 'Date the order was placed', status STRING COMMENT 'Order status: pending/confirmed/shipped/cancelled', total_amount DOUBLE COMMENT 'Total order amount in USD', currency STRING COMMENT 'Currency code' ) COMMENT 'Sales domain: order header table (data contract v1.0)'; CREATE TABLE IF NOT EXISTS best_practice_data_mesh_sales.doc_sales_items ( item_id STRING COMMENT 'Unique line item identifier', order_id STRING COMMENT 'Reference to doc_sales_orders.order_id', product_id STRING COMMENT 'Product identifier', product_name STRING COMMENT 'Product display name', category STRING COMMENT 'Product category', quantity INT COMMENT 'Ordered quantity', unit_price DOUBLE COMMENT 'Unit price at time of order', line_total DOUBLE COMMENT 'quantity * unit_price' ) COMMENT 'Sales domain: order line items table (data contract v1.0)';

表级

COMMENT
COMMENT
末尾写版本号(
data contract v1.0
data contract v1.0
),方便后续通过
DESC TABLE
DESC TABLE
查询合同版本,判断消费方是否需要适配。

写入模拟数据

从本地 CSV 导入数据(推荐):

-- 第一步:通过 SQL PUT 将本地 CSV 文件上传到 User Volume PUT '/path/to/your/doc_sales_orders.csv' TO USER VOLUME FILE 'doc_sales_orders.csv';

-- 第二步:从 User Volume COPY INTO 表 COPY INTO best_practice_data_mesh_sales.doc_sales_orders FROM USER VOLUME USING csv OPTIONS('header'='true', 'sep'=',', 'nullValue'='') FILES ('doc_sales_orders.csv');

也可直接内联插入小批量测试数据(不需要 CSV 文件):

INSERT INTO best_practice_data_mesh_sales.doc_sales_orders VALUES ('ORD001','CUST001','REP001','APAC', CAST('2026-01-05' AS DATE),'shipped', 15200.00,'USD'), ('ORD002','CUST002','REP001','APAC', CAST('2026-01-08' AS DATE),'shipped', 8500.00,'USD'), ('ORD003','CUST003','REP002','EMEA', CAST('2026-01-12' AS DATE),'confirmed', 23000.00,'USD'), ('ORD004','CUST004','REP003','AMER', CAST('2026-01-15' AS DATE),'cancelled', 4200.00,'USD'), ('ORD005','CUST005','REP002','EMEA', CAST('2026-01-20' AS DATE),'shipped', 31500.00,'USD'), ('ORD006','CUST001','REP001','APAC', CAST('2026-02-03' AS DATE),'shipped', 9800.00,'USD'), ('ORD007','CUST006','REP004','AMER', CAST('2026-02-10' AS DATE),'pending', 6750.00,'USD'), ('ORD008','CUST007','REP003','AMER', CAST('2026-02-14' AS DATE),'shipped', 18400.00,'USD'), ('ORD009','CUST008','REP002','EMEA', CAST('2026-02-18' AS DATE),'confirmed', 11200.00,'USD'), ('ORD010','CUST009','REP005','APAC', CAST('2026-02-25' AS DATE),'shipped', 27600.00,'USD'), ('ORD011','CUST010','REP004','AMER', CAST('2026-03-02' AS DATE),'shipped', 5300.00,'USD'), ('ORD012','CUST002','REP001','APAC', CAST('2026-03-07' AS DATE),'shipped', 14100.00,'USD'); INSERT INTO best_practice_data_mesh_sales.doc_sales_items VALUES ('ITEM001','ORD001','PROD001','Laptop Pro 15','Electronics', 2, 4200.00, 8400.00), ('ITEM002','ORD001','PROD002','USB-C Hub', 'Accessories', 5, 160.00, 800.00), ('ITEM003','ORD001','PROD003','Laptop Bag', 'Accessories', 2, 80.00, 160.00), ('ITEM004','ORD002','PROD004','Wireless Mouse','Peripherals', 10, 85.00, 850.00), ('ITEM005','ORD003','PROD001','Laptop Pro 15', 'Electronics', 4, 4200.00, 16800.00), ('ITEM006','ORD003','PROD005','Monitor 27in', 'Electronics', 2, 3100.00, 6200.00), ('ITEM007','ORD005','PROD001','Laptop Pro 15', 'Electronics', 6, 4200.00, 25200.00), ('ITEM008','ORD005','PROD003','Laptop Bag', 'Accessories', 6, 80.00, 480.00), ('ITEM009','ORD006','PROD002','USB-C Hub', 'Accessories', 3, 160.00, 480.00), ('ITEM010','ORD006','PROD004','Wireless Mouse','Peripherals', 8, 85.00, 680.00), ('ITEM011','ORD008','PROD005','Monitor 27in', 'Electronics', 4, 3100.00, 12400.00), ('ITEM012','ORD010','PROD001','Laptop Pro 15', 'Electronics', 5, 4200.00, 21000.00), ('ITEM013','ORD010','PROD006','Keyboard MX', 'Peripherals', 10, 195.00, 1950.00), ('ITEM014','ORD011','PROD004','Wireless Mouse','Peripherals', 5, 85.00, 425.00), ('ITEM015','ORD012','PROD001','Laptop Pro 15', 'Electronics', 2, 4200.00, 8400.00);

验证 Sales 域数据量:

SELECT COUNT(*) AS order_count FROM best_practice_data_mesh_sales.doc_sales_orders; SELECT COUNT(*) AS item_count FROM best_practice_data_mesh_sales.doc_sales_items;

order_count ----------- 12 item_count ---------- 15


HR 域:建表与数据合同

建源数据表

HR 域包含员工和部门两张表。

salary
salary
字段在源表中存储真实值,但不会直接暴露给消费方——数据产品层的 Semantic View 会剔除该字段,只输出组织结构维度。

CREATE TABLE IF NOT EXISTS best_practice_data_mesh_hr.doc_departments ( dept_id STRING COMMENT 'Unique department identifier', dept_name STRING COMMENT 'Department name', cost_center STRING COMMENT 'Finance cost center code', location STRING COMMENT 'Office location', head_count INT COMMENT 'Planned headcount' ) COMMENT 'HR domain: department master table (data contract v1.0)'; CREATE TABLE IF NOT EXISTS best_practice_data_mesh_hr.doc_employees ( employee_id STRING COMMENT 'Unique employee identifier', dept_id STRING COMMENT 'Reference to doc_departments.dept_id', full_name STRING COMMENT 'Employee full name', job_title STRING COMMENT 'Job title', hire_date DATE COMMENT 'Date of hire', salary DOUBLE COMMENT 'Annual salary in USD', status STRING COMMENT 'active / on_leave / terminated', manager_id STRING COMMENT 'Direct manager employee_id (nullable for top level)' ) COMMENT 'HR domain: employee master table (data contract v1.0)';

写入模拟数据

INSERT INTO best_practice_data_mesh_hr.doc_departments VALUES ('DEPT01','Sales', 'CC-SALE','Shanghai', 30), ('DEPT02','Engineering', 'CC-ENG', 'Beijing', 80), ('DEPT03','Finance', 'CC-FIN', 'Shanghai', 20), ('DEPT04','Human Resources','CC-HR', 'Shanghai', 15), ('DEPT05','Marketing', 'CC-MKT', 'Shenzhen', 25); INSERT INTO best_practice_data_mesh_hr.doc_employees VALUES ('EMP001','DEPT01','Alice Wang', 'Sales Manager', CAST('2020-03-01' AS DATE),180000.00,'active', NULL), ('EMP002','DEPT01','Bob Chen', 'Sales Rep', CAST('2021-06-15' AS DATE),110000.00,'active', 'EMP001'), ('EMP003','DEPT01','Carol Liu', 'Sales Rep', CAST('2021-09-20' AS DATE),108000.00,'active', 'EMP001'), ('EMP004','DEPT02','David Zhang', 'Engineering Lead', CAST('2019-01-10' AS DATE),220000.00,'active', NULL), ('EMP005','DEPT02','Eva Sun', 'Senior Engineer', CAST('2020-07-22' AS DATE),180000.00,'active', 'EMP004'), ('EMP006','DEPT02','Frank Zhao', 'Engineer', CAST('2022-04-05' AS DATE),140000.00,'active', 'EMP004'), ('EMP007','DEPT03','Grace Li', 'Finance Manager', CAST('2018-11-01' AS DATE),190000.00,'active', NULL), ('EMP008','DEPT03','Henry Wu', 'Accountant', CAST('2021-02-28' AS DATE),120000.00,'active', 'EMP007'), ('EMP009','DEPT04','Ivy Zhou', 'HR Manager', CAST('2019-05-15' AS DATE),160000.00,'active', NULL), ('EMP010','DEPT04','Jack Luo', 'HR Specialist', CAST('2022-08-01' AS DATE),100000.00,'on_leave','EMP009'), ('EMP011','DEPT05','Karen Xu', 'Marketing Manager', CAST('2020-10-12' AS DATE),170000.00,'active', NULL), ('EMP012','DEPT01','Leo Ma', 'Sales Rep', CAST('2023-01-16' AS DATE),105000.00,'active', 'EMP001');


Finance 域:建表与数据合同

建源数据表

Finance 域管理发票和收款两张表,通过

invoice_id
invoice_id
关联,构成应收账款(AR)体系。

CREATE TABLE IF NOT EXISTS best_practice_data_mesh_finance.doc_invoices ( invoice_id STRING COMMENT 'Unique invoice identifier', order_id STRING COMMENT 'Reference to sales domain order_id', customer_id STRING COMMENT 'Customer identifier', issue_date DATE COMMENT 'Invoice issue date', due_date DATE COMMENT 'Payment due date', amount DOUBLE COMMENT 'Invoice amount in USD', status STRING COMMENT 'draft / issued / paid / overdue' ) COMMENT 'Finance domain: invoice header table (data contract v1.0)'; CREATE TABLE IF NOT EXISTS best_practice_data_mesh_finance.doc_payments ( payment_id STRING COMMENT 'Unique payment identifier', invoice_id STRING COMMENT 'Reference to doc_invoices.invoice_id', payment_date DATE COMMENT 'Date payment was received', amount_paid DOUBLE COMMENT 'Amount paid in USD', method STRING COMMENT 'Payment method: bank_transfer / credit_card / check', status STRING COMMENT 'completed / failed / pending' ) COMMENT 'Finance domain: payment records table (data contract v1.0)';

写入模拟数据

INSERT INTO best_practice_data_mesh_finance.doc_invoices VALUES ('INV001','ORD001','CUST001', CAST('2026-01-06' AS DATE), CAST('2026-02-06' AS DATE), 15200.00,'paid'), ('INV002','ORD002','CUST002', CAST('2026-01-09' AS DATE), CAST('2026-02-09' AS DATE), 8500.00,'paid'), ('INV003','ORD003','CUST003', CAST('2026-01-13' AS DATE), CAST('2026-02-13' AS DATE), 23000.00,'paid'), ('INV004','ORD005','CUST005', CAST('2026-01-21' AS DATE), CAST('2026-02-21' AS DATE), 31500.00,'overdue'), ('INV005','ORD006','CUST001', CAST('2026-02-04' AS DATE), CAST('2026-03-04' AS DATE), 9800.00,'paid'), ('INV006','ORD008','CUST007', CAST('2026-02-15' AS DATE), CAST('2026-03-15' AS DATE), 18400.00,'issued'), ('INV007','ORD009','CUST008', CAST('2026-02-19' AS DATE), CAST('2026-03-19' AS DATE), 11200.00,'paid'), ('INV008','ORD010','CUST009', CAST('2026-02-26' AS DATE), CAST('2026-03-26' AS DATE), 27600.00,'issued'), ('INV009','ORD011','CUST010', CAST('2026-03-03' AS DATE), CAST('2026-04-03' AS DATE), 5300.00,'paid'), ('INV010','ORD012','CUST002', CAST('2026-03-08' AS DATE), CAST('2026-04-08' AS DATE), 14100.00,'draft'); INSERT INTO best_practice_data_mesh_finance.doc_payments VALUES ('PAY001','INV001', CAST('2026-01-28' AS DATE), 15200.00,'bank_transfer','completed'), ('PAY002','INV002', CAST('2026-02-01' AS DATE), 8500.00,'credit_card', 'completed'), ('PAY003','INV003', CAST('2026-02-10' AS DATE), 23000.00,'bank_transfer','completed'), ('PAY004','INV005', CAST('2026-02-25' AS DATE), 9800.00,'bank_transfer','completed'), ('PAY005','INV007', CAST('2026-03-05' AS DATE), 11200.00,'credit_card', 'completed'), ('PAY006','INV009', CAST('2026-03-20' AS DATE), 5300.00,'check', 'completed');


数据产品层:Semantic View

数据产品层(

best_practice_data_mesh
best_practice_data_mesh
)是消费方唯一的访问入口。每个数据产品是一个 Semantic View,封装业务过滤、JOIN、计算逻辑,消费方只看到已"净化"的字段,不感知域内的物理表结构。

Sales 数据产品:订单收入视图

CREATE OR REPLACE VIEW best_practice_data_mesh.dp_sales_revenue AS SELECT o.order_id, o.customer_id, o.region, o.order_date, o.status, o.total_amount, i.product_name, i.category, i.quantity, i.unit_price, i.line_total FROM best_practice_data_mesh_sales.doc_sales_orders o JOIN best_practice_data_mesh_sales.doc_sales_items i ON o.order_id = i.order_id WHERE o.status NOT IN ('cancelled');

这个视图做了两件事:JOIN 合并订单头和行项目,同时过滤掉

cancelled
cancelled
状态的订单。消费方查询时无需关心源表的 JOIN 逻辑,也不会意外包含已取消订单影响收入统计。

验证分区域收入:

SELECT region, COUNT(*) AS orders, ROUND(SUM(total_amount), 2) AS revenue FROM best_practice_data_mesh.dp_sales_revenue GROUP BY region ORDER BY revenue DESC;

region | orders | revenue -------+--------+--------- APAC | 9 | 143000 EMEA | 4 | 109000 AMER | 2 | 23700

按品类看销售额分布:

SELECT category, COUNT(*) AS item_lines, ROUND(SUM(line_total), 2) AS category_revenue FROM best_practice_data_mesh.dp_sales_revenue GROUP BY category ORDER BY category_revenue DESC;

category | item_lines | category_revenue ------------+------------+------------------ Electronics | 7 | 98400 Peripherals | 4 | 3905 Accessories | 4 | 1920

Electronics 品类贡献了 98,400 USD 收入,占比超过 94%,是销售额的绝对主力。

HR 数据产品:组织结构视图

HR 域对外只暴露组织维度,

salary
salary
字段不出现在数据产品视图中。同时只返回
status = 'active'
status = 'active'
的在职员工,
on_leave
on_leave
terminated
terminated
记录在域内保留。

CREATE OR REPLACE VIEW best_practice_data_mesh.dp_hr_org AS SELECT e.employee_id, e.dept_id, d.dept_name, d.cost_center, d.location, e.job_title, e.hire_date, e.status, e.manager_id FROM best_practice_data_mesh_hr.doc_employees e JOIN best_practice_data_mesh_hr.doc_departments d ON e.dept_id = d.dept_id WHERE e.status = 'active';

验证各部门在职人数:

SELECT dept_name, COUNT(*) AS headcount, location FROM best_practice_data_mesh.dp_hr_org GROUP BY dept_name, location ORDER BY headcount DESC;

dept_name | headcount | location -----------------+-----------+--------- Sales | 4 | Shanghai Engineering | 3 | Beijing Finance | 2 | Shanghai Marketing | 1 | Shenzhen Human Resources | 1 | Shanghai

Finance 数据产品:应收账款视图

Finance 域将发票和收款两张表聚合,对外暴露每张发票的应收余额和 AR 状态。

CREATE OR REPLACE VIEW best_practice_data_mesh.dp_finance_ar AS SELECT inv.invoice_id, inv.order_id, inv.customer_id, inv.issue_date, inv.due_date, inv.amount, inv.status AS invoice_status, COALESCE(SUM(pay.amount_paid), 0) AS total_paid, inv.amount - COALESCE(SUM(pay.amount_paid), 0) AS outstanding_balance, CASE WHEN inv.status = 'paid' THEN 'CLOSED' WHEN inv.status = 'overdue' THEN 'OVERDUE' WHEN CURRENT_DATE() > inv.due_date THEN 'OVERDUE' ELSE 'OPEN' END AS ar_status FROM best_practice_data_mesh_finance.doc_invoices inv LEFT JOIN best_practice_data_mesh_finance.doc_payments pay ON inv.invoice_id = pay.invoice_id GROUP BY inv.invoice_id, inv.order_id, inv.customer_id, inv.issue_date, inv.due_date, inv.amount, inv.status;

验证应收账款汇总:

SELECT ar_status, COUNT(*) AS cnt, ROUND(SUM(outstanding_balance), 2) AS total_outstanding FROM best_practice_data_mesh.dp_finance_ar GROUP BY ar_status ORDER BY total_outstanding DESC;

ar_status | cnt | total_outstanding ----------+-----+------------------ OVERDUE | 4 | 91600 CLOSED | 6 | 0

当前有 4 张逾期发票,合计未收金额 91,600 USD,是财务团队需要重点跟进的 AR 风险。

查看明细:

SELECT invoice_id, customer_id, amount, total_paid, outstanding_balance, ar_status FROM best_practice_data_mesh.dp_finance_ar ORDER BY ar_status, outstanding_balance DESC;

invoice_id | customer_id | amount | total_paid | outstanding_balance | ar_status -----------+-------------+---------+------------+---------------------+---------- INV004 | CUST005 | 31500 | 0 | 31500 | OVERDUE INV008 | CUST009 | 27600 | 0 | 27600 | OVERDUE INV006 | CUST007 | 18400 | 0 | 18400 | OVERDUE INV010 | CUST002 | 14100 | 0 | 14100 | OVERDUE INV001 | CUST001 | 15200 | 15200 | 0 | CLOSED INV002 | CUST002 | 8500 | 8500 | 0 | CLOSED INV003 | CUST003 | 23000 | 23000 | 0 | CLOSED INV005 | CUST001 | 9800 | 9800 | 0 | CLOSED INV007 | CUST008 | 11200 | 11200 | 0 | CLOSED INV009 | CUST010 | 5300 | 5300 | 0 | CLOSED


RBAC:权限控制

数据产品所有者通过

GRANT SELECT ON VIEW
GRANT SELECT ON VIEW
控制谁能访问哪个数据产品。授权对象是视图,而非源表——消费方即使拥有 Schema 访问权限,也无法绕过视图直接查询源表(源表的 SELECT 权限未授出)。

授权数据产品给消费角色

-- 将三个数据产品视图授权给分析师角色 GRANT SELECT ON VIEW best_practice_data_mesh.dp_sales_revenue TO ROLE workspace_analyst; GRANT SELECT ON VIEW best_practice_data_mesh.dp_hr_org TO ROLE workspace_analyst; GRANT SELECT ON VIEW best_practice_data_mesh.dp_finance_ar TO ROLE workspace_analyst;

验证授权结果

SHOW GRANTS ON VIEW best_practice_data_mesh.dp_sales_revenue;

granted_type | privilege | granted_on | object_name | grantee_name | granted_time -------------+----------------+------------+------------------------------------------------+---------------------------+------------- PRIVILEGE | SELECT VIEW | VIEW | quick_start.best_practice_data_mesh.dp_sales_revenue | quick_start.workspace_analyst | 2026-06-06 ...

撤销权限

当数据产品下线或消费方需要变更权限时:

REVOKE SELECT ON VIEW best_practice_data_mesh.dp_sales_revenue FROM ROLE workspace_analyst;


跨域联合分析

跨域关联查询:订单 + 应收款状态

不同域的数据产品可以直接 JOIN,无需复制数据。Finance 团队查看哪些订单存在欠款风险:

CREATE OR REPLACE VIEW best_practice_data_mesh.dp_cross_domain_order_ar AS SELECT s.order_id, s.customer_id, s.region, s.order_date, ROUND(s.total_amount, 2) AS order_amount, f.invoice_status, ROUND(f.total_paid, 2) AS total_paid, ROUND(f.outstanding_balance, 2) AS outstanding_balance, f.ar_status FROM best_practice_data_mesh.dp_sales_revenue s LEFT JOIN best_practice_data_mesh.dp_finance_ar f ON s.order_id = f.order_id;

查询逾期或开放中的订单:

SELECT order_id, customer_id, order_amount, invoice_status, outstanding_balance, ar_status FROM best_practice_data_mesh.dp_cross_domain_order_ar WHERE ar_status IN ('OVERDUE', 'OPEN') ORDER BY outstanding_balance DESC;

order_id | customer_id | order_amount | invoice_status | outstanding_balance | ar_status ---------+-------------+--------------+----------------+---------------------+---------- ORD005 | CUST005 | 31500 | overdue | 31500 | OVERDUE ORD010 | CUST009 | 27600 | issued | 27600 | OVERDUE ORD008 | CUST007 | 18400 | issued | 18400 | OVERDUE ORD012 | CUST002 | 14100 | draft | 14100 | OVERDUE

CUST005 的 ORD005(31,500 USD)已逾期且未收款,是最高风险账款。

跨域 Dynamic Table:销售代表绩效

跨域 Dynamic Table 直接引用 Sales 和 HR 两个域的源表,自动增量聚合,无需手动 ETL。

CREATE DYNAMIC TABLE IF NOT EXISTS best_practice_data_mesh.dp_sales_rep_performance AS SELECT o.sales_rep_id, emp.full_name AS rep_name, emp.dept_id, o.region, COUNT(DISTINCT o.order_id) AS order_count, ROUND(SUM(o.total_amount), 2) AS total_revenue, ROUND(AVG(o.total_amount), 2) AS avg_order_value, SUM(CASE WHEN o.status = 'shipped' THEN 1 ELSE 0 END) AS shipped_count FROM best_practice_data_mesh_sales.doc_sales_orders o LEFT JOIN best_practice_data_mesh_hr.doc_employees emp ON o.sales_rep_id = emp.employee_id WHERE o.status != 'cancelled' GROUP BY o.sales_rep_id, emp.full_name, emp.dept_id, o.region;

手动触发首次刷新:

REFRESH DYNAMIC TABLE best_practice_data_mesh.dp_sales_rep_performance;

查看刷新结果:

SELECT sales_rep_id, rep_name, region, order_count, total_revenue, avg_order_value, shipped_count FROM best_practice_data_mesh.dp_sales_rep_performance ORDER BY total_revenue DESC;

sales_rep_id | rep_name | region | order_count | total_revenue | avg_order_value | shipped_count -------------+----------+--------+-------------+---------------+-----------------+-------------- REP002 | null | EMEA | 3 | 65700 | 21900 | 1 REP001 | null | APAC | 4 | 47600 | 11900 | 4 REP005 | null | APAC | 1 | 27600 | 27600 | 1 REP003 | null | AMER | 1 | 18400 | 18400 | 1 REP004 | null | AMER | 2 | 12050 | 6025 | 1

REP002 和 REP001 是 EMEA 和 APAC 的主力销售,分别贡献 65,700 USD 和 47,600 USD。

配置定期刷新(Studio Task)

Dynamic Table 的调度刷新通过在 Studio 中创建任务管理,不在 DDL 里写

REFRESH INTERVAL
REFRESH INTERVAL
。操作路径:

  1. 进入 开发 -> 任务,选择
    skill_test
    skill_test
    profile
  2. 在路径
    best_practices/data_mesh/
    best_practices/data_mesh/
    下新建任务,命名
    refresh_dp_sales_rep_performance
    refresh_dp_sales_rep_performance
  3. 任务 SQL 填写:
    REFRESH DYNAMIC TABLE best_practice_data_mesh.dp_sales_rep_performance;
    REFRESH DYNAMIC TABLE best_practice_data_mesh.dp_sales_rep_performance;
  4. 配置 Cron 调度(如每小时:
    0 * * * *
    0 * * * *
  5. 在同一任务上附加数据质量规则(如
    order_count > 0
    order_count > 0
    )和告警通知

数据产品使用监控

通过

sys.information_schema.job_history
sys.information_schema.job_history
追踪数据产品的查询频率和 SLA 合规情况:

SELECT SUBSTR(job_text, 1, 60) AS sql_preview, COUNT(*) AS query_count, ROUND(AVG(execution_time), 3) AS avg_elapsed_sec, MAX(rows_produced) AS max_rows_produced, SUM(CASE WHEN status = 'SUCCEED' THEN 1 ELSE 0 END) AS success_count FROM sys.information_schema.job_history WHERE workspace_name = 'quick_start' AND pt_date = CURRENT_DATE() AND job_text LIKE '%best_practice_data_mesh%' GROUP BY SUBSTR(job_text, 1, 60) ORDER BY query_count DESC LIMIT 10;

sql_preview | query_count | avg_elapsed_sec | max_rows_produced | success_count -------------------------------------------------------------+-------------+-----------------+-------------------+-------------- INSERT INTO best_practice_data_mesh_sales.doc_sales_orders V | 2 | 2.649 | 0 | 1 CREATE TABLE IF NOT EXISTS best_practice_data_mesh_finance.d | 2 | 0.061 | 0 | 2 CREATE TABLE IF NOT EXISTS best_practice_data_mesh_sales.doc | 2 | 0.053 | 0 | 2 SELECT invoice_id, customer_id, amount, total_paid, outstand | 1 | 0.318 | 16 | 1 CREATE SCHEMA IF NOT EXISTS best_practice_data_mesh_sales | 1 | 0.017 | 0 | 1 SELECT order_id, customer_id, order_amount, invoice_status, | 1 | 0.300 | 36 | 1 CREATE OR REPLACE VIEW best_practice_data_mesh.dp_sales_reve | 1 | 0.047 | 0 | 1

通过

job_text LIKE '%dp_%'
job_text LIKE '%dp_%'
筛选数据产品视图的查询记录,可以统计每个数据产品的日均查询次数、P99 响应时间,作为 SLA 报告的输入依据。


对象总览

best_practice_data_mesh
best_practice_data_mesh
数据产品层的所有对象:

SHOW TABLES IN best_practice_data_mesh;

schema_name | table_name | is_dynamic -----------------------------+-------------------------------+----------- best_practice_data_mesh | dp_finance_ar | false best_practice_data_mesh | dp_hr_org | false best_practice_data_mesh | dp_sales_rep_performance | true best_practice_data_mesh | dp_sales_revenue | false best_practice_data_mesh | dp_cross_domain_order_ar | false

三个 Semantic View(

is_dynamic = false
is_dynamic = false
)对应各域的单域数据产品,一个 Dynamic Table(
dp_sales_rep_performance
dp_sales_rep_performance
)负责跨域聚合,一个跨域 View(
dp_cross_domain_order_ar
dp_cross_domain_order_ar
)支持 Sales + Finance 的联合查询。


注意事项

  • 源表不授权给消费方:GRANT 只授权到数据产品视图,域内源表的 SELECT 权限只有域所有者持有。如果某角色通过继承权限可以访问源 Schema,需要在 Schema 级别显式 REVOKE 或通过角色设计隔离。

  • 数据合同版本化:表级

    COMMENT
    COMMENT
    末尾加版本号(
    data contract v1.0
    data contract v1.0
    ),列
    COMMENT
    COMMENT
    变更时同步更新版本号,并通知已授权的消费方。消费方在
    DESC TABLE
    DESC TABLE
    SHOW TABLES
    SHOW TABLES
    时可以看到合同版本,主动适配。

  • 跨域 ID 对齐是隐性耦合点:不同域使用不同的实体编码规则(如

    REP001
    REP001
    vs
    EMP001
    EMP001
    )时,跨域 JOIN 必须通过数据合同明确对齐路径,或由某一方提供 ID 映射视图。这是 Data Mesh 落地中最常见的摩擦点,需要跨域协商,不能单方面假设。

  • Dynamic Table 跨域引用的权限

    dp_sales_rep_performance
    dp_sales_rep_performance
    引用
    best_practice_data_mesh_sales
    best_practice_data_mesh_sales
    best_practice_data_mesh_hr
    best_practice_data_mesh_hr
    两个域的表,创建 DT 的用户需要对两个域的源表都有 SELECT 权限。如果 DT 由数据产品层的技术团队创建,需要提前向两个域申请只读权限。

  • 视图层级与性能

    dp_cross_domain_order_ar
    dp_cross_domain_order_ar
    查询的是
    dp_sales_revenue
    dp_sales_revenue
    dp_finance_ar
    dp_finance_ar
    两个视图,每次查询会展开两层视图再执行。如果查询频率高且性能成为瓶颈,可将跨域联合视图改为 Dynamic Table,以物化结果换取查询速度。


相关文档

联系我们
预约咨询
微信咨询
电话咨询
邮件咨询