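Helps you build end-to-end bronze, silver, and gold lakehouse layers and data pipelines in Microsoft Fabric.
Copy the install instructions and let the AI complete the setup automatically · Recommended for beginners
Please help me install the "e2e-medallion-architecture" skill from askskill: 1. Download https://raw.githubusercontent.com/microsoft/skills-for-fabric/main/skills/e2e-medallion-architecture/SKILL.md 2. Save it as ~/.claude/skills/e2e-medallion-architecture/SKILL.md 3. Once it is installed, reload the skills and tell me it is ready to use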
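Please design an end-to-end Medallion Architecture solution for Microsoft Fabric, covering the division of responsibilities across the Bronze, Silver, and Gold layers, the Lakehouse and Workspace structure for each layer, the data flow paths, and the PySpark, Delta Lake, and Pipeline components that fit.
A clear layered lakehouse architecture plan that explains each layer's responsibilities, the resource layout, and the choice of technical components.
Please generate a plan for implementing Bronze→Silver→Gold data processing in Microsoft Fabric: use PySpark Notebooks for raw data ingestion, cleaning and standardization, and business aggregation; manage tables with Delta Lake; and orchestrate the execution order with Fabric Pipelines.
A set of end-to-end implementation steps from ingestion to analytics-layer output, including notebooks, table design, and the orchestration flow.
Please optimize an existing Microsoft Fabric Medallion Architecture by providing, for each of the Bronze, Silver, and Gold layers, data quality validation rules, an exception handling strategy, and suitable Spark configuration and performance tuning recommendations.
Data quality governance and Spark optimization recommendations organized by layer, aimed at improving stability and processing efficiency.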
Update Check: ONCE PER SESSION (mandatory)
The first time this skill is used in a session, run the check-updates skill before proceeding.
- GitHub Copilot CLI / VS Code: invoke the check-updates skill.
- Claude Code / Cowork / Cursor / Windsurf / Codex: compare local vs remote package.json version.
- Skip if the check was already performed earlier in this session.
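The local-vs-remote comparison could look roughly like the sketch below. It assumes both package.json files carry a plain dotted numeric `version` field; the function names are hypothetical and not part of the skill, and fetching the remote file is left out.

```python
import json
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.4.2' into (1, 4, 2); a leading 'v' and any '-suffix' are ignored."""
    core = text.strip().lstrip("v").split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def update_available(local_package_json: str, remote_package_json: str) -> bool:
    """Return True when the remote package.json declares a newer version."""
    local = parse_version(json.loads(local_package_json)["version"])
    remote = parse_version(json.loads(remote_package_json)["version"])
    # Tuples compare element by element, so 1.10.0 correctly sorts after 1.2.0.
    return remote > local
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where "1.10.0" sorts before "1.2.0".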
CRITICAL NOTES
- To find the workspace details (including its ID) from a workspace name: list all workspaces, then filter the result with JMESPath
- To find the item details (including its ID) from a workspace ID, item type, and item name: list all items of that type in that workspace, then filter the result with JMESPath
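The two lookups above amount to "list, then filter by name". The sketch below shows that filter in plain Python next to the JMESPath expression it mirrors. It assumes the listing response wraps results in a `value` array whose entries have `id` and `displayName` fields; the sample payload and the helper name are hypothetical.

```python
from typing import Any, Dict, Optional


def find_id_by_name(listing: Dict[str, Any], name: str) -> Optional[str]:
    """Plain-Python equivalent of the JMESPath query
    value[?displayName=='<name>'] | [0].id
    """
    matches = [
        entry for entry in listing.get("value", [])
        if entry.get("displayName") == name
    ]
    return matches[0]["id"] if matches else None


# Hypothetical, trimmed "list workspaces" response (IDs are made up).
workspaces = {
    "value": [
        {"id": "ws-001", "displayName": "Sales-Dev"},
        {"id": "ws-002", "displayName": "Sales-Prod"},
    ]
}

workspace_id = find_id_by_name(workspaces, "Sales-Prod")
```

The same helper works for the item lookup, since the listing there is already restricted to one workspace and one item type before the name filter is applied.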
Read these companion documents — they contain the foundational context this skill depends on:
- az rest, az login, token acquisition, Fabric REST via CLI
- .ipynb structure requirements, cell format, getDefinition/updateDefinition workflow

For Spark-specific optimization details, see data-engineering-patterns.md.
Medallion Architecture is a data lakehouse pattern with three progressive layers:
| Layer | Purpose | Optimization Profile | Use Case |
|---|---|---|---|
| Bronze (Raw) | Land raw data exactly as received | Write-optimized, append-only, partitioned by ingestion date | Audit trail, reprocessing, lineage |
| Silver (Cleaned) | Deduplicated, validated, conformed data | Balanced read/write, partitioned by business date | Feature engineering, operational reporting |
| Gold (Aggregated) | Pre-calculated metrics for analytics | Read-optimized (ZORDER, compaction), partitioned by month/year | Power BI reports, dashboards, ad-hoc analytics via SQL endpoint |
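To make the three layer roles in the table concrete, here is a minimal sketch of the flow in plain Python. It is a stand-in for the logic only: in Fabric these steps would normally be PySpark DataFrame operations writing to Delta tables, and the column names (`order_id`, `order_date`, `amount`) and function names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def to_bronze(raw_rows, ingestion_date):
    # Bronze: keep every row exactly as received; only tag it with the ingestion date.
    return [dict(row, _ingestion_date=ingestion_date.isoformat()) for row in raw_rows]


def to_silver(bronze_rows):
    # Silver: drop invalid rows, conform types, deduplicate on order_id (last row wins).
    latest = {}
    for row in bronze_rows:
        if row.get("order_id") is None or row.get("amount") is None:
            continue
        latest[row["order_id"]] = {
            "order_id": row["order_id"],
            "order_date": row["order_date"],
            "amount": float(row["amount"]),
        }
    return list(latest.values())


def to_gold(silver_rows):
    # Gold: pre-calculate total amount per month (YYYY-MM) for reporting.
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["order_date"][:7]] += row["amount"]
    return dict(totals)


raw = [
    {"order_id": 1, "order_date": "2024-01-15", "amount": "10.50"},
    {"order_id": 1, "order_date": "2024-01-15", "amount": "10.50"},  # duplicate
    {"order_id": 2, "order_date": "2024-01-20", "amount": "4.50"},
    {"order_id": None, "order_date": "2024-02-01", "amount": "99"},  # invalid
    {"order_id": 3, "order_date": "2024-02-03", "amount": "20"},
]

bronze = to_bronze(raw, date(2024, 2, 4))
silver = to_silver(bronze)
gold = to_gold(silver)
```

Bronze still holds all five rows, including the duplicate and the invalid one, which is what makes reprocessing and auditing possible; cleanup only starts in Silver.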
mergeSchema when sources change.
.ipynb validation + Fabric nuances in notebook-api-operations.md when creating notebooks via REST API: every code cell must include "outputs": [] and "execution_count": null…
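The code-cell requirement can be enforced before a notebook is uploaded. The sketch below assumes the notebook is already loaded as a standard .ipynb dictionary with a `cells` list; the helper name is hypothetical, and any further requirements described in notebook-api-operations.md are not covered here.

```python
import json
from typing import Any, Dict


def normalize_code_cells(notebook: Dict[str, Any]) -> Dict[str, Any]:
    """Give every code cell "outputs": [] and "execution_count": null.

    Existing outputs and execution counts are cleared; other cell types are untouched.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None  # serialized as null by json.dumps
    return notebook


# Hypothetical two-cell notebook: the code cell is missing both required keys.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Bronze load"]},
        {"cell_type": "code", "metadata": {}, "source": ["print('hello')"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

payload = json.dumps(normalize_code_cells(notebook))
```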
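Interactively analyze Lakehouse data through PySpark and Livy sessions, with support for advanced computation and cross-lakehouse queries.
Helps you develop Microsoft Fabric Spark workflows, write and debug Notebook code, and manage lakehouses and resources.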