spark-authoring-cli

Helps you develop Spark data engineering workflows, notebook code, and infrastructure configuration for Microsoft Fabric.

Stars

★ 420

Source

GitHub

Updated

2026-06-07

// Security assessment: low risk

Prompt only, no code execution
Open source and auditable
Community verified · 420


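// Installation

Copy the install instructions and let the AI complete the setup · recommended for beginners

Please install the "spark-authoring-cli" skill from askskill for me:
1. Download https://raw.githubusercontent.com/microsoft/skills-for-fabric/main/plugins/fabric-authoring/skills/spark-authoring-cli/SKILL.md
2. Save it as ~/.claude/skills/spark-authoring-cli/SKILL.md
3. Once it is installed, reload skills and tell me it is ready to use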


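// Usage examples

Write Fabric Notebook cell code

Input

Write a PySpark example for a Microsoft Fabric Notebook: read data from the sales table in a Lakehouse, aggregate sales by region, and save the result as a Delta table. Also explain what each code cell does.

Expected output

PySpark cell code that can be pasted directly into a Fabric Notebook, with a step-by-step explanation.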

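Design a data engineering pipeline

Input

I need to design a data engineering workflow in Fabric: raw CSV files land in Bronze, Silver, and Gold layers, with data cleansing, incremental processing, quality checks, and scheduling recommendations. Please propose a design for the Notebooks and the Pipeline.

Expected output

A workflow design that follows the Lakehouse layering pattern, the structure of the key Notebooks, and scheduling recommendations.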

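Configure workspaces and infrastructure

Input

Help me plan the workspace, Lakehouse, Notebook, and deployment structure for a Microsoft Fabric project, and suggest an infrastructure-as-code approach suited to team collaboration, including environment separation, naming conventions, and release process recommendations.

Expected output

A Fabric resource organization plan, environment and naming conventions, and deployment recommendations.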

// Documentation

Update Check: ONCE PER SESSION (mandatory)

The first time this skill is used in a session, run the check-updates skill before proceeding.

GitHub Copilot CLI / VS Code: invoke the check-updates skill.

Claude Code / Cowork / Cursor / Windsurf / Codex: compare local vs remote package.json version.

Skip if the check was already performed earlier in this session.
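The local-versus-remote comparison and the once-per-session guard can be sketched as follows. This is only an illustration, not the check-updates skill itself: the function names are hypothetical, and it assumes both package.json files carry a plain dotted numeric `version` string.

```python
import json

def parse_version(text):
    """Turn '1.4.2' or 'v1.4.2-beta' into (1, 4, 2). Assumes dotted integers."""
    core = text.strip().lstrip("v").split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))

def update_available(local_package_json, remote_package_json):
    """Compare the 'version' fields of two package.json documents (as strings)."""
    local = parse_version(json.loads(local_package_json)["version"])
    remote = parse_version(json.loads(remote_package_json)["version"])
    return remote > local

# Hypothetical session flag: the check runs at most once per session.
_checked_this_session = False

def check_once(local_package_json, remote_package_json):
    """Return True/False on the first call, None when the check is skipped."""
    global _checked_this_session
    if _checked_this_session:
        return None
    _checked_this_session = True
    return update_available(local_package_json, remote_package_json)
```

Comparing tuples of integers rather than raw strings avoids ranking "1.10.0" below "1.2.0".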

CRITICAL NOTES

To find a workspace's details (including its ID) from the workspace name: list all workspaces, then filter the result with JMESPath.

To find an item's details (including its ID) from the workspace ID, item type, and item name: list all items of that type in that workspace, then filter the result with JMESPath.
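The list-then-filter pattern in the two notes above might look like this in plain Python. It is a sketch under assumptions: the listing is taken to be a JSON object with a `value` array whose entries carry `id`, `displayName`, and (for items) `type`; the helper name and sample data are hypothetical. The comments show roughly equivalent JMESPath expressions.

```python
def find_by_name(listing, display_name, item_type=None):
    """Return the first entry matching display_name (and item_type, if given)."""
    matches = [
        entry
        for entry in listing.get("value", [])
        if entry.get("displayName") == display_name
        and (item_type is None or entry.get("type") == item_type)
    ]
    return matches[0] if matches else None

# Hypothetical response bodies, shaped like the assumed list results.
workspaces = {"value": [
    {"id": "ws-001", "displayName": "Sales Analytics"},
    {"id": "ws-002", "displayName": "Finance"},
]}
items = {"value": [
    {"id": "it-010", "displayName": "etl_daily", "type": "Notebook"},
    {"id": "it-011", "displayName": "etl_daily", "type": "Lakehouse"},
]}

# JMESPath equivalent: value[?displayName=='Sales Analytics'] | [0]
workspace = find_by_name(workspaces, "Sales Analytics")

# JMESPath equivalent: value[?displayName=='etl_daily' && type=='Notebook'] | [0]
notebook = find_by_name(items, "etl_daily", item_type="Notebook")
```

Returning `None` for a missing name lets the caller fail early instead of passing an empty ID into a later call.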

Spark Authoring — CLI Skill

This skill covers two complementary areas: (1) managing Fabric Spark artifacts via REST APIs (workspaces, lakehouses, notebooks, jobs, pipelines) and (2) writing code inside Fabric Notebook cells (PySpark, Scala, SparkR, SQL with correct lakehouse access, notebookutils, and Spark configuration). For notebook code authoring fundamentals and shared modules, MUST see SPARK-NOTEBOOK-AUTHORING-CORE.md.

| Task | Reference | Notes |
| --- | --- | --- |
| RULES: read these first, follow them always | SKILL.md § RULES | MUST read: 4 rules for this skill |
| Finding Workspaces and Items in Fabric | COMMON-CLI.md § Finding Workspaces and Items in Fabric | Mandatory: READ link first (needed for finding a workspace ID by its name, or an item ID by its name, item type, and workspace ID) |
| Fabric Topology & Key Concepts | COMMON-CORE.md § Fabric Topology & Key Concepts | |
| Environment URLs | COMMON-CORE.md § Environment URLs | |
| Authentication & Token Acquisition | COMMON-CORE.md § Authentication & Token Acquisition | Wrong audience = 401; read before any auth issue |
| Core Control-Plane REST APIs | COMMON-CORE.md § Core Control-Plane REST APIs | |
| Pagination | COMMON-CORE.md § Pagination | |
| Long-Running Operations (LRO) | COMMON-CORE.md § Long-Running Operations (LRO) | |
| Rate Limiting & Throttling | COMMON-CORE.md § Rate Limiting & Throttling | |
| OneLake Data Access | COMMON-CORE.md § OneLake Data Access | Requires `storage.azure.com` token, not Fabric token |
| Definition Envelope | ITEM-DEFINITIONS-CORE.md § Definition Envelope | Definition payload structure |
| Per-Item-Type Definitions | ITEM-DEFINITIONS-CORE.md § Per-Item-Type Definitions | Support matrix, decoded content, part paths; REST specs, CLI recipes |
| Job Execution | COMMON-CORE.md § Job Execution | |
| Capacity Management | COMMON-CORE.md § Capacity Management | |
| Gotchas & Troubleshooting | COMMON-CORE.md § Gotchas & Troubleshooting | |
| Best Practices | COMMON-CORE.md § Best Practices | |
| Tool Selection Rationale | COMMON-CLI.md § Tool Selection Rationale | |
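The two audience notes in the table (wrong audience = 401, and OneLake needing a `storage.azure.com` token) can be illustrated by choosing the token scope from the target URL. This is a minimal sketch, not taken from COMMON-CORE.md: the helper name is hypothetical, and the host names and `/.default` scope strings are assumptions based on the public Fabric and OneLake endpoints.

```python
from urllib.parse import urlparse

# Assumed scopes; confirm against COMMON-CORE.md before relying on them.
FABRIC_SCOPE = "https://api.fabric.microsoft.com/.default"
STORAGE_SCOPE = "https://storage.azure.com/.default"

# Assumed OneLake host suffixes (regional prefixes such as "westus-" also match).
ONELAKE_SUFFIXES = (
    "onelake.dfs.fabric.microsoft.com",
    "onelake.blob.fabric.microsoft.com",
)

def scope_for(url):
    """Pick the token scope for a request URL; raise on unknown hosts."""
    host = (urlparse(url).hostname or "").lower()
    if host.endswith(ONELAKE_SUFFIXES):
        return STORAGE_SCOPE  # OneLake data access needs the storage audience
    if host == "api.fabric.microsoft.com":
        return FABRIC_SCOPE   # control-plane REST APIs use the Fabric audience
    raise ValueError(f"No known token scope for host: {host!r}")
```

Raising on unknown hosts, rather than defaulting to one scope, surfaces an audience mismatch before the request is sent.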