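Interactively analyze Lakehouse data with PySpark and Livy sessions, with support for advanced computation and cross-lakehouse queries.
Copy the install instructions and let AI complete the setup automatically · Recommended for beginners
Please help me install the "spark-consumption-cli" skill from askskill: 1. Download https://raw.githubusercontent.com/microsoft/skills-for-fabric/main/skills/spark-consumption-cli/SKILL.md 2. Save it as ~/.claude/skills/spark-consumption-cli/SKILL.md 3. Once it is installed, reload the skills and tell me it is ready to use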
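Use PySpark in a Livy session to read the sales table and the customer table from two Lakehouses, perform a cross-lakehouse join, compute the top 10 customers by sales in each region over the last 90 days, and output the DataFrame code with an explanation of the results.
Returns executable PySpark DataFrame code, the key computation steps, and an explanation of the results summarized by region.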
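As a rough illustration of what the prompt above might produce, the sketch below holds a PySpark snippet as a string and wraps it in a JSON statement body of kind `pyspark`, which is the shape Apache Livy's statement API accepts. Everything specific here is an assumption made for illustration: the table names `sales_lakehouse.sales` and `crm_lakehouse.customers`, the columns `order_date`, `customer_id`, `region`, and `amount`, and the helper `build_statement`. The endpoint URL and authentication are deliberately left out because the source does not state them.

```python
import json

# PySpark code intended to run inside the remote session, where `spark`
# is already defined. All table and column names are hypothetical.
PYSPARK_CODE = """
from pyspark.sql import functions as F
from pyspark.sql.window import Window

sales = spark.read.table("sales_lakehouse.sales")
customers = spark.read.table("crm_lakehouse.customers")

# Keep only orders from the last 90 days.
recent = sales.filter(F.col("order_date") >= F.date_sub(F.current_date(), 90))

# Join across lakehouses and total sales per customer within each region.
totals = (
    recent.join(customers, "customer_id")
    .groupBy("region", "customer_id")
    .agg(F.sum("amount").alias("total_sales"))
)

# Rank customers inside each region and keep the top 10.
by_region = Window.partitionBy("region").orderBy(F.desc("total_sales"))
top10 = (
    totals.withColumn("rank", F.row_number().over(by_region))
    .filter(F.col("rank") <= 10)
)
top10.show()
"""


def build_statement(code: str) -> str:
    """Serialize code as a Livy-style statement body of kind 'pyspark'."""
    return json.dumps({"code": code.strip(), "kind": "pyspark"})


payload = build_statement(PYSPARK_CODE)
```

The ranking uses `row_number` so that exactly 10 rows per region survive even when totals tie; `rank` or `dense_rank` would be the alternative if ties should all be kept.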
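Use Spark SQL or PySpark to analyze the differences between yesterday's version and the current version of this Delta table, identify added, deleted, and changed records, and explain how to read historical versions via time travel in a Fabric Lakehouse.
Outputs the method for reading historical versions, the diff comparison code, and summary statistics for added, deleted, and changed records.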
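One possible shape for the version comparison described above is sketched here. It relies on the Delta Lake reader option `versionAsOf` (the `timestampAsOf` option could be used instead to target "yesterday" by time). The function names, the `table_path` argument, and the key column `id` are assumptions for illustration, and the sketch assumes both versions share the same schema, since `exceptAll` compares whole rows.

```python
def read_delta_version(spark, table_path, version):
    """Read one historical version of a Delta table via time travel."""
    return (
        spark.read.format("delta")
        .option("versionAsOf", version)
        .load(table_path)
    )


def diff_versions(spark, table_path, old_version, new_version, key="id"):
    """Return (added, deleted, changed) DataFrames between two versions.

    `key` is a hypothetical primary-key column name.
    """
    old = read_delta_version(spark, table_path, old_version)
    new = read_delta_version(spark, table_path, new_version)

    # Keys present only in the new version.
    added = new.join(old, on=key, how="left_anti")
    # Keys present only in the old version.
    deleted = old.join(new, on=key, how="left_anti")
    # Rows that differ from the old version but whose key already existed.
    changed = new.exceptAll(old).join(old.select(key), on=key, how="left_semi")
    return added, deleted, changed
```

Calling `.count()` on each returned DataFrame would give the added, deleted, and changed totals the prompt asks for.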
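In a Livy session, use PySpark to read raw JSON data from the Lakehouse, check for missing fields, type anomalies, duplicate records, and quality issues that appear after flattening nested fields, and generate fix recommendations.
Provides JSON parsing and cleaning code, a summary of the quality check results, and actionable fix recommendations.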
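The checks named above could be summarized with a small helper along these lines. This is a sketch under stated assumptions: `quality_report` and its parameters are hypothetical names, the DataFrame is assumed to come from something like `spark.read.json(...)` with nested fields already flattened, and the type check assumes non-ANSI mode, where a failed `CAST` yields NULL rather than raising an error.

```python
def quality_report(df, required_fields, key_fields, numeric_fields=()):
    """Count basic quality problems in a DataFrame of parsed JSON records."""
    total = df.count()

    # Rows where a required field is absent or null.
    missing = {
        field: df.filter(f"{field} IS NULL").count()
        for field in required_fields
    }

    # Values that are present but cannot be cast to DOUBLE.
    # Assumes non-ANSI mode, where a failed CAST returns NULL.
    bad_type = {
        field: df.filter(
            f"{field} IS NOT NULL AND CAST({field} AS DOUBLE) IS NULL"
        ).count()
        for field in numeric_fields
    }

    # Rows that repeat an existing combination of key fields.
    duplicates = total - df.dropDuplicates(list(key_fields)).count()

    return {
        "rows": total,
        "missing": missing,
        "bad_type": bad_type,
        "duplicates": duplicates,
    }
```

Each count triggers a separate Spark job, which keeps the sketch readable; on large inputs a single aggregation over all checks would be cheaper.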
Update Check: ONCE PER SESSION (mandatory)
The first time this skill is used in a session, run the check-updates skill before proceeding.
- GitHub Copilot CLI / VS Code: invoke the check-updates skill.
- Claude Code / Cowork / Cursor / Windsurf / Codex: compare local vs remote package.json version.
- Skip if the check was already performed earlier in this session.
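The local-versus-remote comparison mentioned in the list above might look like the following sketch. It assumes both package.json documents are already available as text and that versions are plain dotted integers such as 1.4.2; pre-release suffixes are not handled. The function names are hypothetical, and how the remote file is fetched is left out because the source does not specify it.

```python
import json


def parse_version(text):
    """Turn a dotted version such as '1.4.2' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def update_available(local_package_json, remote_package_json):
    """Return True when the remote package.json reports a newer version."""
    local = parse_version(json.loads(local_package_json)["version"])
    remote = parse_version(json.loads(remote_package_json)["version"])
    # Tuples compare element by element, so 1.10.0 ranks above 1.2.0.
    return remote > local
```

Comparing integer tuples rather than raw strings avoids the case where "1.10.0" sorts below "1.2.0" lexicographically.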
CRITICAL NOTES
- To find a workspace's details (including its ID) from its name: list all workspaces, then filter the result with JMESPath.
- To find an item's details (including its ID) from the workspace ID, item type, and item name: list all items of that type in that workspace, then filter the result with JMESPath.
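The list-then-filter lookup in the notes above can be sketched in plain Python. The sketch assumes the listing response is a JSON object with a `value` array whose entries carry `id` and `displayName` fields; the sample data, the IDs, and the helper name `find_ids` are hypothetical. The equivalent JMESPath expression is shown in the docstring.

```python
def find_ids(listing, display_name):
    """Return the IDs of entries whose displayName matches exactly.

    Equivalent JMESPath filter: value[?displayName=='<name>'].id
    """
    return [
        entry["id"]
        for entry in listing.get("value", [])
        if entry.get("displayName") == display_name
    ]


# Hypothetical listing response, used only to show the filter.
sample_listing = {
    "value": [
        {"id": "00000000-0000-0000-0000-000000000001", "displayName": "Sales Analytics"},
        {"id": "00000000-0000-0000-0000-000000000002", "displayName": "Finance"},
    ]
}

workspace_ids = find_ids(sample_listing, "Sales Analytics")
print(workspace_ids)
```

The same helper applies to an item listing, since the notes describe the item lookup as the same filter run over the items of one type in one workspace.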
| Task | Reference | Notes |
|---|---|---|
| Fabric Topology & Key Concepts | COMMON-CORE.md § Fabric Topology & Key Concepts | |
| Environment URLs | COMMON-CORE.md § Environment URLs | |
| Authentication & Token Acquisition | COMMON-CORE.md § Authentication & Token Acquisition | Wrong audience = 401; read before any auth issue |
| Core Control-Plane REST APIs | COMMON-CORE.md § Core Control-Plane REST APIs | |
| Pagination | COMMON-CORE.md § Pagination | |
| Long-Running Operations (LRO) | COMMON-CORE.md § Long-Running Operations (LRO) | |
| Rate Limiting & Throttling | COMMON-CORE.md § Rate Limiting & Throttling | |
| OneLake Data Access | COMMON-CORE.md § OneLake Data Access | Requires storage.azure.com token, not Fabric token |
| Job Execution | COMMON-CORE.md § Job Execution | |
| Capacity Management | COMMON-CORE.md § Capacity Management | |
| Gotchas & Troubleshooting | COMMON-CORE.md § Gotchas & Troubleshooting | |
| Best Practices | COMMON-CORE.md § Best Practices | |
| Tool Selection Rationale | COMMON-CLI.md § Tool Selection Rationale | |
| Finding Workspaces and Items in Fabric | COMMON-CLI.md § Finding Workspaces and Items in Fabric | Mandatory — READ link first [needed for finding workspace id by its name or item id by its name, item type, and workspace id] |
| Authentication Recipes | COMMON-CLI.md § Authentication Recipes | az login flows and token acquisition |
| Fabric Control-Plane API via az rest | COMMON-CLI.md § Fabric Control-Plane API via az rest | Always pass --resource https://api.fabric.microsoft.com or az rest fails |
| Pagination Pattern | COMMON-CLI.md § Pagination Pattern | |
| Long-Running Operations (LRO) Pattern | COMMON-CLI.md § Long-Running Operations (LRO) Pattern | |
| OneLake Data Access via curl | COMMON-CLI.md § OneLake Data Access via curl | Use curl not az rest (different token audience) |
| SQL / TDS Data-Plane Access | COMMON-CLI.md § SQL / TDS Data-Plane Access | sqlcmd (Go) connect, query, CSV export |
…
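Helps systematically migrate Databricks notebooks, jobs, and data paths to Microsoft Fabric.
Helps you query, parse, and verify Microsoft Fabric Eventstream real-time pipeline configuration and status.
Interactively analyze Lakehouse data with PySpark and Livy sessions and carry out advanced processing.
Helps users run KQL queries against Fabric Eventhouse, analyze time-series data, and inspect table schemas and ingestion status.
Run DAX queries and inspect Power BI semantic model metadata to quickly retrieve tables, columns, measures, and relationships.
Helps users query KQL data in Fabric Eventhouse and perform real-time and time-series analysis and monitoring.
Use PySpark and Livy sessions to interactively analyze Lakehouse data and carry out advanced processing.
Run read-only T-SQL queries against Fabric warehouses and Lakehouses from the command line to explore and export data.
Helps you develop Microsoft Fabric Spark workflows, write and debug notebook code, and manage lakehouses and resources.
Run read-only T-SQL queries against Fabric warehouse and lakehouse data from the command line for exploration and export.
Helps diagnose Spark job failures, session anomalies, and performance bottlenecks in Microsoft Fabric.
Helps you develop Microsoft Fabric Spark and data engineering workflows, write and debug notebook code, and manage lakehouse resources.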