santa-method

Name: santa-method
Author: affaan-m

Adversarially verifies results with two reviewer agents to improve the reliability of output before release

Stars

★ 231,704

Source

GitHub

Updated

2026-07-21

// Security assessment: low risk

Prompt only, no code execution
Open source and auditable
Community verified · 231.7k

Credentials and secrets
Outbound network
Code execution
Data access
Source supply chain

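// Install

Copy the install instructions and let the AI complete the setup automatically · Recommended for beginners

Please install the "santa-method" skill from askskill for me:
1. Download https://raw.githubusercontent.com/affaan-m/ECC/main/skills/santa-method/SKILL.md
2. Save it as ~/.claude/skills/santa-method/SKILL.md
3. Once installed, reload skills and tell me it is ready to use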

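// Usage examples

Review code changes before release

Input

Use santa-method to review this commit: first have two independent review agents separately check functional correctness, edge cases, regression risk, and security issues; if their conclusions disagree, enter the convergence loop until both pass, then state whether the change can ship, with a risk summary and fix recommendations. The code and change description follow:

Expected output

A review report containing both reviewers' conclusions, the outcome of resolving disagreements, a release recommendation, and a fix list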

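Verify a product proposal document

Input

Use santa-method to review this product requirements document: two independent agents separately examine requirement completeness, logical consistency, user-scenario coverage, and potential ambiguity; if either fails it, keep iterating on the suggested revisions until both pass, then output the final issue list and revised recommendations. The document follows:

Expected output

A document review that identifies gaps and ambiguities and gives revision recommendations that have passed dual verification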

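Check research conclusions and citations

Input

Use santa-method to check this research summary: have two independent agents separately verify whether the conclusions are supported by evidence, whether citations are accurate, and whether the reasoning has gaps; if they disagree, run the convergence loop until both agree, then output a credibility assessment and the parts that need correction. The content follows:

Expected output

A research verification report containing evidence checks, citation review, a credibility score, and suggested corrections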

// Documentation

Santa Method

Multi-agent adversarial verification framework. Make a list, check it twice. If it's naughty, fix it until it's nice.

The core insight: a single agent reviewing its own output shares the same biases, knowledge gaps, and systematic errors that produced the output. Two independent reviewers with no shared context break this failure mode.

When to Activate

Invoke this skill when:

Output will be published, deployed, or consumed by end users
Compliance, regulatory, or brand constraints must be enforced
Code ships to production without human review
Content accuracy matters (technical docs, educational material, customer-facing copy)
Batch generation at scale where spot-checking misses systemic patterns
Hallucination risk is elevated (claims, statistics, API references, legal language)

Do NOT use for internal drafts, exploratory research, or tasks with deterministic verification (use build/test/lint pipelines for those).

Architecture

┌─────────────┐
│  GENERATOR   │  Phase 1: Make a List
│  (Agent A)   │  Produce the deliverable
└──────┬───────┘
       │ output
       ▼
┌──────────────────────────────┐
│     DUAL INDEPENDENT REVIEW   │  Phase 2: Check It Twice
│                                │
│  ┌───────────┐ ┌───────────┐  │  Two agents, same rubric,
│  │ Reviewer B │ │ Reviewer C │  │  no shared context
│  └─────┬─────┘ └─────┬─────┘  │
│        │              │        │
└────────┼──────────────┼────────┘
         │              │
         ▼              ▼
┌──────────────────────────────┐
│        VERDICT GATE           │  Phase 3: Naughty or Nice
│                                │
│  B passes AND C passes → NICE  │  Both must pass.
│  Otherwise → NAUGHTY           │  No exceptions.
└──────┬──────────────┬─────────┘
       │              │
    NICE           NAUGHTY
       │              │
       ▼              ▼
   [ SHIP ]    ┌─────────────┐
               │  FIX CYCLE   │  Phase 4: Fix Until Nice
               │              │
               │ iteration++  │  Collect all flags.
               │ if i > MAX:  │  Fix all issues.
               │   escalate   │  Re-run both reviewers.
               │ else:        │  Loop until convergence.
               │   goto Ph.2  │
               └──────────────┘
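
The control flow in the diagram can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the skill's own implementation: `run_santa_loop`, `MAX_ITERATIONS`, and the `generate`, `review`, and `fix` callables are hypothetical names, and each review is assumed to be a dict with a `verdict` key and a `critical_issues` list, mirroring the JSON shape requested in the reviewer prompt.

```python
from typing import Callable

MAX_ITERATIONS = 3  # hypothetical cap; the diagram only calls it MAX


def run_santa_loop(
    task_spec: str,
    generate: Callable[[str], str],
    review: Callable[[str, str], dict],
    fix: Callable[[str, str, list[str]], str],
    max_iterations: int = MAX_ITERATIONS,
) -> dict:
    """Generate, review twice, and fix until both reviewers pass or the cap is hit."""
    output = generate(task_spec)  # Phase 1: make a list
    iteration = 0
    while True:
        # Phase 2: two separate calls; neither receives the other's result.
        review_b = review(task_spec, output)
        review_c = review(task_spec, output)

        # Phase 3: both must pass.
        if review_b["verdict"] == "PASS" and review_c["verdict"] == "PASS":
            return {"status": "NICE", "output": output, "iterations": iteration}

        # Phase 4: collect all flags, then either escalate or fix and re-review.
        iteration += 1
        flags = review_b["critical_issues"] + review_c["critical_issues"]
        if iteration > max_iterations:
            return {"status": "ESCALATE", "output": output, "flags": flags}
        output = fix(task_spec, output, flags)
```

Passing the three steps in as callables keeps the loop independent of how agents are actually spawned, so the same gate logic can be exercised with stubs before wiring in real reviewers.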

Phase Details

Phase 1: Make a List (Generate)

Execute the primary task. No changes to your normal generation workflow. Santa Method is a post-generation verification layer, not a generation strategy.

# The generator runs as normal
output = generate(task_spec)

Phase 2: Check It Twice (Independent Dual Review)

Spawn two review agents in parallel. Critical invariants:

Context isolation — neither reviewer sees the other's assessment
Identical rubric — both receive the same evaluation criteria
Same inputs — both receive the original spec AND the generated output
Structured output — each returns a typed verdict, not prose
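
One way to honor the first three invariants is to build a single prompt and hand it to two separate reviewer calls running concurrently. The sketch below assumes a hypothetical `call_reviewer` function that starts a fresh agent from a prompt string and returns its raw reply; `dual_review` is likewise an illustrative name, not part of the skill.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def dual_review(
    call_reviewer: Callable[[str], str],  # hypothetical: spawns a fresh agent
    task_spec: str,
    output: str,
    rubric: str,
) -> tuple[str, str]:
    """Run two reviewers in parallel on identical inputs with no shared state."""
    # Identical rubric and same inputs: one prompt string, built once.
    prompt = (
        f"## Task Specification\n{task_spec}\n\n"
        f"## Output Under Review\n{output}\n\n"
        f"## Evaluation Rubric\n{rubric}\n"
    )
    # Context isolation: each reviewer is a separate call that receives only
    # the prompt, never the other reviewer's assessment.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_b = pool.submit(call_reviewer, prompt)
        future_c = pool.submit(call_reviewer, prompt)
        return future_b.result(), future_c.result()
```

The function returns the two raw replies untouched; turning them into typed verdicts is a separate step, which keeps the isolation logic easy to audit.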

REVIEWER_PROMPT = """
You are an independent quality reviewer. You have NOT seen any other review of this output.

## Task Specification
{task_spec}

## Output Under Review
{output}

## Evaluation Rubric
{rubric}

## Instructions
Evaluate the output against EACH rubric criterion. For each:
- PASS: criterion fully met, no issues
- FAIL: specific issue found (cite the exact problem)

Return your assessment as structured JSON:
{{
  "verdict": "PASS" | "FAIL",
  "checks": [
    {{"criterion": "...", "result": "PASS|FAIL", "detail": "..."}}
  ],
  "critical_issues": ["..."],   // blockers that must be fixed
  "suggestions": ["..."]         // non-blocking improvements
}}

Be rigorous. Your job is to find problems, not to approve.
"""

# Spawn reviewers in parallel (Claude Code subagents)
review_b = Agent(prompt=REVIEWER_PROMPT.format(...), description="Santa Reviewer B")
review_c = Agent(prompt=REVIEWER_PROMPT.format(...), description="Santa Reviewer C")
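
Once both reviewers reply, the structured output has to be parsed before the verdict gate can run. The following is a minimal sketch, assuming each reply is a JSON string in the shape the prompt requests; `Verdict`, `parse_verdict`, and `is_nice` are hypothetical names, and treating an unparseable reply as FAIL is an assumption made here rather than something the source specifies.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Verdict:
    verdict: str
    critical_issues: list[str] = field(default_factory=list)
    suggestions: list[str] = field(default_factory=list)


def parse_verdict(raw: str) -> Verdict:
    """Parse a reviewer reply; anything malformed counts as FAIL (assumption)."""
    try:
        data = json.loads(raw)
        if data.get("verdict") not in ("PASS", "FAIL"):
            raise ValueError("missing or invalid verdict")
    except (json.JSONDecodeError, ValueError, AttributeError) as exc:
        return Verdict("FAIL", critical_issues=[f"unparseable review: {exc}"])
    return Verdict(
        verdict=data["verdict"],
        critical_issues=list(data.get("critical_issues", [])),
        suggestions=list(data.get("suggestions", [])),
    )


def is_nice(b: Verdict, c: Verdict) -> bool:
    """Verdict gate: both reviewers must pass."""
    return b.verdict == "PASS" and c.verdict == "PASS"
```

Failing closed on malformed replies means a reviewer that ignores the JSON format can never let an output through by accident.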

…
