Adversarially verify results with two independent review agents to make output more reliable before it ships
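Copy the install instruction and let the AI complete the setup automatically · Recommended for beginners
Please install the "santa-method" skill from askskill for me: 1. Download https://raw.githubusercontent.com/affaan-m/ECC/main/skills/santa-method/SKILL.md 2. Save it as ~/.claude/skills/santa-method/SKILL.md 3. Once it is installed, reload skills and tell me it is ready to use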
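Please use santa-method to review this code commit: first have two independent review agents separately check functional correctness, edge cases, regression risk, and security issues; if their conclusions differ, enter a convergence loop until both pass, then report whether it can ship, a risk summary, and fix recommendations. The code and change description follow:
A review report containing both review conclusions, the outcome of resolving disagreements, a release recommendation, and a fix list
Please use santa-method to review this product requirements document: two independent agents each review it for requirement completeness, logical consistency, user scenario coverage, and potential ambiguity; if either fails it, keep iterating on the revision suggestions until both pass, then output the final issue list and recommended revisions. The document follows:
A document review that points out gaps and ambiguities and gives revision suggestions that have passed both reviewers
Please use santa-method to check this research summary: have two independent agents each verify whether the conclusions are supported by the evidence, whether the citations are accurate, and whether the reasoning has gaps; if they disagree, run the convergence loop until both accept, then output a credibility assessment and the parts that need correction. The content follows:
A research verification report containing evidence checks, citation review, a credibility score, and correction suggestions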
Multi-agent adversarial verification framework. Make a list, check it twice. If it's naughty, fix it until it's nice.
The core insight: a single agent reviewing its own output shares the same biases, knowledge gaps, and systematic errors that produced the output. Two independent reviewers with no shared context break this failure mode.
Invoke this skill when:
Do NOT use for internal drafts, exploratory research, or tasks with deterministic verification (use build/test/lint pipelines for those).
┌──────────────┐
│  GENERATOR   │  Phase 1: Make a List
│  (Agent A)   │  Produce the deliverable
└──────┬───────┘
       │ output
       ▼
┌────────────────────────────────┐
│    DUAL INDEPENDENT REVIEW     │  Phase 2: Check It Twice
│                                │
│ ┌────────────┐  ┌────────────┐ │  Two agents, same rubric,
│ │ Reviewer B │  │ Reviewer C │ │  no shared context
│ └─────┬──────┘  └─────┬──────┘ │
│       │               │        │
└───────┼───────────────┼────────┘
        │               │
        ▼               ▼
┌────────────────────────────────┐
│          VERDICT GATE          │  Phase 3: Naughty or Nice
│                                │
│ B passes AND C passes → NICE   │  Both must pass.
│ Otherwise → NAUGHTY            │  No exceptions.
└───────┬───────────────┬────────┘
        │               │
      NICE           NAUGHTY
        │               │
        ▼               ▼
    [ SHIP ]     ┌──────────────┐
                 │  FIX CYCLE   │  Phase 4: Fix Until Nice
                 │              │
                 │ iteration++  │  Collect all flags.
                 │ if i > MAX:  │  Fix all issues.
                 │   escalate   │  Re-run both reviewers.
                 │ else:        │  Loop until convergence.
                 │   goto Ph.2  │
                 └──────────────┘
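The control flow in the diagram can be sketched in a few lines of Python. This is an illustrative sketch, not the skill's implementation: `santa_loop`, the `fix` callback, the stub reviewers, and the cap of 3 iterations are all assumptions made for the example (the diagram only names the cap `MAX`). Reviewers are modeled as plain functions that return a dict with `verdict` and `critical_issues`, mirroring the reviewer JSON described later.

```python
from typing import Callable

MAX_ITERATIONS = 3  # assumed value; the diagram only calls this MAX


def santa_loop(
    output: str,
    reviewers: list[Callable[[str], dict]],
    fix: Callable[[str, list[str]], str],
    max_iterations: int = MAX_ITERATIONS,
) -> tuple[str, str]:
    """Return (status, output); status is "NICE" or "ESCALATE"."""
    iteration = 0
    while True:
        # Phase 2: each reviewer sees only the output, never another review.
        reviews = [review(output) for review in reviewers]
        # Phase 3: every reviewer must pass.
        if all(r["verdict"] == "PASS" for r in reviews):
            return "NICE", output
        # Phase 4: count the cycle, escalate past the cap, otherwise fix and loop.
        iteration += 1
        if iteration > max_iterations:
            return "ESCALATE", output
        flags = [issue for r in reviews for issue in r["critical_issues"]]
        output = fix(output, flags)


# Hypothetical stand-ins for real review agents and a real fixer.
def reviewer_b(text: str) -> dict:
    ok = "TODO" not in text
    return {"verdict": "PASS" if ok else "FAIL",
            "critical_issues": [] if ok else ["contains TODO"]}


def reviewer_c(text: str) -> dict:
    ok = text.endswith(".")
    return {"verdict": "PASS" if ok else "FAIL",
            "critical_issues": [] if ok else ["missing final period"]}


def fix_flags(text: str, flags: list[str]) -> str:
    if "contains TODO" in flags:
        text = text.replace("TODO", "done")
    if "missing final period" in flags:
        text += "."
    return text


status, final = santa_loop("Ship it TODO", [reviewer_b, reviewer_c], fix_flags)
```

Collecting the flags from both reviewers before fixing means one fix pass addresses everything found in that round, and the next round re-runs both reviewers on the repaired output rather than only the one that failed.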
Execute the primary task. No changes to your normal generation workflow. Santa Method is a post-generation verification layer, not a generation strategy.
# The generator runs as normal
output = generate(task_spec)
Spawn two review agents in parallel. Critical invariants:
# Literal braces in the JSON example are doubled so that str.format() leaves them intact;
# only {task_spec}, {output}, and {rubric} are substituted.
REVIEWER_PROMPT = """
You are an independent quality reviewer. You have NOT seen any other review of this output.
## Task Specification
{task_spec}
## Output Under Review
{output}
## Evaluation Rubric
{rubric}
## Instructions
Evaluate the output against EACH rubric criterion. For each:
- PASS: criterion fully met, no issues
- FAIL: specific issue found (cite the exact problem)
Return your assessment as structured JSON:
{{
  "verdict": "PASS" | "FAIL",
  "checks": [
    {{"criterion": "...", "result": "PASS|FAIL", "detail": "..."}}
  ],
  "critical_issues": ["..."], // blockers that must be fixed
  "suggestions": ["..."] // non-blocking improvements
}}
Be rigorous. Your job is to find problems, not to approve.
"""
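# Spawn reviewers in parallel (Claude Code subagents); both receive the identical prompt.
# `rubric` is assumed to be defined by the caller alongside task_spec and output.
prompt = REVIEWER_PROMPT.format(task_spec=task_spec, output=output, rubric=rubric)
review_b = Agent(prompt=prompt, description="Santa Reviewer B")
review_c = Agent(prompt=prompt, description="Santa Reviewer C")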
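Outside Claude Code, the same dispatch-and-gate step could look roughly like the sketch below. Everything here is an assumption for illustration: `call_reviewer` stands in for whatever function sends a prompt to a fresh model session and returns its raw text reply, and `parse_review`, `run_reviewers`, and `verdict_gate` are hypothetical helper names. The sketch treats any reply that is not valid JSON with a `PASS` or `FAIL` verdict as a failure, so a malformed review can never let output through the gate.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def parse_review(raw: str) -> dict:
    """Parse a reviewer's JSON reply; anything malformed counts as FAIL."""
    try:
        review = json.loads(raw)
    except json.JSONDecodeError:
        return {"verdict": "FAIL", "critical_issues": ["unparseable review"]}
    if not isinstance(review, dict) or review.get("verdict") not in ("PASS", "FAIL"):
        return {"verdict": "FAIL", "critical_issues": ["missing or invalid verdict"]}
    review.setdefault("critical_issues", [])
    return review


def run_reviewers(prompt: str, call_reviewer: Callable[[str], str], n: int = 2) -> list[dict]:
    """Send the same prompt to n reviewers concurrently and parse each reply."""
    # Each call receives only the prompt, so no reviewer sees another's reply.
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(call_reviewer, prompt) for _ in range(n)]
        return [parse_review(f.result()) for f in futures]


def verdict_gate(reviews: list[dict]) -> str:
    """NICE only if there is at least one review and every review passed."""
    passed = bool(reviews) and all(r["verdict"] == "PASS" for r in reviews)
    return "NICE" if passed else "NAUGHTY"


# Hypothetical reviewer that always approves, used only to exercise the helpers.
def always_pass(prompt: str) -> str:
    return '{"verdict": "PASS", "checks": [], "critical_issues": [], "suggestions": []}'


reviews = run_reviewers("review this output", always_pass)
result = verdict_gate(reviews)
```

A thread pool is enough here because each reviewer call is expected to be I/O-bound; the isolation between reviewers comes from giving each call its own fresh session, not from the threading itself.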
…
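Helps you master idiomatic Rust patterns, ownership, and concurrency practices to write safe, high-performance applications.
Write, review, and refactor safer, more modern C++ code based on the C++ Core Guidelines.
Helps developers configure performance optimization, security safeguards, and a research-first workflow for coding agents.
Helps developers use Bun for running, bundling, testing, and dependency management, and assess when it makes sense to replace Node.
Track Claude Code token usage, spend, and budget, and generate cost reports.
Best-practice guidance for database migrations, rollbacks, and zero-downtime releases, applicable to multiple ORMs and SQL databases.
Adversarial verification by two review agents, looping to convergence before delivering a more reliable result.
Review a system design from the perspectives of six key roles and output a unified risk assessment with recommendations.
In-depth code review across correctness, testing, security, and performance, with improvement suggestions.
Simulate engineers in multiple roles collaboratively reviewing code to quickly surface quality, risk, and design issues.
Run a pre-release verification flow of build, test, security scan, and change review for Spring Boot projects.
Improve the accuracy and reliability of answers, code, or plans through a generate, verify, and revise loop.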