KEEP IT HANDS-ON functional ~ tested 2026-05-22

// sandboxed in n/a · why not fully functional: All 23 skills pass structural validation (validate-skills.js). The plugin manifest validates against Claude Code's plugin schema. No functional testing beyond structural validation is possible because 22 of 23 skills are doc-only workflows (no scripts, no CLI). The idea-refine skill has a helper script but it only creates a directory.

Addy Osmani Agent Skills

Item: Addy Osmani Agent Skills
Rating: 4
Author: GearScope

by Addy Osmani (Google Chrome team) · https://github.com/addyosmani/agent-skills · MIT · v1.0.0 · updated 2026-05-21

Addy Osmani ships a full SDLC methodology as portable agent skills, and the craft shows.

⚙ ⚙ ⚙ ⚙ ⚙ 4 / 5

quality 5/5

documentation 5/5

setup 4/5

value 4/5

ecosystem fit 5/5

// bottom line

This is the most polished agent skill pack I have reviewed. Twenty-three skills, three specialist personas, seven slash commands, four reference checklists, session hooks, a CI-validated plugin manifest, and per-platform setup docs for seven agent environments. The engineering culture is embedded in every skill: anti-rationalization tables, verification gates, and the "process, not prose" design philosophy are built so that agents follow the skills rather than skim them. The only real gap is that 22 of 23 skills are doc-only workflows with no executable scripts, which limits smoke-testable surface area. The rating reflects both the exceptional quality of the content and the reality that a framework this large asks agents to absorb significant methodology before shipping value.

Claude Code (marketplace)

$ /plugin marketplace add addyosmani/agent-skills
$ /plugin install agent-skills@addy-agent-skills

Claude Code (local)

$ git clone https://github.com/addyosmani/agent-skills.git && claude --plugin-dir /path/to/agent-skills

Gemini CLI

$ gemini skills install https://github.com/addyosmani/agent-skills.git --path skills

install if

Senior engineers who want their AI agents to follow disciplined SDLC processes. The methodology is grounded in real Google engineering culture (Hyrum's Law, the Beyoncé Rule, change sizing, trunk-based development). If you already practice these things yourself, this pack makes your agent do the same.
Teams standardizing on Claude Code as their primary agent. The plugin marketplace install, slash commands, session hooks, and agent personas make this a first-class Claude Code experience. The /ship command alone is worth the install if your team does pre-merge reviews.
Developers building web applications with React/Next.js/TypeScript stacks. The examples, security patterns, and performance guidance all assume web frontend and Node.js backend. You get the most value if that matches your stack.

skip if

Developers working in non-web domains. If you build embedded firmware, game engines, data pipelines, or mobile native apps, the concrete examples will not map to your work. The conceptual framework is still sound, but you will be adapting constantly.
Teams that want enforceable process, not guidelines. Because 22 of 23 skills are pure markdown, the framework cannot stop an agent from skipping steps. If you need CI-enforced gates, you need a different approach or you need to write the enforcement scripts yourself.
Solo developers who already have a working agent workflow. If you have already internalized spec-plan-build-test-ship and your agent follows it, the 23 skills may add more context than value. The framework shines most in team settings where consistency matters.

What It Does

Agent Skills is a 23-skill framework that encodes the full software development lifecycle for AI coding agents. Created by Addy Osmani (Google Chrome team, author of "Learning Patterns" and "Image Optimization"), the pack covers every phase from initial idea refinement through spec writing, planning, incremental implementation, testing, code review, security hardening, performance optimization, and shipping to production. Each skill is a structured workflow with steps, verification gates, anti-rationalization tables, and red flags. The framework targets developers who want their AI agents to follow the same disciplined processes that senior engineers use, rather than defaulting to the shortest path. It ships as a Claude Code plugin (via marketplace), with setup guides for Cursor, Gemini CLI, Windsurf, OpenCode, Copilot, and Kiro.

The Good

Structural validation is CI-enforced and comprehensive. The scripts/validate-skills.js script checks every skill for valid YAML frontmatter, name-to-directory matching, description length under 1024 characters, required sections (Overview, When to Use, Common Rationalizations, Red Flags, Verification), and dead cross-skill references. All 23 skills pass with zero errors and zero warnings. The GitHub Actions workflow runs this validation on every push and PR, then validates the plugin manifest against Claude Code's schema, then does an end-to-end install test. This is the strongest CI pipeline I have seen in an agent skill repo.
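To make the checks above concrete, here is a minimal Python sketch of the same kind of structural validation. It is not the repo's validate-skills.js: the function names, error strings, and the simple `key: value` frontmatter parsing are assumptions for illustration, and the dead cross-reference check is left out.

```python
import re

# Required section names as listed in the review.
REQUIRED_SECTIONS = [
    "Overview",
    "When to Use",
    "Common Rationalizations",
    "Red Flags",
    "Verification",
]
# The review says "under 1024 characters"; the exact boundary is an assumption.
MAX_DESCRIPTION_CHARS = 1024


def parse_frontmatter(text):
    """Return simple key: value pairs from a leading '---' block, or None."""
    match = re.match(r"^---\n(.*?)\n---\n", text, re.DOTALL)
    if match is None:
        return None
    fields = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_skill(dir_name, skill_md):
    """Return a list of error strings; an empty list means the skill passes."""
    fields = parse_frontmatter(skill_md)
    if fields is None:
        return ["missing or malformed frontmatter"]
    errors = []
    if fields.get("name") != dir_name:
        errors.append("name does not match directory")
    description = fields.get("description", "")
    if not description:
        errors.append("missing description")
    elif len(description) >= MAX_DESCRIPTION_CHARS:
        errors.append("description too long")
    # Collect markdown heading titles at any level.
    headings = {
        h.strip()
        for h in re.findall(r"^#{1,6}[ \t]+(.+)$", skill_md, re.MULTILINE)
    }
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            errors.append(f"missing section: {section}")
    return errors
```

Calling `check_skill("demo-skill", text)` on a SKILL.md body returns an empty list when every check passes, which mirrors the per-skill check marks in the validator output.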

Anti-rationalization tables are a genuine innovation. Every skill includes a "Common Rationalizations" section that lists the excuses agents use to skip steps (for example, "This is too small for a spec" or "I will add tests later") paired with factual counter-arguments. This is not a cosmetic feature. It directly addresses the biggest problem with agent skills: agents rationalize their way out of following the process. The doubt-driven-development skill takes this further with a structured CLAIM-EXTRACT-DOUBT-RECONCILE-STOP cycle. I have not seen this pattern in any other skill pack.

The slash-command orchestration model is well-designed. Seven commands (/spec, /plan, /build, /test, /review, /code-simplify, /ship) map development phases to the right skills. The /ship command stands out: it uses a parallel fan-out pattern that spawns code-reviewer, security-auditor, and test-engineer personas concurrently, then merges their reports into a go/no-go decision with a mandatory rollback plan. The orchestration-patterns reference doc explicitly documents which patterns are endorsed and which are anti-patterns (no router personas, no nested subagents). This is the most mature multi-agent orchestration model I have seen packaged as skills.

Progressive disclosure respects context windows. The meta-skill (using-agent-skills) is the entry point. It loads only the discovery flowchart at session start, then references individual skills by name. Reference checklists (testing-patterns.md at 236 lines, security-checklist.md at 134 lines, performance-checklist.md at 153 lines, accessibility-checklist.md at 160 lines) are separate files that load on demand. The SKILL.md files range from 178 to 390 lines, with a median of about 300 lines. This is below the 500-line target stated in the skill-anatomy doc, which is a good sign that the author practices what the framework preaches.
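If you want to track the line budget yourself, a small check along these lines would do it. This is a sketch under assumptions: the function name and report shape are hypothetical, and it treats anything over 500 lines as exceeding the target.

```python
from statistics import median

LINE_TARGET = 500  # target stated in the skill-anatomy doc, per the review


def line_budget_report(skill_texts, target=LINE_TARGET):
    """Summarize line counts for a mapping of skill name -> SKILL.md text."""
    counts = {name: len(text.splitlines()) for name, text in skill_texts.items()}
    over = sorted(name for name, n in counts.items() if n > target)
    return {
        "min": min(counts.values()),
        "max": max(counts.values()),
        "median": median(counts.values()),
        "over_target": over,
    }
```

In a checkout you would build `skill_texts` by reading each skill's SKILL.md; an empty `over_target` list means every skill is within budget.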

Cross-platform setup docs cover seven agent environments. Dedicated setup guides exist for Claude Code (58 lines), Cursor (58 lines), Gemini CLI (131 lines), Windsurf (48 lines), OpenCode (178 lines), Copilot (82 lines), and Kiro (inline in README). The OpenCode guide is the most detailed, mapping the lifecycle to OpenCode's skill tool and explicitly documenting which Claude Code features have no OpenCode equivalent. This level of platform-specific adaptation is unusual and valuable.

The Bad

Twenty-two of 23 skills are doc-only with no executable scripts. Only idea-refine ships a script, and it merely creates a docs/ideas directory. Every other skill is pure markdown. This means the framework cannot verify that an agent actually followed the workflow. The verification sections say things like "Run npm test and verify all tests pass," but there is no script that checks whether the agent did this. For teams that want enforcement rather than guidance, this is a gap. Compare with obra/superpowers, which includes runnable validation scripts in several skills.
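Teams that want enforcement could add a small gate runner to CI. Nothing like this ships in the repo; the sketch below is an assumption about what such a script might look like, and `run_gate` and `enforce` are hypothetical names.

```python
import subprocess
import sys


def run_gate(name, command):
    """Run one verification command; return (name, passed, output tail)."""
    result = subprocess.run(command, capture_output=True, text=True)
    # Keep only the last few lines so failures stay readable in CI logs.
    tail = (result.stdout + result.stderr).strip().splitlines()[-5:]
    return name, result.returncode == 0, "\n".join(tail)


def enforce(gates):
    """Run every gate and return True only if all of them pass."""
    all_passed = True
    for name, command in gates:
        gate, passed, tail = run_gate(name, command)
        print(f"{'PASS' if passed else 'FAIL'} {gate}")
        if not passed:
            all_passed = False
            print(tail)
    return all_passed
```

A CI step could then call `sys.exit(0 if enforce([("tests", ["npm", "test"])]) else 1)`, which turns the skill's "Run npm test and verify all tests pass" instruction into a check the agent cannot skip.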

The scope is overwhelmingly web-centric. The security skill leads with OWASP Top 10. The performance skill focuses on Core Web Vitals. The frontend skill assumes React-like component architectures. The CI/CD skill references npm and Node.js tooling. If you are building embedded systems, data pipelines, game servers, or desktop applications, most of the concrete examples will not map. The conceptual framework (spec before code, test before ship, review before merge) is universal, but the implementation details assume a web application stack.

The repo's own scripts are mostly untested. The validate-skills.js validator has no test file of its own, and the idea-refine.sh script has no corresponding test. The session hooks do have a test (session-start-test.sh), which is good, but overall coverage is thin. For a framework that emphasizes verification and proof, the lack of self-testing is a noticeable gap.

Smoke Test Results

Testing was done by cloning the repo and running the built-in validator on the host machine (macOS, aarch64). No sandbox was used because the skill type is framework/doc-only with no executable surface beyond the validation script.

Structural validation (validate-skills.js)

$ cd /tmp/addyosmani-agent-skills && node scripts/validate-skills.js
 ✓ api-and-interface-design
 ✓ browser-testing-with-devtools
 ✓ ci-cd-and-automation
 ✓ code-review-and-quality
 ✓ code-simplification
 ✓ context-engineering
 ✓ debugging-and-error-recovery
 ✓ deprecation-and-migration
 ✓ documentation-and-adrs
 ✓ doubt-driven-development
 ✓ frontend-ui-engineering
 ✓ git-workflow-and-versioning
 ✓ idea-refine (section checks exempt)
 ✓ incremental-implementation
 ✓ interview-me
 ✓ performance-optimization
 ✓ planning-and-task-breakdown
 ✓ security-and-hardening
 ✓ shipping-and-launch
 ✓ source-driven-development
 ✓ spec-driven-development
 ✓ test-driven-development
 ✓ using-agent-skills (section checks exempt)

23 skills checked: 0 error(s), 0 warning(s) - PASSED

Pass rate: 23 of 23. All skills pass structural validation. Two skills (idea-refine and using-agent-skills) are exempt from section checks, with documented reasons in the validator source code.

Plugin manifest validation

$ cat .claude-plugin/plugin.json | python3 -m json.tool
{
  "name": "agent-skills",
  "description": "Production-grade engineering skills...",
  "version": "1.0.0",
  "author": { "name": "Addy Osmani" },
  "homepage": "https://github.com/addyosmani/agent-skills",
  "license": "MIT",
  "commands": "./.claude/commands",
  "skills": "./skills",
  "agents": [
    "./agents/code-reviewer.md",
    "./agents/security-auditor.md",
    "./agents/test-engineer.md"
  ]
}

Pass rate: 1 of 1. The plugin manifest is valid JSON, references real directories and files, and the marketplace.json correctly registers the plugin for Claude Code's marketplace.
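The "references real directories and files" part of this check is easy to reproduce. The sketch below is one way to do it, not Claude Code's schema validation: the function names are hypothetical, and it only looks at the `commands`, `skills`, and `agents` keys shown in the manifest above.

```python
import json
from pathlib import Path


def load_manifest(plugin_root):
    """Read .claude-plugin/plugin.json from a plugin checkout."""
    path = Path(plugin_root) / ".claude-plugin" / "plugin.json"
    return json.loads(path.read_text(encoding="utf-8"))


def manifest_path_errors(plugin_root, manifest):
    """Return manifest path references that do not exist under plugin_root."""
    root = Path(plugin_root)
    references = []
    for key in ("commands", "skills", "agents"):
        value = manifest.get(key, [])
        # Each key holds either a single path string or a list of paths.
        references.extend([value] if isinstance(value, str) else value)
    return [ref for ref in references if not (root / ref).exists()]
```

An empty list from `manifest_path_errors(root, load_manifest(root))` means every referenced path resolves inside the checkout.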

What the runs tell you

The structural validation proves every skill has valid frontmatter, matching names, compliant descriptions, and the required section anatomy. The plugin manifest is structurally sound. What cannot be validated is whether an agent actually follows the workflow. The framework provides guidance and gates, but it relies on the agent's compliance, which is a fundamental limitation of doc-only skill packs.

Setup Walkthrough

Clone the repo: git clone https://github.com/addyosmani/agent-skills.git
For Claude Code, install from marketplace: /plugin marketplace add addyosmani/agent-skills then /plugin install agent-skills@addy-agent-skills. Alternatively, run locally: claude --plugin-dir /path/to/agent-skills
For Gemini CLI: gemini skills install https://github.com/addyosmani/agent-skills.git --path skills
For other agents, see the platform-specific docs in docs/.

No post-install gotchas. The session-start hook requires jq for meta-skill injection but degrades gracefully if jq is missing.

Alternatives

obra/superpowers -- the 202K-star agentic skills framework with broader scope, runnable validation scripts, and more opinionated methodology. Better for teams that want enforcement, not just guidance.
google/skills -- Google's official agent skills, which are narrower (Google Cloud and Gemini API) but backed by a vendor with deep infrastructure expertise. Better if you are in the Google ecosystem.
github/awesome-copilot -- GitHub's community-contributed Copilot skills and configurations. Broader community input but less structured methodology. Better for Copilot-specific workflows.

// review provenance

reviewed by: GearScope
tested: 2026-05-22 · macOS (Apple Silicon)
last verified: 2026-05-22
depth: HANDS-ON
sponsorship: none, ever
