KEEP IT SANDBOXED functional ~ tested 2026-05-17
// sandboxed in ubuntu 24.04 · aarch64 (via sbx) · why not fully functional: Doc-only skill with no executable component. Verified structure, cross-references, frontmatter, and content completeness in sandbox. Cannot verify output quality of the skill's advice without building actual agent harnesses.

agents-best-practices

by DenisSergeevitch · https://github.com/DenisSergeevitch/agents-best-practices · MIT · v1.2.0 · updated 2026-05-15

The agent architecture handbook you wish existed before you built your first harness.

4 / 5
quality 4/5
documentation 4/5
setup 5/5
value 4/5
ecosystem fit 4/5
// bottom line

Agents Best Practices is a thorough, well-structured knowledge skill that covers agent harness design end to end. Its 15 reference documents total nearly 14,000 words of concrete, pseudocode-heavy guidance that works across OpenAI, Anthropic, and compatible APIs. The main weakness is that it is purely documentary: no executable helpers, no validation scripts, no scaffolding tooling. For teams designing production agent systems, this is an excellent reference to have loaded.

npx skills add
$ npx skills add DenisSergeevitch/agents-best-practices -g
git clone
$ git clone https://github.com/DenisSergeevitch/agents-best-practices.git ~/.claude/skills/agents-best-practices

install if

  • Agent developers building on any platform. This is one of the few provider-neutral skill packs. The patterns apply to Claude, GPT, Gemini, and open-source agents equally.
  • Teams shipping their first agent. The MVP blueprint and checklists give a concrete starting point with pass/fail criteria. Useful for avoiding common first-agent mistakes.
  • Engineers who want eval discipline. The evaluation references cover train/test splitting, benchmark aggregation, and iterative improvement loops. Rare in skill packs.

What It Does

Agents Best Practices is a provider-neutral agent skill that teaches AI coding agents how to design, audit, and scaffold agentic harnesses. Rather than targeting a single platform, it provides architecture patterns that work across OpenAI, Anthropic, and any OpenAI-compatible API. When an agent encounters a conversation about building an agent, this skill activates and produces concrete MVP blueprints, tool permission matrices, loop pseudocode, and launch checklists. It covers 15 topic areas including the agentic loop, tool design, context compaction, prompt caching, security evals, and observability.

The Good

Comprehensive reference library. The skill ships 15 focused markdown files totaling 13,887 words of content. Each reference file tackles one topic with concrete pseudocode, not hand-wavy advice. The mvp-agent-blueprint.md alone runs 2,022 words with a full Python loop template, tool registry pattern, and autonomy level taxonomy. The checklists.md reference has pass/fail checkboxes covering MVP design, tools, permissions, context, planning, goals, skills, MCP connectors, and evals.

Provider-neutral stance is useful. Most agent skills assume one ecosystem. This one explicitly shows patterns that map to OpenAI function calling, Anthropic tool use, and compatible APIs in the provider-api-patterns.md reference. The core loop pseudocode stays platform-agnostic while pointing to provider-specific docs when you need them. This is rare and valuable.
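To make the provider-neutral idea concrete, here is a minimal sketch of one tool definition rendered into the two request shapes mentioned above. It is not taken from provider-api-patterns.md: `ToolSpec`, `to_openai_tool`, `to_anthropic_tool`, and the `read_file` tool are hypothetical names for illustration. The only provider detail assumed is the documented envelope each API expects around a JSON Schema.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolSpec:
    """Provider-neutral tool description (hypothetical type for illustration)."""
    name: str
    description: str
    parameters: dict[str, Any]  # JSON Schema describing the tool's arguments


def to_openai_tool(spec: ToolSpec) -> dict[str, Any]:
    """Render the spec in the shape used by OpenAI Chat Completions function calling."""
    return {
        "type": "function",
        "function": {
            "name": spec.name,
            "description": spec.description,
            "parameters": spec.parameters,
        },
    }


def to_anthropic_tool(spec: ToolSpec) -> dict[str, Any]:
    """Render the spec in the shape used by Anthropic Messages API tool use."""
    return {
        "name": spec.name,
        "description": spec.description,
        "input_schema": spec.parameters,
    }


# Hypothetical example tool, defined once and rendered twice.
READ_FILE = ToolSpec(
    name="read_file",
    description="Read a UTF-8 text file from the workspace.",
    parameters={
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
)
```

Keeping the schema in one neutral structure means the harness loop never needs to know which provider is on the other end; only the thin rendering functions do.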

Structured SKILL.md with clear activation triggers. The SKILL.md frontmatter is clean, version is declared (1.2.0), and the activation section lists specific intents that should trigger the skill. The reference map tells the agent exactly which file to load for which problem. The "MVP Builder Mode" section gives the agent a default behavior path instead of requiring the user to specify everything.

Zero cross-reference errors. Every link from SKILL.md and README.md to the references/ directory resolves to an actual file. The sandbox test verified that all 25 cross-references resolve. No dead links, no stale paths.

Practical philosophy section. The eight non-negotiable principles in SKILL.md are the right ones. "The harness acts, not the model" is the single most important idea in agent safety, and it is stated clearly and repeatedly. The gotchas section correctly warns against overly broad tools, premature multi-agent designs, and treating untrusted data as instructions.
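One way to read "the harness acts, not the model" is that the model only proposes a tool call as data, while the harness decides whether and how to run it. The sketch below illustrates that reading. It is not the skill's own template: `REGISTRY`, `execute_tool_call`, and the two note tools are hypothetical, and the approval callback stands in for whatever policy a real harness would apply.

```python
from typing import Any, Callable

ToolHandler = Callable[[dict[str, Any]], str]
Approver = Callable[[str, dict[str, Any]], bool]


def read_note(args: dict[str, Any]) -> str:
    """Hypothetical read-only tool."""
    return f"note:{args['id']}"


def delete_note(args: dict[str, Any]) -> str:
    """Hypothetical destructive tool."""
    return f"deleted:{args['id']}"


# Hypothetical registry: tool name -> (handler, requires_approval).
REGISTRY: dict[str, tuple[ToolHandler, bool]] = {
    "read_note": (read_note, False),
    "delete_note": (delete_note, True),
}


def execute_tool_call(call: dict[str, Any], approve: Approver) -> dict[str, Any]:
    """Run a model-proposed call only if the harness recognizes and permits it."""
    name = call.get("name")
    if name not in REGISTRY:
        # The model asked for something the harness never registered.
        return {"ok": False, "error": f"unknown tool: {name}"}
    handler, needs_approval = REGISTRY[name]
    args = call.get("arguments", {})
    if needs_approval and not approve(name, args):
        return {"ok": False, "error": "denied by policy"}
    return {"ok": True, "result": handler(args)}
```

The model's output never reaches a handler directly. Unknown tool names and unapproved destructive calls come back as structured errors the model can read on its next turn.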

Install instructions cover three paths. The README provides npx skills add for the skills CLI, a copy-paste prompt for AI agents, and manual git clone paths for Codex and Claude Code (user-level and project-level). This covers the major install workflows.

The Bad

Purely documentary, no executable component. This skill contains zero scripts, zero validation tooling, and zero scaffolding generators. Everything is markdown. That is fine for reference, but it means the skill cannot build or audit anything. An agent using this skill can only read and synthesize the guidance. A scaffolding script that generates the MVP blueprint template from the skill's own spec would add significant value.

No tests or CI. The repo has no test infrastructure. No CI workflow, no validation script, no automated check that frontmatter stays valid or cross-references stay intact. For a skill this structured, a simple CI job that verifies SKILL.md frontmatter and checks all reference links would be low effort and high value.
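As a rough idea of how small the link-checking half of such a CI job could be, here is a sketch that scans markdown links in SKILL.md and README.md and reports relative targets that do not exist. The function name `broken_links` and the regex are illustrative choices, not part of the repo. It only handles inline `[text](target)` links and skips URLs and in-page anchors.

```python
import re
from pathlib import Path

# Matches inline markdown links of the form [text](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme such as https: or mailto:.
SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def broken_links(repo_root: Path, docs: tuple[str, ...] = ("SKILL.md", "README.md")) -> list[str]:
    """Return 'doc: target' entries for relative links that point at nothing."""
    missing: list[str] = []
    for doc in docs:
        doc_path = repo_root / doc
        if not doc_path.is_file():
            missing.append(f"{doc}: file not found")
            continue
        text = doc_path.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue  # external URL or in-page anchor, out of scope here
            relative = target.split("#", 1)[0]  # drop any trailing fragment
            if not (doc_path.parent / relative).exists():
                missing.append(f"{doc}: {target}")
    return missing
```

A CI step could call `broken_links(Path("."))` and fail the build when the returned list is non-empty.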

Description frontmatter starts with "Use this skill when" rather than "Use when". The agent skills specification convention is for descriptions to start with "Use when". Minor, but it deviates from the spec pattern that agent runtimes may depend on for activation matching.

Coverage audit is thin. The coverage-audit.md reference is only 61 lines and 549 words. For a skill that covers 15 topics, the coverage verification document is surprisingly sparse. It lists topic headings but does not cross-check each topic against actual test scenarios or known edge cases.

No version history or changelog. The skill is at version 1.2.0 but there is no changelog, no release tags, and no commit message convention that would help consumers track what changed between versions. For a reference document that teams may depend on in production, version tracking matters.

Heavy content may bloat context. The SKILL.md alone is 204 lines and 1,580 words. The references total nearly 14,000 words. If an agent loads the wrong combination of references, it can easily consume 5,000+ tokens of context. The skill says "load only the relevant references" but provides no mechanism to enforce this.
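A harness that wanted to enforce "load only the relevant references" could put a token budget in front of the loader. The sketch below is one hypothetical way to do that. The 1.3 tokens-per-word ratio is a loudly labeled rule-of-thumb assumption, not a measured value for any tokenizer, and `select_within_budget` is an illustrative name. Under that assumption, the full 13,887-word reference set would come to about 18,000 tokens.

```python
# Assumption: rough rule of thumb for English prose; real tokenizers vary.
TOKENS_PER_WORD = 1.3


def estimate_tokens(text: str) -> int:
    """Estimate token count from whitespace-separated word count."""
    return round(len(text.split()) * TOKENS_PER_WORD)


def select_within_budget(files: dict[str, str], wanted: list[str], budget: int) -> list[str]:
    """Pick references in priority order, stopping before the budget is exceeded."""
    chosen: list[str] = []
    used = 0
    for name in wanted:
        cost = estimate_tokens(files[name])
        if used + cost > budget:
            break  # keep priority order: do not skip ahead to cheaper files
        chosen.append(name)
        used += cost
    return chosen
```

Stopping at the first file that does not fit, rather than skipping it, keeps the agent from silently loading lower-priority references in place of the one it actually asked for.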

Smoke Test Results

This is a doc-only skill with nothing to install in the executable sense, so the smoke test is structural. It checks whether the files claimed by the README exist, whether the metadata validates, and whether the cross-references resolve. Ran inside a clean isolated Linux sandbox.

Structural validation

$ verify all 15 reference files present and non-empty
✅ 15/15 present, all > 0 bytes
$ verify SKILL.md frontmatter (name, description, version)
✅ all three fields present and well-formed
$ verify cross-references from SKILL.md and README.md resolve
✅ 25/25 resolve to real files in the repo
$ verify SKILL.md frontmatter size within 1,024-byte spec
✅ 602 bytes (well under limit)
$ verify install instructions present for declared harnesses
✅ Codex, Claude Code, and npx skills CLI all documented
$ verify license declared
✅ MIT

Pass rate: 6 of 6. Skill is structurally sound; everything the README claims is in the repo and references something real.
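For readers who want to reproduce the frontmatter checks, the sketch below approximates them without a YAML dependency. It is not the script used in the sandbox run: `check_frontmatter` is a hypothetical helper that assumes simple top-level `key: value` lines between `---` delimiters and uses the 1,024-byte limit quoted above.

```python
REQUIRED_FIELDS = ("name", "description", "version")
MAX_FRONTMATTER_BYTES = 1024  # limit quoted in the smoke test above


def check_frontmatter(skill_md: str) -> list[str]:
    """Return a list of problems found in the SKILL.md frontmatter (empty if none)."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing opening '---' delimiter"]
    end = None
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end = index
            break
    if end is None:
        return ["missing closing '---' delimiter"]

    block = lines[1:end]
    # Assumption: fields are unindented 'key: value' lines, not nested YAML.
    keys = {
        line.split(":", 1)[0].strip()
        for line in block
        if ":" in line and not line.startswith((" ", "\t", "#"))
    }
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in keys]

    size = len("\n".join(block).encode("utf-8"))
    if size > MAX_FRONTMATTER_BYTES:
        problems.append(f"frontmatter is {size} bytes, exceeds {MAX_FRONTMATTER_BYTES}")
    return problems
```

Calling it on the text of SKILL.md returns an empty list when all three fields are present and the block is within the size limit. A real pipeline would likely swap the line parsing for a proper YAML parser.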


Setup Walkthrough

  1. Install via the skills CLI (fastest):
 npx skills add DenisSergeevitch/agents-best-practices -g

The -g flag installs globally so every project can discover it.

  2. Or clone manually for Claude Code:
 mkdir -p ~/.claude/skills
 git clone https://github.com/DenisSergeevitch/agents-best-practices.git \
   ~/.claude/skills/agents-best-practices
  3. Or for Codex CLI:
 mkdir -p ~/.codex/skills
 git clone https://github.com/DenisSergeevitch/agents-best-practices.git \
   ~/.codex/skills/agents-best-practices
  4. No build step, no dependencies, no API keys needed. The skill activates automatically when conversation touches agent architecture topics.

Alternatives

  • anthropics/skills -- official Anthropic skills repo with broader coverage but Claude-specific focus.
  • obra/superpowers -- agentic skills framework with hooks, session management, and more opinionated workflows.
  • addyosmani/agent-skills -- production-grade engineering skills from a Google Chrome team engineer, more coding-focused.
// review provenance
reviewed by GearScope
tested 2026-05-17 · macOS (Apple Silicon)
last verified 2026-05-17
depth SANDBOXED
sponsorship none, ever