Writing Effective Claude Skills
Claude skills are reusable, prompt-based automation shortcuts stored as `SKILL.md` files with YAML frontmatter and Markdown instructions. Best practices: narrow scope (one job per skill), specific instructions with examples, separate process (SKILL.md) from knowledge (REFERENCE.md), use XML tags for structure, control invocation via `disable-model-invocation` and `user-invocable` flags, and iterate based on real outputs.
So, What Are Claude Skills?
Think of a Claude skill as a saved recipe for a task you keep asking Claude to do. Instead of re-explaining your requirements every single time, you package those instructions into a named shortcut — and from then on, Claude just... does it right, every time. In Claude Code and Cowork, each skill lives in its own folder as a file called SKILL.md, with any helper files alongside it.
When a skill gets invoked, Claude reads the SKILL.md and follows the process you've defined there. Because everything is written down, the skill works the same way whether it's you, your teammate, or a future version of Claude running it.
Where Do Skills Live?
There are three places a skill can call home:
Personal skills sit in ~/.claude/skills/<skill-name>/SKILL.md and follow you everywhere across all your projects. Project skills live inside a specific repo at .claude/skills/<skill-name>/SKILL.md — they only show up when you're in that project. Plugin skills are bundled for sharing, and they get a namespaced label like plugin-name:skill-name so people know where they came from.
If you're in a larger organization, admins can also distribute skills centrally so everyone gets them automatically.
How Do You Actually Trigger a Skill?
Two ways. You can type a slash command directly in chat — something like /commit or /review-pr. Or Claude can figure out on its own that a skill applies to what you're working on and invoke it automatically. Which way works depends on how you wrote the SKILL.md frontmatter: the description field is what Claude reads to decide whether to auto-trigger, and user-invocable: false hides a skill from the slash-command menu entirely (so only Claude can call it, not you).
If you set disable-model-invocation: true, the skill only runs when you explicitly call it. That's the right call for anything high-stakes — deploying code, sending messages, that kind of thing. You don't want Claude deciding on its own that now seems like a good time to push to production.
What's Inside a Skill Directory?
The only required file is SKILL.md — that's the one with your instructions. But a well-built skill usually has a few supporting files too:
REFERENCE.mdfor detailed domain knowledge, rules, or background contextexamples/for sample inputs and outputs showing Claude what "good" looks likescripts/for helper scripts (Python, bash, whatever) the skill might need to run
The key thing to know: supporting files aren't loaded automatically. Claude only reads them when your SKILL.md tells it to. This keeps things fast and token-efficient — you're not paying to load a 3,000-word style guide on every invocation if you don't need it.
How to Write a SKILL.md File
A SKILL.md has two parts: a YAML block at the top (wrapped in --- markers) that handles configuration, and a Markdown body below it with the actual instructions. Both matter. Neither one alone is enough.
The YAML Frontmatter
This part gets parsed by the runtime and controls how the skill behaves. The fields you'll use most:
name— the display name and the slug for the slash command (name: commitgives you/commit)description— a plain-English description of what the skill does and when to use it; Claude reads this to decide when to auto-trigger the skill, so it matters more than you'd thinkargument-hint— a short placeholder shown after the command, like[filename]or[branch-name]allowed-tools— tools Claude can use without asking permission each time (e.g.Read Write Bash)disable-model-invocation— set this totruefor anything that should only run on explicit commanduser-invocable— set tofalseif you want the skill to be background context only, never a slash commandcontext— set toforkto run the skill in its own isolated context windowmodel— pin a specific Claude version if consistency across upgrades matters to youeffort— override the session effort level (low,medium,high, ormax)
Here's what a clean frontmatter looks like:
---
name: review-pr
description: >
Review a GitHub pull request for correctness, style, and potential bugs.
Use when the user says 'review PR', 'check this pull request', or
pastes a GitHub PR URL.
argument-hint: [PR number or URL]
allowed-tools: Bash Read WebFetch
---
The Markdown Body
This is your step-by-step recipe. Think of it as instructions you're writing for a very capable colleague who's never seen your project before. Every step should be concrete, ordered, and leave nothing ambiguous.
A well-written body typically has: a short summary of what the skill does, a numbered list of steps, instructions for handling edge cases, pointers to any supporting files, and a clear spec for what the output should look like (format, length, tone, filename — whatever applies).
Keep the body somewhere in the 1,500–2,000 word range. Longer files slow down invocations because the whole thing loads into context every time. If you need to include a lot of detail, move it to a reference file and point to it.
Dynamic Placeholders
Skills support a handful of runtime variables you can drop into your instructions:
$ARGUMENTS— everything the user typed after the slash command$0,$1,$2— individual positional arguments${CLAUDE_SESSION_ID}— the current session ID${CLAUDE_SKILL_DIR}— the absolute path to the skill's directory
You can also inject shell commands using backtick syntax (e.g. !`git log --oneline -10`) — Claude runs the command before reading the prompt and substitutes the output inline.
How to Write Prompts That Actually Work
The quality of a skill comes almost entirely down to the quality of the prompt. Here's what makes the difference.
One Skill, One Job
Before you write a single line, try naming two or three specific real-world scenarios where you'd use this skill. If you can't, the scope is too broad. Skills that try to do everything produce inconsistent results because Claude can't reliably pick the right path through vague instructions.
"Review code" is too broad. "Review a Python function for edge-case handling and suggest test cases" is appropriately narrow. If you find yourself using the word "or" a lot in your description, split it into two skills.
Be Specific
Vague instructions are the number-one source of bad skill output. Tell Claude exactly what format you want, how long the output should be, what tone to use, and what's out of scope. Write affirmative instructions — "write in flowing prose paragraphs with no headers" is stronger than "don't use bullet points."
Explain the Why
Claude doesn't just execute instructions mechanically — it reasons about them. When you explain why something matters, Claude can apply the underlying logic to situations you didn't explicitly cover. A commit message skill that says "keep subject lines under 72 characters because many terminal pagers truncate at that width" will outperform one that just says "keep it short."
Use Examples
Few-shot examples are one of the most reliable prompt engineering techniques out there. Three to five well-chosen examples can dramatically improve output quality, especially for tasks with formatting or style requirements. Wrap them in XML tags to keep them separate from instructions:
<examples>
<example>
<input>Fix the off-by-one error in the loop</input>
<output>fix: correct off-by-one error in pagination loop</output>
</example>
<example>
<input>Add dark mode support to the settings page</input>
<output>feat(settings): add dark mode toggle with CSS variables</output>
</example>
</examples>
Good examples are diverse enough to cover different cases but consistent enough to reveal a clear pattern. Inconsistent examples actively make things worse — they teach Claude an unintended pattern.
Use XML Tags for Structure
XML tags give Claude clear boundaries within your prompt. Tags like <instructions>, <context>, <input>, and <output_format> help Claude understand which part is which. For skills that work on long documents, <document> and <document_content> tags mirror Claude's own training patterns and improve how well it retrieves the relevant content. And if you're including a lot of document content, put it above your instructions in the file — that can improve performance noticeably.
Set a Role
One sentence of role framing changes a lot. "You are a senior software engineer reviewing code for production readiness" shifts the tone, depth, and focus of the output compared to a generic request. Be specific about the register: "Write with the precision of a technical editor who values brevity" is more useful than "Use a professional tone."
Ask Clarifying Questions
Skills that pause to ask two or three targeted questions before running consistently produce better output. The questions show Claude has understood the task structure, and they give the user a chance to redirect before a lot of work happens. This is especially important for skills that write files or make changes that are hard to undo. Frame questions as multiple choice whenever possible — open-ended questions create more uncertainty, not less.
Specify the Output Format
Never assume Claude knows what format you want. Say explicitly whether you want plain prose, structured markdown, JSON, a specific filename, or whatever else. If the output is a file, say so — including the name, path, and whether it should overwrite existing content.
Keeping Skills Reusable and Reliable
Separate Process From Knowledge
Keep the steps in SKILL.md and the context in supporting reference files. This keeps your skill file lean and fast to load, and lets you update your style guide or domain rules without touching the workflow logic.
Write Defensively
Good skills anticipate failure. If a required file might be missing, tell Claude to report that clearly rather than invent a substitute. If an API call might fail, specify the fallback. If an input might be ambiguous, tell Claude to ask rather than guess. And if you know Claude tends to over-explain on a particular type of task, just ban it directly: "Do not add explanatory preamble or closing summaries. Produce only the requested output."
Design for Composability
The most powerful skills are designed to work together. If a skill produces a structured intermediate output — say, a JSON summary of code review findings — it can feed into a downstream skill that drafts GitHub issue tickets from that JSON. When you're designing a skill, think about what format its output would need to be in to be machine-readable downstream.
Control Who Can Trigger It
- Set
disable-model-invocation: truefor anything involving irreversible external actions — sending emails, deploying code, writing to production databases. - Set
user-invocable: falsefor skills that serve as background knowledge modules. - Leave both at their defaults for low-stakes interactive skills where it doesn't matter who initiates them.
Test and Iterate
Treat prompt engineering like any other empirical process. The first version rarely produces optimal results. Run the skill against real inputs, review what comes back, and refine based on what you see. Keep a log of test cases and expected outputs as the skill evolves — especially for anything in a production workflow. That way, when you change the prompt, you can check whether it broke something that was already working.
Common Pitfalls
| Do this | Not this |
|---|---|
| Write precise instructions with examples | Vague prompts like "help me with my code" |
| Use specific register: "Write with the brevity of a newspaper subheading" | Generic tone words: "Use a professional tone" |
| Affirmative framing: "Write in plain prose with no headers" | Negative-only constraints: "Don't use bullet points" |
| Keep SKILL.md to process steps; put domain knowledge in REFERENCE.md | Dumping everything into SKILL.md |
| Natural language: "Use this tool when the user mentions X" | Aggressive caps: "CRITICAL: You MUST ALWAYS..." |
3–5 diverse, consistent examples in <example> tags | Inconsistent examples that teach the wrong pattern |
| Build for general inputs; ask when unclear | Hard-code logic for specific test cases |
| Test against real inputs; track cases over time | Ship without testing |
Over-engineering is a real problem. Sprawling SKILL.md files with branching logic and extensive feature coverage that rarely gets used make skills harder to maintain and more likely to confuse Claude. Start with the minimum viable version. Add complexity only when a specific real-world failure demands it.
Neglecting the description is the other big one. Because Claude uses this field to decide when to auto-trigger, a vague description leads to both under-triggering (Claude misses it when it should apply) and over-triggering (Claude invokes it in wrong contexts). A good description includes concrete trigger phrases, the type of input expected, and the type of output produced.
Good vs. Poor Skill Prompts: Side by Side
Commit Messages
Poor: Write a commit message for the changes.
No format, no convention, no constraints. Claude will produce something, but it'll vary every time and won't match your project standards.
Better:
You are a software engineer following the Conventional Commits specification.
Review the output of `git diff --staged` and write a commit message that:
1. Has a subject line in the imperative mood, under 72 characters
2. Uses one of these types: feat, fix, docs, style, refactor, test, chore
3. Includes a blank line between subject and body
4. Describes *why* the change was made, not just what changed
Output only the commit message text. Do not include any explanation.
This works because it names a specific standard, gives concrete rules with reasoning, and nails down exactly what the output should be.
Code Review
Poor: Review my code and tell me if there are any issues.
"Issues" is undefined. Claude might fixate on minor style preferences, miss a security bug, or produce something too terse or too long to be useful.
Better:
You are a senior engineer conducting a pre-merge code review.
Review the provided code for the following concerns, in this priority order:
1. Security vulnerabilities (injection, auth bypass, data exposure)
2. Correctness bugs (off-by-one errors, null/undefined handling, race conditions)
3. Performance issues (N+1 queries, unbounded loops, unnecessary allocations)
4. Readability (naming, complexity, missing docstrings on public functions)
For each issue found:
- State the category and severity (critical / major / minor)
- Quote the specific line(s) involved
- Explain why it is a problem
- Suggest a concrete fix
If no issues are found in a category, omit that category. Do not pad the
response with compliments or summaries. End with a one-line verdict:
APPROVE, REQUEST CHANGES, or NEEDS DISCUSSION.
Every ambiguity in the poor version is gone. Scope is defined, priority is clear, output format is specified, and the closing is constrained to a single line.
Summarization
Poor: Summarize the document.
Without a defined length, audience, or purpose, summaries can range from one sentence to multiple pages.
Better:
You are a research analyst summarizing a source document for a busy executive
who needs to decide whether the document requires their attention.
Produce a structured summary with exactly three sections:
1. TLDR (2 sentences): What this document is and its single most important point
2. Key Findings (3-5 bullet points): The most consequential facts or conclusions
3. Action Required (1 sentence or 'No action required'): What, if anything, the
reader needs to decide or do based on this document
Total length: no more than 200 words. Use plain language; avoid jargon.
Ground every claim in the source document. Do not add interpretation.
Fixed structure, hard word limit, defined audience, binary final section. The key decisions are made in the prompt, not left to Claude — which is exactly why this produces consistent output across completely different source documents.