Claude API Cheat Sheet: SDK, CLI, MCP & Prompting

By Vishnu | Updated: Apr 5, 2026
TL;DR
  • Models: claude-opus-4-6 (200K context, best quality), claude-sonnet-4-6 (200K context, balanced), claude-haiku-4-5 (200K context, fast)
  • Use cache_control: {type: "ephemeral"} to cache large system prompts (cache hits are ~90% cheaper)
  • MCP (Model Context Protocol): claude mcp add name url for external tools
  • Batch API: ~50% cheaper for bulk processing, up to 10,000 requests per batch
  • Claude Code CLI: claude /compact to save context space

Quick reference tables

Models

| Model ID | Context | Best for |
|---|---|---|
| claude-opus-4-6 | 200K tokens | Complex reasoning, research, long documents |
| claude-sonnet-4-6 | 200K tokens | Balanced speed + quality (default choice) |
| claude-haiku-4-5-20251001 | 200K tokens | Fast, lightweight, high-volume tasks |

Anthropic API — core requests

| Task | What to use |
|---|---|
| Chat completion | POST /v1/messages |
| Streaming response | stream: true in request body |
| Count tokens | POST /v1/messages/count_tokens |
| List models | GET /v1/models |
| Create a batch | POST /v1/messages/batches |

Messages API — key parameters

| Parameter | Type | What it does |
|---|---|---|
| model | string | Which Claude model to use |
| max_tokens | int | Maximum tokens in response |
| messages | array | Conversation history [{role, content}] |
| system | string | System prompt |
| temperature | float 0–1 | Randomness (lower = more deterministic; 0 is not guaranteed to be fully deterministic) |
| top_p | float 0–1 | Nucleus sampling |
| top_k | int | Token sampling pool size |
| stop_sequences | array | Strings that stop generation |
| tools | array | Tool/function definitions |
| tool_choice | object | Control tool use (auto, any, tool) |
| stream | bool | Stream tokens as they generate |

Claude Code CLI — essential commands

| Command | What it does |
|---|---|
| claude | Start interactive REPL |
| claude "fix this bug" | Start the REPL with an initial prompt |
| claude -p "prompt" | Non-interactive: print the response and exit |
| claude --model claude-opus-4-6 | Use a specific model |
| claude --no-stream | Disable streaming |
| claude /help | Show available slash commands |
| claude /clear | Clear conversation history |
| claude /compact | Compact context to save tokens |
| claude /commit | Auto-generate and create git commit |
| claude /review-pr 123 | Review a pull request |
| claude /cost | Show token usage and cost for session |
| claude /doctor | Check Claude Code health |
| claude /init | Create a CLAUDE.md for this repo |

Claude Code CLI — flags

| Flag | What it does |
|---|---|
| --model | Specify model ID |
| --api-key | Pass API key directly |
| --max-tokens | Override max output tokens |
| --add-dir /path | Add directory to working context |
| --print / -p | Print output without REPL |
| --output-format json | JSON output (for scripting) |
| --output-format stream-json | Streaming JSON output |
| --verbose | Show full tool call details |
| --no-stream | Wait for full response |
| --dangerously-skip-permissions | Skip tool permission prompts |

MCP — Model Context Protocol

| Command / Concept | What it does |
|---|---|
| claude mcp add name url | Add an MCP server by URL |
| claude mcp add name -- cmd args | Add a local MCP server via stdio |
| claude mcp list | List configured MCP servers |
| claude mcp remove name | Remove an MCP server |
| MCP scope local | Available in current project only |
| MCP scope user | Available across all projects |
| MCP scope project | Shared via .mcp.json in repo |
| CLAUDE.md | Project instructions Claude reads on start |

Token limits (approx.)

| Model | Input limit | Output limit |
|---|---|---|
| Opus 4.6 | 200K tokens | 32K tokens |
| Sonnet 4.6 | 200K tokens | 64K tokens |
| Haiku 4.5 | 200K tokens | 8K tokens |

Prompt caching

| Feature | What it does |
|---|---|
| cache_control: {type: "ephemeral"} | Cache a content block (5-min TTL) |
| Cache hit | ~90% cheaper, ~85% faster than full prompt |
| Minimum cacheable size | 1024 tokens (Opus/Sonnet), 2048 (Haiku) |
| Cached blocks | Tools, system prompt, messages |


Detailed sections

Basic API call (Node.js)

javascript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain async/await in JavaScript." }],
});

console.log(message.content[0].text);
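
The SDK call above maps onto POST /v1/messages from the quick reference table. As a rough sketch of the same request without the SDK, the helper below only builds the arguments for fetch() and sends nothing. `buildMessagesRequest` is a hypothetical name, and the `anthropic-version` value is an assumption to check against the API reference.

```javascript
// Hypothetical helper: builds the URL and options for fetch() without sending anything.
function buildMessagesRequest(apiKey, body) {
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01", // assumed API version header value
        "content-type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

const request = buildMessagesRequest(process.env.ANTHROPIC_API_KEY, {
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain async/await in JavaScript." }],
});

// To send it (Node 18+ ships a global fetch):
// const res = await fetch(request.url, request.init);
// const message = await res.json();
```

Keeping request construction separate from sending makes the headers and body easy to inspect or log before anything touches the network.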

Streaming response

javascript
const stream = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  stream: true,
  messages: [{ role: "user", content: "Write a short story." }],
});

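for await (const event of stream) {
  // Only text deltas carry .text; other delta types (e.g. input_json_delta) do not
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    process.stdout.write(event.delta.text);
  }
}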
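
If the complete text is needed once streaming finishes, the deltas can be concatenated as they arrive. This is a minimal sketch: `collectText` is a hypothetical helper, and `fakeStream` is a stand-in for a real stream so the snippet runs without an API key.

```javascript
// Hypothetical helper: concatenates the text deltas from a stream of Messages API events.
async function collectText(events) {
  let text = "";
  for await (const event of events) {
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      text += event.delta.text;
    }
  }
  return text;
}

// Stand-in for a real stream: an async generator yielding a few made-up events.
async function* fakeStream() {
  yield { type: "message_start" };
  yield { type: "content_block_delta", delta: { type: "text_delta", text: "Once upon " } };
  yield { type: "content_block_delta", delta: { type: "text_delta", text: "a time." } };
  yield { type: "message_stop" };
}

collectText(fakeStream()).then((story) => console.log(story)); // prints "Once upon a time."
```

With the real SDK, the object returned by `client.messages.create({ stream: true, ... })` would be passed in place of `fakeStream()`.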

System prompt + multi-turn conversation

javascript
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 2048,
  system: "You are a senior backend engineer. Be concise and precise.",
  messages: [
    { role: "user", content: "What's wrong with N+1 queries?" },
    { role: "assistant", content: "N+1 queries happen when..." },
    { role: "user", content: "How do I fix it in PostgreSQL?" },
  ],
});
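
The multi-turn example works because every request carries the full history; the API keeps no conversation state between calls. One way to manage that, sketched here with a hypothetical `addTurn` helper, is to append each reply to an array before asking the next question.

```javascript
// Hypothetical helper: returns a new history array with one more turn appended.
function addTurn(history, role, content) {
  return [...history, { role, content }];
}

let history = [];
history = addTurn(history, "user", "What's wrong with N+1 queries?");

// After each API call, store Claude's reply before adding the next question:
// const response = await client.messages.create({ model, max_tokens, system, messages: history });
// history = addTurn(history, "assistant", response.content[0].text);
history = addTurn(history, "assistant", "N+1 queries happen when...");

history = addTurn(history, "user", "How do I fix it in PostgreSQL?");
console.log(history.length); // 3
```

Returning a new array instead of mutating the old one keeps earlier snapshots of the conversation intact, which helps when retrying a failed request.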

Tool use (function calling)

javascript
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  tools: [
    {
      name: "get_weather",
      description: "Get current weather for a city",
      input_schema: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name" },
        },
        required: ["city"],
      },
    },
  ],
  messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
});

// Check if Claude wants to use a tool
if (response.stop_reason === "tool_use") {
  const toolUse = response.content.find((b) => b.type === "tool_use");
  console.log(toolUse.name, toolUse.input); // get_weather { city: 'Tokyo' }
}
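
Detecting the tool call is only half of the loop: the tool's output goes back to Claude as a `tool_result` block in a new user turn. The sketch below shows one way to build that follow-up. `withToolResult` and `getWeather` are hypothetical names, the weather data is canned, and the mock response stands in for a real one. It handles a single tool call per response.

```javascript
// Hypothetical local implementation of the get_weather tool (returns canned data).
function getWeather({ city }) {
  return { city, forecast: "sunny", temperature_c: 21 };
}

// Builds the messages for the follow-up request: the original conversation,
// Claude's tool_use turn, then a user turn carrying the tool_result block.
function withToolResult(messages, response, runTool) {
  const toolUse = response.content.find((block) => block.type === "tool_use");
  if (!toolUse) return messages; // no tool call to answer
  const output = runTool(toolUse.input);
  return [
    ...messages,
    { role: "assistant", content: response.content },
    {
      role: "user",
      content: [
        {
          type: "tool_result",
          tool_use_id: toolUse.id, // must match the id of the tool_use block
          content: JSON.stringify(output),
        },
      ],
    },
  ];
}

// Stand-in for a real API response with stop_reason "tool_use" (the id is made up).
const mockResponse = {
  stop_reason: "tool_use",
  content: [
    { type: "tool_use", id: "toolu_example", name: "get_weather", input: { city: "Tokyo" } },
  ],
};

const followUp = withToolResult(
  [{ role: "user", content: "What's the weather in Tokyo?" }],
  mockResponse,
  getWeather
);
```

Passing `followUp` as `messages`, with the same `tools` array, to `client.messages.create` lets Claude turn the tool output into a final answer.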

Prompt caching — reduce costs on large system prompts

javascript
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: "You are an expert codebase assistant...\n\n[large context here]",
      cache_control: { type: "ephemeral" }, // cache this block
    },
  ],
  messages: [{ role: "user", content: "Explain the auth module." }],
});
// Subsequent calls with same system block → cache hit → ~90% cheaper
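
To confirm that a call actually hit the cache, the response's `usage` object can be inspected. A minimal sketch, assuming the `cache_creation_input_tokens` and `cache_read_input_tokens` fields are present on `usage`: `describeCacheUsage` is a hypothetical helper and the token counts are made up for illustration.

```javascript
// Hypothetical helper: summarizes the cache-related fields of a `usage` object.
function describeCacheUsage(usage) {
  const written = usage.cache_creation_input_tokens ?? 0; // tokens written to the cache
  const read = usage.cache_read_input_tokens ?? 0; // tokens served from the cache
  const uncached = usage.input_tokens ?? 0; // tokens processed normally
  const total = written + read + uncached;
  return {
    cacheHit: read > 0,
    cachedShare: total === 0 ? 0 : read / total,
  };
}

// Stand-in usage objects (token counts are invented for this example).
const firstCall = { input_tokens: 12, cache_creation_input_tokens: 3000, cache_read_input_tokens: 0 };
const secondCall = { input_tokens: 12, cache_creation_input_tokens: 0, cache_read_input_tokens: 3000 };

console.log(describeCacheUsage(firstCall).cacheHit); // false (the cache was written, not read)
console.log(describeCacheUsage(secondCall).cacheHit); // true
```

With a real response, `describeCacheUsage(response.usage)` would be called after each request; a first call that reports neither a write nor a read usually means the block was below the minimum cacheable size.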

Claude Code — useful workflows

bash
# One-shot: explain code without starting REPL
claude -p "Explain what this does" < src/utils/parser.ts

# Pipe output from another command
git diff | claude -p "Summarize what changed in plain English"

# Use in scripts
SUMMARY=$(claude -p "Summarize this log" < app.log)
echo "$SUMMARY"

# Ask Claude to write tests
claude "Write unit tests for src/auth/login.ts using Vitest"

# Ask Claude to fix a failing test
claude "The test in auth.test.ts is failing — fix it"

# Review a PR
claude /review-pr 42

CLAUDE.md — project instructions

Create CLAUDE.md in your repo root. Claude Code reads it at the start of every session:

markdown
# Project: My App

## Stack
- Node.js 20, TypeScript, Fastify, PostgreSQL
- Tests: Vitest, run with `pnpm test`
- Lint: `pnpm lint` (ESLint + Prettier)

## Conventions
- Use named exports, no default exports
- Prefer `async/await` over `.then()`
- All DB queries go in `src/db/queries/`

## Commands
- `pnpm dev` — start dev server
- `pnpm build` — production build
- `pnpm test` — run tests

Batch API — process many prompts at once

javascript
// Create a batch (async, ~1hr processing)
const batch = await client.messages.batches.create({
  requests: [
    {
      custom_id: "req-1",
      params: {
        model: "claude-haiku-4-5-20251001",
        max_tokens: 256,
        messages: [{ role: "user", content: "Translate: Hello world" }],
      },
    },
    // ... up to 10,000 requests
  ],
});

// Check status once; repeat this call on an interval until it reports "ended"
const result = await client.messages.batches.retrieve(batch.id);
console.log(result.processing_status); // "in_progress" while running, "ended" when done

The Batch API is ~50% cheaper than individual calls, which makes it a good fit for bulk classification, data extraction, and report generation.
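
For real workloads the `requests` array is usually generated, and the status check runs in a loop. The sketch below shows one way to do both. `buildBatchRequests` and `waitForBatch` are hypothetical helpers, and the retrieve function is injected so the loop can be exercised without network access.

```javascript
// Hypothetical helper: turns an array of prompts into Batch API request entries.
function buildBatchRequests(prompts, { model, maxTokens }) {
  return prompts.map((prompt, index) => ({
    custom_id: `req-${index + 1}`, // used to match results back to inputs
    params: {
      model,
      max_tokens: maxTokens,
      messages: [{ role: "user", content: prompt }],
    },
  }));
}

// Hypothetical helper: polls until the batch reports processing_status "ended".
// `retrieve` would typically be (id) => client.messages.batches.retrieve(id).
async function waitForBatch(retrieve, batchId, intervalMs = 60000) {
  for (;;) {
    const batch = await retrieve(batchId);
    if (batch.processing_status === "ended") return batch;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

const requests = buildBatchRequests(
  ["Translate: Hello world", "Translate: Good morning"],
  { model: "claude-haiku-4-5-20251001", maxTokens: 256 }
);
console.log(requests.map((r) => r.custom_id)); // [ 'req-1', 'req-2' ]

// const batch = await client.messages.batches.create({ requests });
// const done = await waitForBatch((id) => client.messages.batches.retrieve(id), batch.id);
```

A one-minute default interval is a guess that suits jobs taking around an hour; a production version would also want an overall timeout so the loop cannot run indefinitely.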

Environment setup

bash
# Install SDK
npm install @anthropic-ai/sdk

# Set API key
export ANTHROPIC_API_KEY=sk-ant-...

# Or use .env
echo "ANTHROPIC_API_KEY=sk-ant-..." >> .env

# Install the Python SDK
pip install anthropic

python
# Python SDK
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)

Wondering how Claude stacks up? Read Claude vs Gemini 2.5 for Coding: Honest Comparison.