# Claude API Cheat Sheet: SDK, CLI, MCP & Prompting

By Vishnu Damwala
## Quick reference tables
### Models

| Model ID | Context | Best for |
|---|---|---|
| `claude-opus-4-6` | 200K tokens | Complex reasoning, research, long documents |
| `claude-sonnet-4-6` | 200K tokens | Balanced speed + quality (default choice) |
| `claude-haiku-4-5-20251001` | 200K tokens | Fast, lightweight, high-volume tasks |
### Anthropic API — core requests

| Task | What to use |
|---|---|
| Chat completion | `POST /v1/messages` |
| Streaming response | `stream: true` in request body |
| Count tokens | `POST /v1/messages/count_tokens` |
| List models | `GET /v1/models` |
| Create a batch | `POST /v1/messages/batches` |
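The endpoints above can also be called without the SDK. A minimal sketch of building a raw `POST /v1/messages` request (the header names and `2023-06-01` version string follow Anthropic's public API docs; the placeholder key is illustrative):

```javascript
// Build a raw Messages API request suitable for fetch(), without the SDK.
function buildMessagesRequest(apiKey, body) {
  return {
    url: "https://api.anthropic.com/v1/messages",
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

const req = buildMessagesRequest("sk-ant-...", {
  model: "claude-sonnet-4-6",
  max_tokens: 256,
  messages: [{ role: "user", content: "Hello" }],
});

// Send with: await fetch(req.url, { method: req.method, headers: req.headers, body: req.body })
```

The SDK examples further down wrap exactly this request shape.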
### Messages API — key parameters

| Parameter | Type | What it does |
|---|---|---|
| `model` | string | Which Claude model to use |
| `max_tokens` | int | Maximum tokens in the response |
| `messages` | array | Conversation history `[{role, content}]` |
| `system` | string | System prompt |
| `temperature` | float 0–1 | Randomness (0 = deterministic) |
| `top_p` | float 0–1 | Nucleus sampling |
| `top_k` | int | Token sampling pool size |
| `stop_sequences` | array | Strings that stop generation |
| `tools` | array | Tool/function definitions |
| `tool_choice` | object | Force tool use (`auto`, `any`, `tool`) |
| `stream` | bool | Stream tokens as they generate |
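Most of these parameters can appear together in one request body. A sketch with illustrative values (in practice, tune `temperature` or `top_p`, not both):

```javascript
// One request body combining the key Messages API parameters.
const body = {
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  system: "You are a terse assistant.",
  messages: [{ role: "user", content: "Name three sorting algorithms." }],
  temperature: 0.2,          // low randomness for factual answers
  stop_sequences: ["\n\n"],  // stop at the first blank line
  stream: false,             // wait for the full response
};
```

Pass this object directly to `client.messages.create(body)`.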
### Claude Code CLI — essential commands

Slash commands are typed inside an interactive session, not passed as shell arguments.

| Command | What it does |
|---|---|
| `claude` | Start interactive REPL |
| `claude "fix this bug"` | Start the REPL with an initial prompt |
| `claude -p "prompt"` | Non-interactive, print output and exit |
| `claude --model claude-opus-4-6` | Use a specific model |
| `claude --no-stream` | Disable streaming |
| `/help` | Show available slash commands |
| `/clear` | Clear conversation history |
| `/compact` | Compact context to save tokens |
| `/commit` | Auto-generate and create a git commit |
| `/review-pr 123` | Review a pull request |
| `/cost` | Show token usage and cost for the session |
| `/doctor` | Check Claude Code health |
| `/init` | Create a CLAUDE.md for this repo |
### Claude Code CLI — flags

| Flag | What it does |
|---|---|
| `--model` | Specify model ID |
| `--api-key` | Pass API key directly |
| `--max-tokens` | Override max output tokens |
| `--add-dir /path` | Add directory to working context |
| `--print` / `-p` | Print output without REPL |
| `--output-format json` | JSON output (for scripting) |
| `--output-format stream-json` | Streaming JSON output |
| `--verbose` | Show full tool call details |
| `--no-stream` | Wait for full response |
| `--dangerously-skip-permissions` | Skip tool permission prompts |
### MCP — Model Context Protocol

| Command / Concept | What it does |
|---|---|
| `claude mcp add name url` | Add an MCP server by URL |
| `claude mcp add name -- cmd args` | Add a local MCP server via stdio |
| `claude mcp list` | List configured MCP servers |
| `claude mcp remove name` | Remove an MCP server |
| Scope: `local` | Available in current project only |
| Scope: `user` | Available across all projects |
| Scope: `project` | Shared via `.mcp.json` in repo |
| `CLAUDE.md` | Project instructions Claude reads on start |
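A project-scoped server can also be declared directly in `.mcp.json` at the repo root and committed, so the whole team shares it. A minimal sketch (the server name and package are placeholders, not a real server):

```json
{
  "mcpServers": {
    "my-server": {
      "command": "npx",
      "args": ["-y", "@example/mcp-server"]
    }
  }
}
```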
### Token limits (approx.)

| Model | Input limit | Output limit |
|---|---|---|
| Opus 4.6 | 200K tokens | 32K tokens |
| Sonnet 4.6 | 200K tokens | 64K tokens |
| Haiku 4.5 | 200K tokens | 8K tokens |
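A request fails if `max_tokens` exceeds the model's output cap, so it can be worth clamping before sending. A small sketch using the approximate caps from the table above (the helper name is ours, not part of the SDK):

```javascript
// Approximate output caps per model, taken from the table above.
const OUTPUT_LIMITS = {
  "claude-opus-4-6": 32000,
  "claude-sonnet-4-6": 64000,
  "claude-haiku-4-5-20251001": 8000,
};

// Clamp a requested max_tokens to the model's output cap.
function clampMaxTokens(model, requested) {
  const cap = OUTPUT_LIMITS[model];
  if (cap === undefined) throw new Error(`Unknown model: ${model}`);
  return Math.min(requested, cap);
}

console.log(clampMaxTokens("claude-haiku-4-5-20251001", 16000)); // 8000
```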
### Prompt caching

| Feature | What it does |
|---|---|
| `cache_control: {type: "ephemeral"}` | Cache a content block (5-min TTL) |
| Cache hit | ~90% cheaper, ~85% faster than full prompt |
| Minimum cacheable size | 1024 tokens (Opus/Sonnet), 2048 (Haiku) |
| Cacheable blocks | Tools, system prompt, messages |
## Detailed sections
### Basic API call (Node.js)

```javascript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain async/await in JavaScript." }],
});

console.log(message.content[0].text);
```
### Streaming response

```javascript
const stream = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  stream: true,
  messages: [{ role: "user", content: "Write a short story." }],
});

for await (const event of stream) {
  if (event.type === "content_block_delta") {
    process.stdout.write(event.delta.text);
  }
}
```
### System prompt + multi-turn conversation

```javascript
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 2048,
  system: "You are a senior backend engineer. Be concise and precise.",
  messages: [
    { role: "user", content: "What's wrong with N+1 queries?" },
    { role: "assistant", content: "N+1 queries happen when..." },
    { role: "user", content: "How do I fix it in PostgreSQL?" },
  ],
});
```
### Tool use (function calling)

```javascript
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  tools: [
    {
      name: "get_weather",
      description: "Get current weather for a city",
      input_schema: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name" },
        },
        required: ["city"],
      },
    },
  ],
  messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
});

// Check if Claude wants to use a tool
if (response.stop_reason === "tool_use") {
  const toolUse = response.content.find((b) => b.type === "tool_use");
  console.log(toolUse.name, toolUse.input); // get_weather { city: 'Tokyo' }
}
```
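After running the tool yourself, you return its output in a `tool_result` block on the next request. A self-contained local sketch of that follow-up body (the `toolUse` stub stands in for the block Claude returned, and `getWeather` is a hypothetical local implementation):

```javascript
// Stub of the tool_use block from the previous response (shape per the Messages API).
const toolUse = {
  type: "tool_use",
  id: "toolu_abc123",
  name: "get_weather",
  input: { city: "Tokyo" },
};

// Hypothetical local implementation of the tool.
function getWeather(city) {
  return `22°C and sunny in ${city}`;
}

// Follow-up request body: echo the assistant turn, then supply the tool result.
const followUp = {
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "What's the weather in Tokyo?" },
    { role: "assistant", content: [toolUse] },
    {
      role: "user",
      content: [
        {
          type: "tool_result",
          tool_use_id: toolUse.id,
          content: getWeather(toolUse.input.city),
        },
      ],
    },
  ],
};
```

Send `followUp` with `client.messages.create`, keeping the same `tools` array, and Claude answers using the result.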
### Prompt caching — reduce costs on large system prompts

```javascript
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: "You are an expert codebase assistant...\n\n[large context here]",
      cache_control: { type: "ephemeral" }, // cache this block
    },
  ],
  messages: [{ role: "user", content: "Explain the auth module." }],
});

// Subsequent calls with same system block → cache hit → ~90% cheaper
```
### Claude Code — useful workflows

```bash
# One-shot: explain code without starting the REPL
claude -p "Explain what this does" < src/utils/parser.ts

# Pipe output from another command
git diff | claude -p "Summarize what changed in plain English"

# Use in scripts
SUMMARY=$(claude -p "Summarize this log" < app.log)
echo "$SUMMARY"

# Ask Claude to write tests
claude "Write unit tests for src/auth/login.ts using Vitest"

# Ask Claude to fix a failing test
claude "The test in auth.test.ts is failing — fix it"

# Review a PR: start a session, then type /review-pr 42
claude
```
### CLAUDE.md — project instructions

Create `CLAUDE.md` in your repo root. Claude Code reads it on every session:

```markdown
# Project: My App

## Stack
- Node.js 20, TypeScript, Fastify, PostgreSQL
- Tests: Vitest, run with `pnpm test`
- Lint: `pnpm lint` (ESLint + Prettier)

## Conventions
- Use named exports, no default exports
- Prefer `async/await` over `.then()`
- All DB queries go in `src/db/queries/`

## Commands
- `pnpm dev` — start dev server
- `pnpm build` — production build
- `pnpm test` — run tests
```
### Batch API — process many prompts at once

```javascript
// Create a batch (async, ~1hr processing)
const batch = await client.messages.batches.create({
  requests: [
    {
      custom_id: "req-1",
      params: {
        model: "claude-haiku-4-5-20251001",
        max_tokens: 256,
        messages: [{ role: "user", content: "Translate: Hello world" }],
      },
    },
    // ... up to 10,000 requests
  ],
});

// Poll for completion
const result = await client.messages.batches.retrieve(batch.id);
console.log(result.processing_status); // "ended" when done
```
Batch API is ~50% cheaper than individual calls. Good for bulk classification, data extraction, report generation.
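For bulk jobs, it helps to generate the `requests` array programmatically so each result can be matched back to its input via `custom_id`. A small sketch (the helper name and defaults are ours):

```javascript
// Build a Batch API requests array from a list of prompts.
// custom_id ties each batch result back to its originating prompt.
function toBatchRequests(prompts, model = "claude-haiku-4-5-20251001") {
  return prompts.map((content, i) => ({
    custom_id: `req-${i + 1}`,
    params: {
      model,
      max_tokens: 256,
      messages: [{ role: "user", content }],
    },
  }));
}

const requests = toBatchRequests(["Translate: Hello", "Translate: Goodbye"]);
console.log(requests[1].custom_id); // "req-2"
```

Pass the result straight to `client.messages.batches.create({ requests })`.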
### Environment setup

```bash
# Install SDK
npm install @anthropic-ai/sdk

# Set API key
export ANTHROPIC_API_KEY=sk-ant-...

# Or use .env
echo "ANTHROPIC_API_KEY=sk-ant-..." >> .env
```