Clawist
🟡 Intermediate13 min readBy Lin6

Using Claude API with OpenClaw: Complete Configuration Guide

Claude's API is the brain behind OpenClaw—the AI model that powers every conversation, task, and decision your agent makes. Understanding how to configure, optimize, and troubleshoot Claude API integration is essential for running a production-ready OpenClaw instance.

This guide covers everything from initial setup to advanced configuration, cost optimization, and best practices for using Claude API at scale.

Understanding the Claude API

Anthropic's Claude API provides programmatic access to their language models. Unlike ChatGPT's web interface, the API is designed for building AI applications—exactly what OpenClaw does.

Available Models (2026)

Claude Opus 4 (claude-opus-4)

  • Best for: Complex reasoning, coding, analysis, creative writing
  • Context window: 200,000 tokens (~150,000 words)
  • Cost: $15 input / $75 output per million tokens
  • Speed: ~80 tokens/second
  • Use when: Quality matters more than cost

Claude Sonnet 4 (claude-sonnet-4)

  • Best for: Balanced performance, general tasks, routine automation
  • Context window: 200,000 tokens
  • Cost: $3 input / $15 output per million tokens
  • Speed: ~120 tokens/second
  • Use when: You need good quality at reasonable cost (default choice)

Claude Haiku 4 (claude-haiku-4)

  • Best for: Simple tasks, quick responses, high-volume automation
  • Context window: 200,000 tokens
  • Cost: $0.25 input / $1.25 output per million tokens
  • Speed: ~200 tokens/second
  • Use when: Speed and cost matter more than nuance

How OpenClaw Uses the API

Every time your OpenClaw agent thinks or responds, it:

  1. Loads context (memory files, recent messages, tool definitions)
  2. Sends this context + user message to Claude API
  3. Receives Claude's response (text + tool calls)
  4. Executes any requested tool calls
  5. Sends results back to Claude
  6. Repeats until task is complete

This back-and-forth can involve 3-10+ API calls per task, so understanding costs and optimization is crucial.

Getting Your Claude API Key

Step 1: Create Anthropic Account

  1. Visit console.anthropic.com
  2. Sign up with email or Google/GitHub
  3. Verify your email address

Step 2: Add Payment Method

Anthropic requires a payment method for API access:

  1. Navigate to Account Settings → Billing
  2. Add credit card or PayPal
  3. Set up budget alerts (recommended: $50/month to start)

Step 3: Generate API Key

  1. Go to Account Settings → API Keys
  2. Click "Create Key"
  3. Give it a descriptive name (e.g., "OpenClaw Production")
  4. Copy the key immediately—you won't see it again
  5. Store securely (password manager, environment variable)

Important: Treat API keys like passwords. Never commit them to Git or share publicly.

Step 4: Set Usage Limits

Prevent runaway costs:

  1. Go to Account Settings → Usage Limits
  2. Set monthly budget cap (e.g., $100)
  3. Enable email notifications at 50%, 75%, 90%

Configuring OpenClaw with Claude API

Basic Configuration

# Set API key
openclaw config set claude.apiKey "sk-ant-api03-..."

# Set default model
openclaw config set claude.defaultModel "claude-sonnet-4"

# Verify configuration
openclaw config get claude

Output:

claude:
  apiKey: "sk-ant-api03-***" (hidden)
  defaultModel: "claude-sonnet-4"
  maxTokens: 8192
  temperature: 1.0

Advanced Configuration

Edit ~/.openclaw/config.yaml directly for fine-grained control:

claude:
  # API credentials
  apiKey: "sk-ant-api03-your-key-here"
  
  # Model defaults
  defaultModel: "claude-sonnet-4"
  mainSessionModel: "claude-opus-4"      # Use Opus for main chat
  subagentModel: "claude-sonnet-4"       # Sonnet for sub-agents
  cronModel: "claude-haiku-4"            # Haiku for scheduled tasks
  
  # Generation parameters
  maxTokens: 8192                        # Max output per response
  temperature: 1.0                       # Creativity (0.0-1.0)
  topP: 0.95                             # Nucleus sampling
  
  # Rate limiting
  requestsPerMinute: 50                  # Max API calls/minute
  concurrentRequests: 5                  # Max parallel requests
  
  # Retry behavior
  maxRetries: 3                          # Retry failed requests
  retryDelayMs: 1000                     # Initial retry delay
  retryBackoff: 2.0                      # Exponential backoff multiplier
  
  # Cost controls
  maxCostPerHour: 5.0                    # Halt if exceeds $5/hour
  warnThreshold: 0.5                     # Warn at $0.50 per request
  
  # Extended thinking
  thinkingEnabled: true                  # Enable reasoning mode
  thinkingLevel: "low"                   # low, medium, high
  budgetTokens: 10000                    # Tokens allocated for thinking

After editing, restart Gateway:

openclaw gateway restart

Model Selection Strategies

Different tasks need different models. Here's how to optimize:

Per-Session Model Selection

# Main human conversation: Best quality
mainSessionModel: "claude-opus-4"

# Sub-agents (isolated tasks): Balanced
subagentModel: "claude-sonnet-4"

# Cron jobs (autonomous tasks): Cost-effective
cronModel: "claude-haiku-4"

Per-Task Model Selection

In cron job configs:

cron:
  - id: simple-check
    schedule: "*/15 * * * *"
    task: "Check for new emails"
    model: "claude-haiku-4"           # Simple = Haiku
  
  - id: content-generation
    schedule: "0 9 * * *"
    task: "Write daily blog post"
    model: "claude-sonnet-4"          # Moderate = Sonnet
  
  - id: strategic-analysis
    schedule: "0 9 * * 1"
    task: "Competitive analysis report"
    model: "claude-opus-4"            # Complex = Opus

Dynamic Model Selection

For ultimate optimization, use task complexity to choose models:

# In AGENTS.md, add guidelines:
## Model Selection

- Use Haiku for: checking status, simple queries, routine confirmations
- Use Sonnet for: writing content, coding, moderate analysis
- Use Opus for: strategic decisions, complex coding, creative work

When in doubt, start with Sonnet. Upgrade to Opus only if quality is insufficient.

Cost Optimization

Claude API costs add up quickly if you're not careful. Here's how to optimize:

1. Choose the Right Model

Example: Daily blog post (1000 words)

Using Opus 4:

  • Input: ~10k tokens (context) × $15 = $0.15
  • Output: ~2k tokens × $75 = $0.15
  • Total: $0.30 per post

Using Sonnet 4:

  • Input: ~10k tokens × $3 = $0.03
  • Output: ~2k tokens × $15 = $0.03
  • Total: $0.06 per post

5x cost savings for marginally lower quality on routine content.

2. Reduce Context Size

Claude charges for every token sent:

# Before optimization: Loading everything
- SOUL.md: 500 tokens
- USER.md: 300 tokens
- AGENTS.md: 2000 tokens
- MEMORY.md: 5000 tokens
- Daily logs: 3000 tokens
- Tool definitions: 4000 tokens
Total: 14,800 tokens per request

# After optimization: Trim the fat
- SOUL.md: 200 tokens (concise)
- USER.md: 150 tokens (essential only)
- AGENTS.md: 1000 tokens (core rules)
- MEMORY.md: 2000 tokens (curated)
- Daily logs: 1000 tokens (today only)
- Tool definitions: 2000 tokens (relevant only)
Total: 6,350 tokens per request

Cost savings: 57% reduction in input tokens

3. Cache Repeated Context

Anthropic's API supports prompt caching (if available):

claude:
  enableCaching: true
  cacheSystemPrompt: true            # Cache SOUL.md, AGENTS.md
  cacheTTL: 3600                     # Cache for 1 hour

This can reduce costs by 50-90% for repeated context.

4. Batch Operations

Instead of 10 separate API calls:

# Inefficient: 10 calls
1. "Write post about AI"
2. "Write post about Claude"
3. "Write post about automation"
[etc...]

# Efficient: 1 call
"Write 10 blog posts on these topics: [list]. Format as JSON array with title, content, description for each."

Reduces overhead and API call costs.

5. Use Streaming for Long Responses

Enable streaming to start processing output before completion:

claude:
  streaming: true

This doesn't reduce costs but improves perceived speed and allows early termination if output is going wrong.

Rate Limiting and Quotas

Anthropic enforces rate limits to prevent abuse:

Default Limits (Tier 1)

  • Requests per minute: 50
  • Tokens per minute: 40,000
  • Tokens per day: 1,000,000

Handling Rate Limits

OpenClaw automatically retries with exponential backoff:

claude:
  maxRetries: 3
  retryDelayMs: 1000                 # Start with 1 second
  retryBackoff: 2.0                  # Double each retry (1s, 2s, 4s)

If you hit rate limits frequently:

  1. Request tier increase: Email Anthropic support with usage details
  2. Reduce concurrency: Lower concurrentRequests in config
  3. Batch requests: Combine multiple tasks into single API calls
  4. Add delays: Space out cron jobs and scheduled tasks

Monitoring Usage

Check current usage:

# Via Anthropic console
# Visit console.anthropic.com → Usage

# Via OpenClaw logs
openclaw gateway logs | grep "tokens used"

# Set up cost alerts
openclaw config set claude.maxCostPerHour 5.0

Advanced Configuration

Extended Thinking Mode

Enable Claude's chain-of-thought reasoning:

claude:
  thinkingEnabled: true
  thinkingLevel: "medium"            # low, medium, high
  budgetTokens: 10000                # Max tokens for thinking

This makes Claude "think out loud" before responding, improving quality but increasing token usage (~20-40% more tokens).

Use for: Complex coding, strategic decisions, tricky reasoning problems Don't use for: Simple queries, routine tasks, real-time chat

System Prompt Customization

Override the default system prompt:

claude:
  systemPromptOverride: |
    You are ClawBot, a specialized AI assistant for DevOps automation.
    Focus on: infrastructure, deployment, monitoring, security.
    Be concise—output should be actionable commands, not explanations.

Warning: Overriding system prompt disables OpenClaw's default instructions. Only do this if you know what you're doing.

Response Format Control

Force JSON output for structured data:

claude:
  responseFormat: "json"             # Always return valid JSON

Useful for cron jobs that parse output programmatically.

Temperature and Sampling

Control randomness:

claude:
  temperature: 0.7                   # Lower = more focused
  topP: 0.9                          # Lower = less random
  topK: 50                           # Limit candidate tokens
  • Temperature 0.0: Completely deterministic (use for code, data extraction)
  • Temperature 1.0: Full creativity (use for writing, brainstorming)
  • TopP 0.9: Standard nucleus sampling (good default)

Troubleshooting

Error: "Invalid API key"

# Check key is set correctly
openclaw config get claude.apiKey

# Verify key format (should start with "sk-ant-api03-")
# If wrong, set again:
openclaw config set claude.apiKey "sk-ant-api03-your-key"

# Restart Gateway
openclaw gateway restart

Error: "Rate limit exceeded"

# Check current usage
# Visit console.anthropic.com → Usage

# Reduce concurrency
openclaw config set claude.concurrentRequests 2

# Add retry delays
openclaw config set claude.retryDelayMs 2000

# Request tier increase from Anthropic

Error: "Context too long"

# Check context size
openclaw chat --debug
# Look for "tokens sent: XXXX"

# If over 150k, trim memory files:
# - Reduce MEMORY.md to essential info only
# - Archive old daily logs
# - Shorten AGENTS.md rules

High API Costs

# Check usage breakdown
# console.anthropic.com → Usage → Detailed

# Identify expensive operations:
openclaw gateway logs | grep "cost:" | sort -n

# Switch models for high-volume tasks:
# Change cron jobs to Haiku
# Use Sonnet instead of Opus for routine work

Slow Responses

# Enable streaming for faster perceived speed
openclaw config set claude.streaming true

# Use faster model
openclaw config set claude.defaultModel "claude-haiku-4"

# Reduce maxTokens (limits output length)
openclaw config set claude.maxTokens 4096

Best Practices

1. Start Conservative

# Day 1 configuration
defaultModel: "claude-sonnet-4"      # Balanced choice
maxCostPerHour: 2.0                  # Low limit while learning
requestsPerMinute: 20                # Conservative rate

Increase limits as you understand usage patterns.

2. Monitor Regularly

Set up weekly check:

cron:
  - id: usage-report
    schedule: "0 9 * * 1"            # Monday 9 AM
    task: |
      Check Claude API usage for past week.
      Report total cost, requests, tokens used.
      Flag any unusual spikes.
    channel: email

3. Use Appropriate Models

# This is wasteful:
cronModel: "claude-opus-4"           # Opus for simple cron tasks

# This is smart:
mainSessionModel: "claude-opus-4"    # Opus where quality matters
cronModel: "claude-haiku-4"          # Haiku for automation

4. Implement Circuit Breakers

claude:
  maxCostPerHour: 10.0               # Halt if costs spike
  maxConsecutiveErrors: 5            # Stop if API failing repeatedly

Prevents runaway costs from bugs or API issues.

5. Version Your Configuration

# Keep config in git
cd ~/.openclaw
git init
git add config.yaml
git commit -m "Initial Claude API config"

Allows rollback if changes cause issues.

Security Best Practices

Never Hardcode API Keys

# Bad: API key in config file
claude:
  apiKey: "sk-ant-api03-abc123..."

# Good: API key in environment variable
claude:
  apiKey: "${CLAUDE_API_KEY}"

# Set in shell profile
echo 'export CLAUDE_API_KEY="sk-ant-api03-..."' >> ~/.bashrc

Rotate Keys Regularly

# Every 90 days:
1. Generate new key in Anthropic console
2. Update OpenClaw config
3. Restart Gateway
4. Delete old key from Anthropic console

Limit Key Permissions

If Anthropic supports scoped keys (check their docs):

  • Create read-only keys for monitoring tools
  • Create limited-scope keys for specific applications
  • Use full-access keys only for OpenClaw

Conclusion

Claude API is the engine that powers OpenClaw's intelligence. By understanding model selection, cost optimization, rate limiting, and configuration options, you can build a production-ready AI assistant that's both powerful and cost-effective.

Key takeaways:

  • Start with Sonnet 4 for balanced performance
  • Use Haiku for high-volume automation
  • Reserve Opus for complex reasoning tasks
  • Monitor costs and set budget alerts
  • Optimize context size to reduce token usage
  • Implement rate limiting and error handling

Ready to dive deeper? Check out our OpenClaw memory system guide and cron automation tutorial to build sophisticated autonomous workflows.

The combination of Claude's intelligence and OpenClaw's automation creates something greater than the sum of its parts—an AI assistant that truly works for you. 🤖💰