OpenClaw Model Selection and Cost Optimization Guide

Compare AI models for OpenClaw, understand pricing, and learn strategies to optimize costs while maintaining quality responses.

OpenClaw Guides

Tutorial Authors

Understanding AI Model Pricing

AI providers bill by the token, a chunk of text that averages roughly 4 characters or 0.75 words in English. You pay for both:

  • Input tokens: The text you send (your message + conversation history)
  • Output tokens: The text the model generates (responses)
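As a rough illustration of the rule of thumb above, the sketch below estimates input tokens from character counts. It is an approximation only: real tokenizers vary by model and language, and the function names `estimate_tokens` and `estimate_input_tokens` are hypothetical helpers, not part of OpenClaw.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate using the ~4 characters per token rule of thumb."""
    return round(len(text) / 4)


def estimate_input_tokens(message: str, history: list[str]) -> int:
    """Input tokens cover the new message plus every history entry sent with it."""
    return estimate_tokens(message) + sum(estimate_tokens(h) for h in history)


# A 40-character message sent with two earlier messages of 80 and 120 characters
print(estimate_input_tokens("a" * 40, ["b" * 80, "c" * 120]))  # 60
```

Note how the history contributes most of the input here, which is why the history limits discussed later matter.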

Available Models in OpenClaw

OpenClaw supports Anthropic's Claude models and other providers via OpenRouter. Here are the most commonly used models:

Claude Models (Anthropic)

| Model | Input Price (per 1M tokens) | Output Price (per 1M tokens) | Best For |
|---|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 | Quick tasks, high volume |
| Claude Sonnet 4 | $3.00 | $15.00 | Balanced performance |
| Claude Sonnet 4.5 | $3.00 | $15.00 | Enhanced reasoning |
| Claude Opus 4.6 | $5.00 | $25.00 | Most capable, research |

Prices as of early 2026. Check Anthropic's pricing page for current rates.

Cost Comparison Example

For a typical conversation with 1,000 input tokens and 500 output tokens:

| Model | Input Cost | Output Cost | Total |
|---|---|---|---|
| Claude Haiku 4.5 | $0.001 | $0.0025 | $0.0035 |
| Claude Sonnet 4 | $0.003 | $0.0075 | $0.0105 |
| Claude Opus 4.6 | $0.005 | $0.0125 | $0.0175 |

For this conversation, Haiku costs one-third as much as Sonnet and one-fifth as much as Opus.
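The comparison above can be reproduced with a few lines of arithmetic. This is a minimal sketch that hard-codes the prices from the table in this guide; the `PRICES` dictionary and `conversation_cost` function are illustrative names, and the rates should be replaced with current ones.

```python
# USD per 1M tokens as (input, output), copied from the pricing table in this guide
PRICES = {
    "Claude Haiku 4.5": (1.00, 5.00),
    "Claude Sonnet 4": (3.00, 15.00),
    "Claude Opus 4.6": (5.00, 25.00),
}


def conversation_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one exchange at the listed per-million-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for model in PRICES:
    print(f"{model}: ${conversation_cost(model, 1000, 500):.4f}")
```

Changing the token counts shows how the gap between models grows with longer responses, since output tokens cost five times as much as input tokens on every model listed.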

Choosing the Right Model

Use Claude Haiku 4.5 When:

  • Responding to simple questions
  • Quick lookups and facts
  • High-volume messaging (WhatsApp, Telegram)
  • Cost is a primary concern
  • Speed matters more than depth

Set it as your default model in ~/.openclaw/openclaw.json:

json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-haiku-4-5"
      }
    }
  }
}

Use Claude Sonnet 4 / 4.5 When:

  • Writing assistance needed
  • Code generation or review
  • Moderate complexity tasks
  • Balance of quality and cost
  • Most everyday use cases

To make Sonnet 4 the default:

json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4"
      }
    }
  }
}

Use Claude Opus 4.6 When:

  • Complex analysis required
  • Research and deep reasoning
  • Critical business decisions
  • Quality is paramount
  • Cost is not a concern

To make Opus 4.6 the default:

json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-opus-4-6"
      }
    }
  }
}

Model Routing by Channel

OpenClaw lets you assign different models to different channels using modelByChannel. This is the most effective way to optimize costs — use a cheaper model for casual channels and a more capable one for work:

json
// ~/.openclaw/openclaw.json
{
  "channels": {
    "modelByChannel": {
      "whatsapp": {
        "default": "anthropic/claude-haiku-4-5"
      },
      "telegram": {
        "default": "anthropic/claude-haiku-4-5"
      },
      "discord": {
        "default": "anthropic/claude-sonnet-4"
      },
      "slack": {
        "default": "anthropic/claude-sonnet-4"
      }
    }
  }
}
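To make the effect of this config concrete, here is a sketch of how a per-channel override might be resolved. The lookup order (channel override first, then the agent default) is an assumption made for illustration, and `resolve_model` is a hypothetical helper, not OpenClaw's actual resolver.

```python
import json

# Same shape as the config above, plus a default model (comments removed for strict JSON)
CONFIG = json.loads("""
{
  "agents": {"defaults": {"model": {"primary": "anthropic/claude-sonnet-4"}}},
  "channels": {
    "modelByChannel": {
      "whatsapp": {"default": "anthropic/claude-haiku-4-5"},
      "discord": {"default": "anthropic/claude-sonnet-4"}
    }
  }
}
""")


def resolve_model(config: dict, channel: str) -> str:
    """Assumed order: channel override if present, otherwise the default primary model."""
    by_channel = config.get("channels", {}).get("modelByChannel", {})
    override = by_channel.get(channel, {}).get("default")
    if override:
        return override
    return config["agents"]["defaults"]["model"]["primary"]


print(resolve_model(CONFIG, "whatsapp"))  # anthropic/claude-haiku-4-5
print(resolve_model(CONFIG, "signal"))    # anthropic/claude-sonnet-4
```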

You can also set up model failover — if the primary model is unavailable, OpenClaw will try the fallback models in order:

json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4",
        "fallbacks": [
          "anthropic/claude-haiku-4-5"
        ]
      }
    }
  }
}
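The "try in order" behavior described above can be pictured with the following sketch. It is not OpenClaw's code: `call_with_fallbacks` and `fake_send` are hypothetical names, and `fake_send` only simulates an unavailable primary model.

```python
from typing import Callable


def call_with_fallbacks(primary: str, fallbacks: list[str],
                        send: Callable[[str], str]) -> tuple[str, str]:
    """Try the primary model, then each fallback in order; return (model, reply)."""
    last_error = None
    for model in [primary, *fallbacks]:
        try:
            return model, send(model)
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
    raise RuntimeError("all configured models failed") from last_error


def fake_send(model: str) -> str:
    """Hypothetical stand-in for an API call; pretends Sonnet is down."""
    if model == "anthropic/claude-sonnet-4":
        raise ConnectionError("model unavailable")
    return f"reply from {model}"


used_model, reply = call_with_fallbacks(
    "anthropic/claude-sonnet-4", ["anthropic/claude-haiku-4-5"], fake_send
)
print(used_model)  # anthropic/claude-haiku-4-5
```

Listing a cheaper model as the fallback, as in the config above, also means an outage lowers your bill rather than raising it.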

Community Auto-Routers

For even smarter routing based on message complexity, community projects like iblai-openclaw-router and ClawRoute can automatically route simple queries to Haiku and complex ones to Sonnet or Opus.

Cost Optimization Strategies

1. Limit Conversation History

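Conversation history adds up fast because it is resent as input tokens with every request. Limit how much is sent using historyLimit (group chats and channels) and dmHistoryLimit (direct messages):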

json
// ~/.openclaw/openclaw.json
{
  "messages": {
    "groupChat": {
      "historyLimit": 10
    }
  },
  "channels": {
    "whatsapp": {
      "dmHistoryLimit": 8
    },
    "telegram": {
      "dmHistoryLimit": 8
    },
    "discord": {
      "historyLimit": 15
    }
  }
}

Keep casual channels (WhatsApp, Telegram) at 8-10 messages and work channels (Discord, Slack) at 15-20. This significantly reduces input tokens per request.
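To see why a history limit cuts input tokens, consider this sketch. It assumes a limit simply keeps the most recent messages and that each message is about 50 tokens (200 characters); both are illustrative assumptions, and `trim_history` is a hypothetical helper rather than OpenClaw's implementation.

```python
def trim_history(history: list[str], limit: int) -> list[str]:
    """Keep only the most recent `limit` messages (none if limit is 0 or less)."""
    return history[-limit:] if limit > 0 else []


def estimate_tokens(text: str) -> int:
    """Rough estimate using the ~4 characters per token rule of thumb."""
    return round(len(text) / 4)


# 40 earlier messages of 200 characters (about 50 tokens) each
history = ["x" * 200 for _ in range(40)]

full = sum(estimate_tokens(m) for m in history)
trimmed = sum(estimate_tokens(m) for m in trim_history(history, 8))
print(full, trimmed)  # 2000 400
```

Under these assumptions, a limit of 8 sends one-fifth of the history tokens on every request.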

2. Use Anthropic's Prompt Caching

Anthropic offers prompt caching that dramatically reduces costs for repeated context (system prompts, conversation prefixes):

| Operation | Cost Multiplier |
|---|---|
| Cache write (first request) | 1.25x base input price |
| Cache read (subsequent) | 0.1x base input price |

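That's a 90% discount on cached content after the first request. If your system prompt is 1,000 tokens and you send 50 messages, you pay 1.25x the base input price once (the cache write) and 0.1x for the other 49, provided each request arrives while the cache entry is still live.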

OpenClaw supports this when your API provider has caching enabled. Check your Anthropic dashboard for "cache read tokens" in your usage.
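The 1,000-token, 50-message example can be worked through numerically. The sketch below applies the multipliers from the table above (1.25x for the write, 0.1x for reads) and assumes every request after the first hits the cache, which is the best case; `system_prompt_cost` is an illustrative name.

```python
def system_prompt_cost(prompt_tokens: int, requests: int,
                       input_price_per_m: float, cached: bool) -> float:
    """USD spent on the system prompt alone across `requests` requests."""
    if requests <= 0:
        return 0.0
    base = prompt_tokens * input_price_per_m / 1_000_000
    if not cached:
        return base * requests
    # One cache write at 1.25x, then cache reads at 0.1x (assumes no cache expiry)
    return base * 1.25 + base * 0.1 * (requests - 1)


uncached = system_prompt_cost(1000, 50, 3.0, cached=False)  # Sonnet 4 input price
cached = system_prompt_cost(1000, 50, 3.0, cached=True)
print(f"uncached ${uncached:.5f}, cached ${cached:.5f}")
```

At the Sonnet 4 input price this comes to $0.15 uncached versus about $0.018 cached, roughly 88% less for the system prompt portion. The overall saving is slightly below 90% because the first request costs 25% more than an uncached one.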

3. Choose a Cheaper Default Model

The simplest optimization: switch your default from Sonnet to Haiku. For most daily interactions (quick Q&A, lookups, reminders, casual conversation), Haiku is usually good enough at a fraction of the cost.

Use modelByChannel to keep Sonnet or Opus on specific work channels where quality matters most.

4. Use Model Catalog with Aliases

Define a model catalog to manage multiple models easily and keep track of costs:

json
{
  "agents": {
    "defaults": {
      "models": {
        "anthropic/claude-haiku-4-5": {
          "alias": "fast",
          "cost": { "input": 1.0, "output": 5.0 }
        },
        "anthropic/claude-sonnet-4": {
          "alias": "balanced",
          "cost": { "input": 3.0, "output": 15.0 }
        },
        "anthropic/claude-opus-4-6": {
          "alias": "powerful",
          "cost": { "input": 5.0, "output": 25.0 }
        }
      }
    }
  }
}
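One way to use such a catalog outside OpenClaw is to look up a model and its prices by alias, for example in a budgeting script. This sketch assumes the `cost` values are USD per 1M tokens (they match the pricing table in this guide); `find_by_alias` and `cost_for_alias` are hypothetical helpers.

```python
# Mirrors the "models" section of the config above
CATALOG = {
    "anthropic/claude-haiku-4-5": {"alias": "fast", "cost": {"input": 1.0, "output": 5.0}},
    "anthropic/claude-sonnet-4": {"alias": "balanced", "cost": {"input": 3.0, "output": 15.0}},
    "anthropic/claude-opus-4-6": {"alias": "powerful", "cost": {"input": 5.0, "output": 25.0}},
}


def find_by_alias(catalog: dict, alias: str) -> tuple[str, dict]:
    """Return (model_id, entry) for the first catalog entry with this alias."""
    for model_id, entry in catalog.items():
        if entry.get("alias") == alias:
            return model_id, entry
    raise KeyError(f"no model with alias {alias!r}")


def cost_for_alias(catalog: dict, alias: str,
                   input_tokens: int, output_tokens: int) -> float:
    """USD cost of one exchange, assuming catalog prices are per 1M tokens."""
    _, entry = find_by_alias(catalog, alias)
    cost = entry["cost"]
    return (input_tokens * cost["input"] + output_tokens * cost["output"]) / 1_000_000


print(find_by_alias(CATALOG, "fast")[0])  # anthropic/claude-haiku-4-5
```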

5. Use Batch API for Non-Urgent Tasks

If you have non-time-sensitive workloads, Anthropic's Batch API offers a 50% discount on both input and output tokens:

| Model | Standard Output (per 1M tokens) | Batch Output (per 1M tokens) |
|---|---|---|
| Claude Haiku 4.5 | $5.00 | $2.50 |
| Claude Sonnet 4 | $15.00 | $7.50 |
| Claude Opus 4.6 | $25.00 | $12.50 |

Monitoring Costs

Real-Time Usage

bash
# View current usage
openclaw stats

# Output:
# Today's Usage:
#   Input tokens:  45,230
#   Output tokens: 12,450
#   Estimated cost: $0.23
#
# This month:
#   Total tokens: 1,234,567
#   Estimated cost: $8.45

Detailed Reports

bash
# Generate cost report
openclaw stats --report monthly

# Export to CSV
openclaw stats --export costs.csv --period 30d

Anthropic Console Limits

Set monthly spend caps directly in the Anthropic Console to prevent surprise bills. This is the most reliable way to enforce a budget — if you hit the cap, the API stops accepting requests.

Cost-Saving Configuration Templates

Budget-Conscious Setup

json
// Optimized for minimal cost
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-haiku-4-5"
      }
    }
  },
  "messages": {
    "groupChat": {
      "historyLimit": 5
    }
  }
}

Estimated cost: ~$5-10/month with moderate use

Balanced Setup

json
// Good balance of quality and cost
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4",
        "fallbacks": ["anthropic/claude-haiku-4-5"]
      }
    }
  },
  "channels": {
    "modelByChannel": {
      "whatsapp": { "default": "anthropic/claude-haiku-4-5" },
      "telegram": { "default": "anthropic/claude-haiku-4-5" }
    }
  },
  "messages": {
    "groupChat": {
      "historyLimit": 15
    }
  }
}

Estimated cost: ~$15-25/month with moderate use

Quality-First Setup

json
// Maximum quality, cost secondary
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4",
        "fallbacks": ["anthropic/claude-opus-4-6"]
      }
    }
  },
  "channels": {
    "modelByChannel": {
      "discord": { "default": "anthropic/claude-opus-4-6" },
      "slack": { "default": "anthropic/claude-opus-4-6" }
    }
  },
  "messages": {
    "groupChat": {
      "historyLimit": 25
    }
  }
}

Estimated cost: ~$30-50/month with moderate use

Multi-Provider Setup (Advanced)

Use multiple providers via OpenRouter to optimize costs:

json
// ~/.openclaw/openclaw.json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4",
        "fallbacks": [
          "openrouter/google/gemini-2.5-flash",
          "anthropic/claude-haiku-4-5"
        ]
      }
    }
  }
}

OpenClaw uses the format provider/model for model IDs. For OpenRouter models, use openrouter/author/slug format.

Tips for Reducing Costs

  1. Be specific in prompts - Vague prompts lead to longer responses
  2. Use system prompts wisely - Keep them concise
  3. Limit history per channel - Casual channels don't need 20 messages of context
  4. Enable prompt caching - 90% savings on repeated context
  5. Monitor usage weekly - Catch unexpected spikes early with openclaw stats
  6. Use Haiku for testing - Switch to Sonnet for production

Calculating Your Expected Costs

Use this formula:

Monthly Cost = ((Daily Messages × Avg Input Tokens × Input Price) +
                (Daily Messages × Avg Output Tokens × Output Price)) × 30

Example for 50 messages/day with Claude Sonnet 4:

  • Avg input: 800 tokens
  • Avg output: 400 tokens
Input:  50 × 800 × ($3/1M) × 30 = $3.60
Output: 50 × 400 × ($15/1M) × 30 = $9.00
Total: ~$12.60/month

With half your messages routed to Haiku 4.5:

Haiku portion:  25 × 800 × ($1/1M) × 30 + 25 × 400 × ($5/1M) × 30 = $0.60 + $1.50 = $2.10
Sonnet portion: 25 × 800 × ($3/1M) × 30 + 25 × 400 × ($15/1M) × 30 = $1.80 + $4.50 = $6.30
Total: ~$8.40/month

That's a 33% saving just from model routing — and even more with prompt caching on top.
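The same arithmetic can be wrapped in a small function so you can plug in your own volumes. This is a sketch using the prices and message counts from the example above; `monthly_cost` is an illustrative name, and it assumes a 30-day month with prices quoted per 1M tokens.

```python
def monthly_cost(daily_messages: int, avg_input_tokens: int, avg_output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float,
                 days: int = 30) -> float:
    """Estimated monthly spend in USD for one model."""
    per_message = (avg_input_tokens * input_price_per_m
                   + avg_output_tokens * output_price_per_m) / 1_000_000
    return daily_messages * per_message * days


# All 50 daily messages on Sonnet 4
sonnet_only = monthly_cost(50, 800, 400, 3.0, 15.0)

# Half routed to Haiku 4.5, half kept on Sonnet 4
mixed = monthly_cost(25, 800, 400, 1.0, 5.0) + monthly_cost(25, 800, 400, 3.0, 15.0)

saving = 1 - mixed / sonnet_only
print(f"${sonnet_only:.2f} vs ${mixed:.2f} ({saving:.0%} saving)")
```

Adjusting the split between models shows how much routing alone can save before any caching or history limits are applied.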

Next Steps