Claude Code is Anthropic’s official CLI tool for interacting with Claude models directly from your terminal. By default, it connects to Anthropic’s API, but sometimes you may want to use a different LLM provider — for cost savings, privacy, or to experiment with alternative models.

In this guide, I’ll show how to redirect Claude Code requests through a LiteLLM proxy, allowing you to swap in any compatible model while keeping the Claude Code interface you’re familiar with.

Why Use a Different Model?

There are several reasons you might want to route Claude Code through an alternative provider:

  • Cost optimization — some providers offer competitive models at lower prices
  • Regional availability — access models from providers available in your region
  • Experimentation — test how different LLMs handle coding tasks within the Claude Code workflow
  • Corporate restrictions — your organization may require using a specific API endpoint

Prerequisites

Before starting, make sure you have:

  • Claude Code installed (npm install -g @anthropic-ai/claude-code)
  • Python 3.8+ with pip
  • LiteLLM installed (pip install litellm)
  • An API key from your target LLM provider
  • screen or tmux for running the proxy in the background (optional but recommended)
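
You can quickly confirm which of these tools are already on your PATH. A minimal check (the command names are the ones installed by the packages above):

```shell
# Report which required tools are installed
for cmd in claude litellm screen; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: missing"
  fi
done
```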

How It Works

The idea is simple: LiteLLM acts as a local proxy server that translates Anthropic API calls into the format required by your target provider. Claude Code thinks it’s talking to the Anthropic API, but the requests are actually routed to whichever model you configure.

Claude Code  →  LiteLLM Proxy (localhost:4000)  →  Target LLM Provider API

Step 0: Get an API Key from Cloud.ru

If you’re following this guide with Cloud.ru Foundation Models, you’ll need to create an API key first.

Register and access the console

  1. Go to cloud.ru and create an account (or log in if you already have one)
  2. Once inside the console, click the nine-dot menu icon in the upper left corner
  3. Confirm that Foundation Models appears under the AI Factory section

Create a Service Account

API keys in Cloud.ru are tied to service accounts, not to your personal account:

  1. Navigate to Users → Service Accounts
  2. Click “Create Account” in the upper right corner
  3. Enter a name (e.g., litellm-proxy) and an optional description
  4. Assign the “Project Administrator” role
  5. Click “Create”

Generate an API Key

  1. Go back to Users → Service Accounts
  2. Click on the service account you just created
  3. In the “Access Credentials” section, click “Create API Key”
  4. Fill in the parameters:
    • Services — select Foundation Models
    • Validity Period — set between 1 day and 1 year (90 days is a good default)
  5. Click “Create”

Important: Save the Key Secret immediately — after closing the window, you won’t be able to retrieve it again. This is the value you’ll use as OPENAI_API_KEY in the next step.

The created API key will appear in the list with an “Active” status. You’re ready to proceed.
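
As an optional sanity check, you can try the key against the endpoint before wiring up the proxy. This sketch assumes the Cloud.ru endpoint is OpenAI-compatible and exposes the standard /v1/models listing (an assumption, not something from the Cloud.ru docs):

```shell
# Replace with the Key Secret you saved above
export OPENAI_API_KEY="your-key-secret-here"

# List available models; a JSON response (even an auth error) means the
# endpoint is reachable
curl -s --max-time 10 https://foundation-models.api.cloud.ru/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY" || echo "request failed"
```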

Step 1: Start the LiteLLM Proxy

First, start a LiteLLM proxy server that will translate requests to your target model. I recommend running it in a screen session so it persists in the background.

# Create a new screen session for the proxy
screen -S proxy

# Set your provider's API key
export OPENAI_API_KEY="your-api-key-here"

# Start LiteLLM proxy
litellm \
  --model openai/zai-org/GLM-4.7 \
  --api_base https://foundation-models.api.cloud.ru/v1 \
  --drop_params \
  --alias "claude-3-5-sonnet-latest:openai/zai-org/GLM-4.7"

Key flags explained:

  • --model — the target model in provider/model-name format
  • --api_base — the base URL of your target provider’s API
  • --drop_params — silently drops unsupported parameters instead of raising errors (important, since Claude Code sends Anthropic-specific params)
  • --alias — maps the model name Claude Code requests to your actual target model

Once the proxy is running, detach from the screen session:

  • Press Ctrl+A, then D to detach
  • To reattach later: screen -r proxy
  • To list sessions: screen -ls
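
If you prefer a declarative setup over CLI flags, the same proxy can be described in a LiteLLM config file. A sketch equivalent to the command above (model_name plays the role of the --alias, and the os.environ/ syntax tells LiteLLM to read the key from the environment):

```yaml
# config.yaml — start with: litellm --config config.yaml
model_list:
  - model_name: claude-3-5-sonnet-latest
    litellm_params:
      model: openai/zai-org/GLM-4.7
      api_base: https://foundation-models.api.cloud.ru/v1
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  drop_params: true
```

This form is easier to version-control and extend with multiple models later.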

Step 2: Configure Claude Code

Create or edit the Claude Code project settings file to point at your local proxy:

nano .claude/settings.json

Add the following configuration:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:4000",
    "ANTHROPIC_AUTH_TOKEN": "any-placeholder-value",
    "ANTHROPIC_MODEL": "claude-3-5-sonnet-latest"
  }
}

Configuration breakdown:

  • ANTHROPIC_BASE_URL — points Claude Code to your local LiteLLM proxy instead of the Anthropic API
  • ANTHROPIC_AUTH_TOKEN — a placeholder token (the real authentication happens between LiteLLM and your provider via the OPENAI_API_KEY env variable)
  • ANTHROPIC_MODEL — the model name Claude Code will request; this should match the alias you set in LiteLLM

Note: The .claude/settings.json file is project-scoped. If you want this configuration to apply globally, place it in ~/.claude/settings.json instead.
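
As a non-interactive alternative to editing the file by hand, you can write and validate the settings in one step (localhost here resolves to the same local proxy):

```shell
# Write the project-scoped Claude Code settings
mkdir -p .claude
cat > .claude/settings.json <<'EOF'
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:4000",
    "ANTHROPIC_AUTH_TOKEN": "any-placeholder-value",
    "ANTHROPIC_MODEL": "claude-3-5-sonnet-latest"
  }
}
EOF

# Confirm the file is valid JSON
python3 -m json.tool .claude/settings.json
```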

Step 3: Launch Claude Code

Now start Claude Code as usual:

claude

Claude Code will send requests to localhost:4000, where LiteLLM will translate them and forward to your configured provider.
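
Before launching Claude Code, you can exercise the same path by hand. This sketch assumes the proxy from Step 1 is running and that your LiteLLM version exposes an Anthropic-style /v1/messages endpoint (recent versions do):

```shell
# Probe the proxy first, then send an Anthropic-format request through it
if curl -s -o /dev/null --max-time 2 http://localhost:4000; then
  curl -s http://localhost:4000/v1/messages \
    -H "content-type: application/json" \
    -H "x-api-key: any-placeholder-value" \
    -H "anthropic-version: 2023-06-01" \
    -d '{"model":"claude-3-5-sonnet-latest","max_tokens":64,"messages":[{"role":"user","content":"Reply with OK"}]}'
else
  echo "proxy not reachable on localhost:4000"
fi
```

A JSON response here means the full Claude Code → LiteLLM → provider chain is working.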

Example: Using Cloud.ru GLM-4.7

In this guide, I used Cloud.ru Foundation Models API with the GLM-4.7 model as an example. The setup connects Claude Code to a Chinese-developed model hosted on Russian cloud infrastructure — demonstrating how LiteLLM can bridge completely different API ecosystems.

Troubleshooting

Proxy won’t start:

  • Check that port 4000 is not already in use: lsof -i :4000
  • Verify your API key is correctly set in the environment

Claude Code can’t connect:

  • Ensure the LiteLLM proxy is running: screen -r proxy
  • Check that ANTHROPIC_BASE_URL uses http://, not https://

Model errors or unexpected behavior:

  • The --drop_params flag is essential — without it, Anthropic-specific parameters will cause errors on other providers
  • Some models may not support all features Claude Code uses (tool use, extended thinking, etc.)
  • Check LiteLLM logs in the screen session for detailed error messages

Limitations

Keep in mind that alternative models may not fully replicate the Claude experience:

  • Tool use / function calling — not all models support this the same way
  • Extended thinking — this is a Claude-specific feature
  • Context window — different models have different token limits
  • Code quality — results will vary depending on the model’s coding capabilities