Claude Code is Anthropic’s official CLI tool for interacting with Claude models directly from your terminal. By default, it connects to Anthropic’s API, but sometimes you may want to use a different LLM provider — for cost savings, privacy, or to experiment with alternative models.
In this guide, I’ll show how to redirect Claude Code requests through a LiteLLM proxy, allowing you to swap in any compatible model while keeping the Claude Code interface you’re familiar with.
## Why Use a Different Model?
There are several reasons you might want to route Claude Code through an alternative provider:
- Cost optimization — some providers offer competitive models at lower prices
- Regional availability — access models from providers available in your region
- Experimentation — test how different LLMs handle coding tasks within the Claude Code workflow
- Corporate restrictions — your organization may require using a specific API endpoint
## Prerequisites
Before starting, make sure you have:
- Claude Code installed (`npm install -g @anthropic-ai/claude-code`)
- Python 3.8+ with pip
- LiteLLM installed (`pip install litellm`)
- An API key from your target LLM provider
- screen or tmux for running the proxy in the background (optional but recommended)
## How It Works
The idea is simple: LiteLLM acts as a local proxy server that translates Anthropic API calls into the format required by your target provider. Claude Code thinks it’s talking to the Anthropic API, but the requests are actually routed to whichever model you configure.
```
Claude Code → LiteLLM Proxy (localhost:4000) → Target LLM Provider API
```
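To make the translation concrete, here is a simplified sketch of what the proxy does conceptually. This is illustrative only, not LiteLLM's actual implementation: it maps an Anthropic Messages API payload onto the OpenAI chat-completions shape and silently drops Anthropic-only parameters (the job LiteLLM's `--drop_params` flag performs):

```python
# Illustrative sketch: convert an Anthropic-style request body to an
# OpenAI-style one, dropping parameters the target provider won't accept.
# LiteLLM's real translation layer handles far more (streaming, tools, etc.).

def anthropic_to_openai(payload: dict) -> dict:
    """Convert an Anthropic Messages API body to an OpenAI chat body."""
    messages = list(payload.get("messages", []))
    # Anthropic carries the system prompt as a top-level "system" field;
    # OpenAI expects it as the first message in the list.
    if "system" in payload:
        messages.insert(0, {"role": "system", "content": payload["system"]})
    converted = {
        "model": payload["model"],
        "messages": messages,
        "max_tokens": payload.get("max_tokens"),
    }
    # Copy over shared parameters; anything else (e.g. Anthropic-only
    # "top_k", "metadata") is simply not carried across.
    for key in ("temperature", "top_p", "stream"):
        if key in payload:
            converted[key] = payload[key]
    return converted

request = {
    "model": "claude-3-5-sonnet-latest",
    "system": "You are a coding assistant.",
    "messages": [{"role": "user", "content": "Write a hello world in Go."}],
    "max_tokens": 1024,
    "top_k": 40,  # Anthropic-only; a strict OpenAI endpoint would reject this
}
print(anthropic_to_openai(request)["messages"][0]["role"])  # prints: system
```

The real proxy also translates responses back into the Anthropic format, so Claude Code never notices the difference.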
## Step 0: Get an API Key from Cloud.ru
If you’re following this guide with Cloud.ru Foundation Models, you’ll need to create an API key first.
### Register and access the console
- Go to cloud.ru and create an account (or log in if you already have one)
- Once inside the console, click the nine-dot menu icon in the upper left corner
- Confirm that Foundation Models appears under the AI Factory section
### Create a Service Account
API keys in Cloud.ru are tied to service accounts, not to your personal account:
- Navigate to Users → Service Accounts
- Click “Create Account” in the upper right corner
- Enter a name (e.g., `litellm-proxy`) and an optional description
- Assign the “Project Administrator” role
- Click “Create”
### Generate an API Key
- Go back to Users → Service Accounts
- Click on the service account you just created
- In the “Access Credentials” section, click “Create API Key”
- Fill in the parameters:
  - Services — select Foundation Models
  - Validity Period — set between 1 day and 1 year (90 days is a good default)
- Click “Create”
**Important:** Save the Key Secret immediately — after closing the window, you won’t be able to retrieve it again. This is the value you’ll use as `OPENAI_API_KEY` in the next step.
The created API key will appear in the list with an “Active” status. You’re ready to proceed.
## Step 1: Start the LiteLLM Proxy
First, start a LiteLLM proxy server that will translate requests to your target model. I recommend running it in a screen session so it persists in the background.
```bash
# Create a new screen session for the proxy
screen -S proxy

# Set your provider's API key
export OPENAI_API_KEY="your-api-key-here"

# Start LiteLLM proxy
litellm \
  --model openai/zai-org/GLM-4.7 \
  --api_base https://foundation-models.api.cloud.ru/v1 \
  --drop_params \
  --alias "claude-3-5-sonnet-latest:openai/zai-org/GLM-4.7"
```
Key flags explained:
| Flag | Description |
|---|---|
| `--model` | The target model in `provider/model-name` format |
| `--api_base` | The base URL of your target provider’s API |
| `--drop_params` | Drops unsupported parameters silently instead of throwing errors (important since Claude Code sends Anthropic-specific params) |
| `--alias` | Maps the model name Claude Code requests to your actual target model |
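If you would rather version-control this setup than retype a long command line, the same configuration can be expressed as a LiteLLM `config.yaml`, where the alias becomes the `model_name` entry. A sketch (the file name and the use of `OPENAI_API_KEY` mirror the setup above):

```yaml
model_list:
  - model_name: claude-3-5-sonnet-latest      # name Claude Code will request
    litellm_params:
      model: openai/zai-org/GLM-4.7           # actual target model
      api_base: https://foundation-models.api.cloud.ru/v1
      api_key: os.environ/OPENAI_API_KEY      # read the key from the environment
litellm_settings:
  drop_params: true
```

Start the proxy with `litellm --config config.yaml` instead of the flag-based command.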
Once the proxy is running, detach from the screen session:
- Press `Ctrl+A`, then `D` to detach
- To reattach later: `screen -r proxy`
- To list sessions: `screen -ls`
## Step 2: Configure Claude Code
Create or edit the Claude Code project settings file to point at your local proxy:
```bash
nano .claude/settings.json
```
Add the following configuration:
```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://0.0.0.0:4000",
    "ANTHROPIC_AUTH_TOKEN": "any-placeholder-value",
    "ANTHROPIC_MODEL": "claude-3-5-sonnet-latest"
  }
}
```
Configuration breakdown:
- `ANTHROPIC_BASE_URL` — points Claude Code to your local LiteLLM proxy instead of the Anthropic API
- `ANTHROPIC_AUTH_TOKEN` — a placeholder token (the real authentication happens between LiteLLM and your provider via the `OPENAI_API_KEY` env variable)
- `ANTHROPIC_MODEL` — the model name Claude Code will request; this should match the alias you set in LiteLLM
**Note:** The `.claude/settings.json` file is project-scoped. If you want this configuration to apply globally, place it in `~/.claude/settings.json` instead.
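Claude Code also picks these settings up from environment variables, so if you prefer not to create a settings file at all, you can export the same values in your shell before launching it. A sketch (values match the proxy setup above):

```shell
# Same configuration as .claude/settings.json, via environment variables
export ANTHROPIC_BASE_URL="http://localhost:4000"
export ANTHROPIC_AUTH_TOKEN="any-placeholder-value"
export ANTHROPIC_MODEL="claude-3-5-sonnet-latest"
```

Then run `claude` from that same shell session.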
## Step 3: Launch Claude Code
Now start Claude Code as usual:
```bash
claude
```
Claude Code will send requests to localhost:4000, where LiteLLM will translate them and forward to your configured provider.
## Example: Using Cloud.ru GLM-4.7
In this guide, I used Cloud.ru Foundation Models API with the GLM-4.7 model as an example. The setup connects Claude Code to a Chinese-developed model hosted on Russian cloud infrastructure — demonstrating how LiteLLM can bridge completely different API ecosystems.
## Troubleshooting
**Proxy won’t start:**
- Check that port 4000 is not already in use: `lsof -i :4000`
- Verify your API key is correctly set in the environment
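If `lsof` isn’t available on your system, a small Python check (a hypothetical helper, not part of LiteLLM) works cross-platform:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on the given port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on a successful connection, i.e. port taken
        return s.connect_ex((host, port)) == 0

if port_in_use(4000):
    print("Port 4000 is taken; stop the other process or start litellm with --port")
```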
**Claude Code can’t connect:**
- Ensure the LiteLLM proxy is running: `screen -r proxy`
- Check that `ANTHROPIC_BASE_URL` uses `http://`, not `https://`
**Model errors or unexpected behavior:**
- The `--drop_params` flag is essential — without it, Anthropic-specific parameters will cause errors on other providers
- Some models may not support all features Claude Code uses (tool use, extended thinking, etc.)
- Check LiteLLM logs in the screen session for detailed error messages
## Limitations
Keep in mind that alternative models may not fully replicate the Claude experience:
- Tool use / function calling — not all models support this the same way
- Extended thinking — this is a Claude-specific feature
- Context window — different models have different token limits
- Code quality — results will vary depending on the model’s coding capabilities