Claude Code is Anthropic’s official CLI tool for interacting with Claude models directly from your terminal. By default, it connects to Anthropic’s API, but sometimes you may want to use a different LLM provider — for cost savings, privacy, or to experiment with alternative models.

In this guide, I’ll show how to redirect Claude Code requests through a LiteLLM proxy, allowing you to swap in any compatible model while keeping the Claude Code interface you’re familiar with.

Why Use a Different Model?

There are several reasons you might want to route Claude Code through an alternative provider:

  • Cost optimization — some providers offer competitive models at lower prices
  • Regional availability — access models from providers available in your region
  • Experimentation — test how different LLMs handle coding tasks within the Claude Code workflow
  • Corporate restrictions — your organization may require using a specific API endpoint

Prerequisites

Before starting, make sure you have:

  • Claude Code installed (npm install -g @anthropic-ai/claude-code)
  • Python 3.8+ with pip
  • LiteLLM installed (pip install litellm)
  • An API key from your target LLM provider
  • screen or tmux for running the proxy in the background (optional but recommended)
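
You can quickly confirm which of these tools are already on your PATH. A minimal check (the command names are the ones installed by the packages above):

```shell
# Report which required tools are installed
for cmd in claude litellm screen; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: missing"
  fi
done
```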

How It Works

The idea is simple: LiteLLM acts as a local proxy server that translates Anthropic API calls into the format required by your target provider. Claude Code thinks it’s talking to the Anthropic API, but the requests are actually routed to whichever model you configure.

Claude Code  →  LiteLLM Proxy (localhost:4000)  →  Target LLM Provider API

Step 0: Get an API Key from Cloud.ru

If you’re following this guide with Cloud.ru Foundation Models, you’ll need to create an API key first.

Register and access the console

  1. Go to cloud.ru and create an account (or log in if you already have one)
  2. Once inside the console, click the nine-dot menu icon in the upper left corner
  3. Confirm that Foundation Models appears under the AI Factory section

Create a Service Account

API keys in Cloud.ru are tied to service accounts, not to your personal account:

  1. Navigate to Users → Service Accounts
  2. Click “Create Account” in the upper right corner
  3. Enter a name (e.g., litellm-proxy) and an optional description
  4. Assign the “Project Administrator” role
  5. Click “Create”

Generate an API Key

  1. Go back to Users → Service Accounts
  2. Click on the service account you just created
  3. In the “Access Credentials” section, click “Create API Key”
  4. Fill in the parameters:
    • Services — select Foundation Models
    • Validity Period — set between 1 day and 1 year (90 days is a good default)
  5. Click “Create”

Important: Save the Key Secret immediately — after closing the window, you won’t be able to retrieve it again. This is the value you’ll use as OPENAI_API_KEY in the next step.

The created API key will appear in the list with an “Active” status. You’re ready to proceed.
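
As an optional sanity check, you can try the key against the endpoint before wiring up the proxy. This sketch assumes the Cloud.ru endpoint is OpenAI-compatible and exposes the standard /v1/models listing (an assumption, not something from the Cloud.ru docs):

```shell
# Replace with the Key Secret you saved above
export OPENAI_API_KEY="your-key-secret-here"

# List available models; a JSON response (even an auth error) means the
# endpoint is reachable
curl -s --max-time 10 https://foundation-models.api.cloud.ru/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY" || echo "request failed"
```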

Step 1: Start the LiteLLM Proxy

First, start a LiteLLM proxy server that will translate requests to your target model. I recommend running it in a screen session so it persists in the background.

# Create a new screen session for the proxy
screen -S proxy

# Set your provider's API key
export OPENAI_API_KEY="your-api-key-here"

# Start LiteLLM proxy
litellm \
  --model openai/zai-org/GLM-4.7 \
  --api_base https://foundation-models.api.cloud.ru/v1 \
  --drop_params \
  --alias "claude-3-5-sonnet-latest:openai/zai-org/GLM-4.7"

Key flags explained:

  • --model — the target model in provider/model-name format
  • --api_base — the base URL of your target provider’s API
  • --drop_params — silently drops unsupported parameters instead of raising errors (important, since Claude Code sends Anthropic-specific params)
  • --alias — maps the model name Claude Code requests to your actual target model

Once the proxy is running, detach from the screen session:

  • Press Ctrl+A, then D to detach
  • To reattach later: screen -r proxy
  • To list sessions: screen -ls
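
If you prefer a declarative setup over CLI flags, the same proxy can be described in a LiteLLM config file. A sketch equivalent to the command above (model_name plays the role of the --alias, and the os.environ/ syntax tells LiteLLM to read the key from the environment):

```yaml
# config.yaml — start with: litellm --config config.yaml
model_list:
  - model_name: claude-3-5-sonnet-latest
    litellm_params:
      model: openai/zai-org/GLM-4.7
      api_base: https://foundation-models.api.cloud.ru/v1
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  drop_params: true
```

This form is easier to version-control and extend with multiple models later.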

Step 2: Configure Claude Code

Create or edit the Claude Code project settings file to point at your local proxy:

nano .claude/settings.json

Add the following configuration:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:4000",
    "ANTHROPIC_AUTH_TOKEN": "any-placeholder-value",
    "ANTHROPIC_MODEL": "claude-3-5-sonnet-latest"
  }
}

Configuration breakdown:

  • ANTHROPIC_BASE_URL — points Claude Code to your local LiteLLM proxy instead of the Anthropic API
  • ANTHROPIC_AUTH_TOKEN — a placeholder token (the real authentication happens between LiteLLM and your provider via the OPENAI_API_KEY env variable)
  • ANTHROPIC_MODEL — the model name Claude Code will request; this should match the alias you set in LiteLLM

Note: The .claude/settings.json file is project-scoped. If you want this configuration to apply globally, place it in ~/.claude/settings.json instead.
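
As a non-interactive alternative to editing the file by hand, you can write and validate the settings in one step (localhost here resolves to the same local proxy):

```shell
# Write the project-scoped Claude Code settings
mkdir -p .claude
cat > .claude/settings.json <<'EOF'
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:4000",
    "ANTHROPIC_AUTH_TOKEN": "any-placeholder-value",
    "ANTHROPIC_MODEL": "claude-3-5-sonnet-latest"
  }
}
EOF

# Confirm the file is valid JSON
python3 -m json.tool .claude/settings.json
```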

Step 3: Launch Claude Code

Now start Claude Code as usual:

claude

Claude Code will send requests to localhost:4000, where LiteLLM will translate them and forward to your configured provider.
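
Before launching Claude Code, you can exercise the same path by hand. This sketch assumes the proxy from Step 1 is running and that your LiteLLM version exposes an Anthropic-style /v1/messages endpoint (recent versions do):

```shell
# Probe the proxy first, then send an Anthropic-format request through it
if curl -s -o /dev/null --max-time 2 http://localhost:4000; then
  curl -s http://localhost:4000/v1/messages \
    -H "content-type: application/json" \
    -H "x-api-key: any-placeholder-value" \
    -H "anthropic-version: 2023-06-01" \
    -d '{"model":"claude-3-5-sonnet-latest","max_tokens":64,"messages":[{"role":"user","content":"Reply with OK"}]}'
else
  echo "proxy not reachable on localhost:4000"
fi
```

A JSON response here means the full Claude Code → LiteLLM → provider chain is working.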

Example: Using Cloud.ru GLM-4.7

In this guide, I used Cloud.ru Foundation Models API with the GLM-4.7 model as an example. The setup connects Claude Code to a Chinese-developed model hosted on Russian cloud infrastructure — demonstrating how LiteLLM can bridge completely different API ecosystems.

Troubleshooting

Proxy won’t start:

  • Check that port 4000 is not already in use: lsof -i :4000
  • Verify your API key is correctly set in the environment

Claude Code can’t connect:

  • Ensure the LiteLLM proxy is running: screen -r proxy
  • Check that ANTHROPIC_BASE_URL uses http://, not https://

Model errors or unexpected behavior:

  • The --drop_params flag is essential — without it, Anthropic-specific parameters will cause errors on other providers
  • Some models may not support all features Claude Code uses (tool use, extended thinking, etc.)
  • Check LiteLLM logs in the screen session for detailed error messages

Limitations

Keep in mind that alternative models may not fully replicate the Claude experience:

  • Tool use / function calling — not all models support this the same way
  • Extended thinking — this is a Claude-specific feature
  • Context window — different models have different token limits
  • Code quality — results will vary depending on the model’s coding capabilities