Model Fallback Strategy

Model fallback enables graceful degradation when all accounts exhaust quota for a specific model. The proxy automatically switches to an alternate model with similar capabilities.

What is Model Fallback?

When enabled, the proxy monitors quota exhaustion across all accounts. If every account runs out of quota for a requested model, the proxy automatically falls back to a pre-configured alternate model instead of returning an error.

Fallback is disabled by default. Enable it with the --fallback flag or FALLBACK=true environment variable.

Why Use Fallback?

Continuous Availability

Keep your workflow running even when quota is exhausted for your preferred model.

Automatic Recovery

No manual intervention is needed: the proxy handles failover automatically.

Similar Capabilities

Fallback models are chosen to match the original model’s capabilities (thinking, performance tier).

Transparent Logging

The proxy logs when fallback occurs so you can monitor quota usage.

Enabling Fallback

Enable fallback when starting the proxy:

CLI Flag

npm start -- --fallback

Environment Variable

FALLBACK=true npm start

Development Mode

npm run dev -- --fallback

Fallback is disabled on recursive calls to prevent infinite fallback chains.
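As a rough illustration of how the two switches above could resolve to a single setting, here is a minimal sketch. The helper name isFallbackEnabled is hypothetical, and the assumption that only the exact string "true" enables fallback is ours; the proxy's actual parsing may differ.

```javascript
// Hypothetical helper (not the proxy's actual code): fallback is on when
// either the --fallback CLI flag or FALLBACK=true is present.
function isFallbackEnabled(argv, env) {
  const flagSet = argv.includes('--fallback');
  const envSet = env.FALLBACK === 'true'; // assumption: only "true" counts
  return flagSet || envSet;
}

// Typical call site in a Node.js entry point.
const fallbackEnabled = isFallbackEnabled(process.argv, process.env);
console.log(`Fallback enabled: ${fallbackEnabled}`);
```

Passing argv and env as parameters rather than reading the globals inside the function keeps the check easy to unit test.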

Fallback Mappings

The proxy uses pre-configured fallback mappings between models. Thinking models fall back to other thinking models to preserve reasoning capabilities.

Claude → Gemini Fallback

claude-opus-4-6-thinking

Falls back to gemini-3.1-pro-high. Both are high-capability thinking models optimized for complex reasoning.

claude-sonnet-4-5-thinking

Falls back to gemini-3-flash. Both are balanced thinking models with good speed and capability.

claude-sonnet-4-5

Falls back to gemini-3-flash. Fast, general-purpose models without extended thinking.

Gemini → Claude Fallback

gemini-3.1-pro-high

Falls back to claude-opus-4-6-thinking. High-performance thinking models for demanding tasks.

gemini-3.1-pro-low

Falls back to claude-sonnet-4-5. Balanced models for general coding tasks.

gemini-3-flash

Falls back to claude-sonnet-4-5-thinking. Fast thinking models optimized for iteration speed.

Fallback Map Table

Primary Model	Fallback Model
`gemini-3.1-pro-high`	`claude-opus-4-6-thinking`
`gemini-3.1-pro-low`	`claude-sonnet-4-5`
`gemini-3-flash`	`claude-sonnet-4-5-thinking`
`claude-opus-4-6-thinking`	`gemini-3.1-pro-high`
`claude-sonnet-4-5-thinking`	`gemini-3-flash`
`claude-sonnet-4-5`	`gemini-3-flash`

How Fallback Works

Request arrives

A request comes in for a specific model (e.g., claude-opus-4-6-thinking).

Quota check

The proxy checks all accounts for available quota on that model.

Exhaustion detected

If all accounts are exhausted or rate-limited for the requested model:

Without fallback: Return error to client
With fallback: Check if a fallback model exists

Fallback execution

The proxy:

Logs the fallback event
Retrieves the fallback model from the mapping
Retries the request with the fallback model
Returns the response to the client
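The steps above can be sketched as a single retry wrapper. This is an illustration under stated assumptions, not the proxy's implementation: requestWithFallback, QuotaExhaustedError, the sendRequest callback, and the isFallbackCall option are all hypothetical names, and the mapping is a two-entry subset of the table on this page.

```javascript
// Subset of the mapping from this page, inlined so the sketch is self-contained.
const MODEL_FALLBACK_MAP = {
  'claude-opus-4-6-thinking': 'gemini-3.1-pro-high',
  'gemini-3.1-pro-high': 'claude-opus-4-6-thinking',
};

// Hypothetical error type signalling "all accounts exhausted for this model".
class QuotaExhaustedError extends Error {}

// sendRequest(model) is a hypothetical async function that resolves with a
// response or throws QuotaExhaustedError when no account has quota left.
async function requestWithFallback(model, sendRequest, options = {}) {
  const { fallbackEnabled = false, isFallbackCall = false } = options;
  try {
    return await sendRequest(model);
  } catch (err) {
    const fallbackModel = MODEL_FALLBACK_MAP[model];
    const canFallback =
      err instanceof QuotaExhaustedError &&
      fallbackEnabled &&
      !isFallbackCall && // never chain a second fallback
      Boolean(fallbackModel);
    if (!canFallback) throw err;

    console.log(`[INFO] All accounts exhausted for ${model}`);
    console.log(`[INFO] Falling back to ${fallbackModel}`);
    // Retry once with the mapped model; the flag blocks further fallback.
    return requestWithFallback(fallbackModel, sendRequest, {
      fallbackEnabled,
      isFallbackCall: true,
    });
  }
}
```

The isFallbackCall flag is one simple way to guarantee that a request is retried at most once, so an exhausted fallback model surfaces as an error instead of looping between the two model families.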

Example Scenario

Use Cases

Continuous Development

# Your team exhausts Claude Opus quota during peak hours
# Fallback automatically switches to Gemini 3.1 Pro High
# Development continues without interruption

Testing & CI/CD

# Run tests with fallback enabled
FALLBACK=true npm test

# Tests continue even if quota is exhausted
# No failed builds due to rate limits

Production Resilience

Production Deployment

# Enable fallback for high-availability setups
FALLBACK=true PORT=8080 npm start

# Ensure service continuity during quota exhaustion

Monitoring Fallback Events

The proxy logs all fallback events for monitoring and debugging.

Log Output

[INFO] All accounts exhausted for claude-opus-4-6-thinking
[INFO] Falling back to gemini-3.1-pro-high
[SUCCESS] Fallback request completed successfully
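To track how often fallback occurs, the log lines above can be counted with a small script. The sketch below assumes the log text looks exactly like the sample output (a line starting with "[INFO] Falling back to" followed by the model name); real logs may carry timestamps or other prefixes, in which case the pattern needs adjusting. countFallbacks is a hypothetical helper name.

```javascript
// Hypothetical helper: counts fallback events per target model in raw log text.
function countFallbacks(logText) {
  const counts = {};
  // Assumes the line format shown in the sample log output above.
  const pattern = /^\[INFO\] Falling back to (\S+)/gm;
  for (const match of logText.matchAll(pattern)) {
    const model = match[1];
    counts[model] = (counts[model] || 0) + 1;
  }
  return counts;
}

const sampleLog = [
  '[INFO] All accounts exhausted for claude-opus-4-6-thinking',
  '[INFO] Falling back to gemini-3.1-pro-high',
  '[SUCCESS] Fallback request completed successfully',
].join('\n');

console.log(countFallbacks(sampleLog));
```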

Checking Fallback Usage

In the Web Console:

Navigate to Logs
Filter for “fallback” events
Review which models triggered fallback
Monitor quota recovery times

Enable Developer Mode in Settings to see detailed fallback metrics and health scores.

Limitations

Cross-model signature incompatibility

When falling back between Claude and Gemini, thinking signatures may be incompatible and will be dropped automatically.

Claude uses signature field
Gemini uses thoughtSignature field
The proxy detects incompatibility and cleans up signatures
May lose some conversation context during fallback
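The cleanup described above might look roughly like the following. This is a sketch only: the message shape (an array of messages whose content is either a string or an array of blocks) is an assumption, stripThinkingSignatures is a hypothetical name, and the real proxy may handle incompatible signatures differently, for example by dropping whole thinking blocks.

```javascript
// Hypothetical helper: returns a copy of the conversation with both
// signature fields removed from every content block.
function stripThinkingSignatures(messages) {
  return messages.map((message) => ({
    ...message,
    content: Array.isArray(message.content)
      ? // Drop Claude's `signature` and Gemini's `thoughtSignature`; keep the rest.
        message.content.map(({ signature, thoughtSignature, ...rest }) => rest)
      : message.content, // plain string content has nothing to strip
  }));
}

const conversation = [
  { role: 'user', content: 'Explain the bug.' },
  {
    role: 'assistant',
    content: [
      { type: 'thinking', thinking: 'Check the loop bounds.', signature: 'abc123' },
      { type: 'text', text: 'The loop is off by one.' },
    ],
  },
];

console.log(JSON.stringify(stripThinkingSignatures(conversation), null, 2));
```

The helper copies rather than mutates, so the original history stays available if the request is later retried against the original model family.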

No recursive fallback

If the fallback model is also exhausted, the proxy returns an error instead of trying another fallback.

Example:

Request: claude-opus-4-6-thinking (exhausted)
→ Fallback: gemini-3.1-pro-high (also exhausted)
→ Return: Error (no secondary fallback)

Performance differences

Fallback models have similar but not identical capabilities:

Response quality may vary slightly
Speed/latency characteristics differ
Context window limits may differ

Quota consumption

Fallback uses quota from the alternate model’s pool. If you frequently fall back, you may exhaust both model families.

Best practice: Monitor quota usage and add accounts if fallback occurs frequently.

Best Practices

Monitor logs

Track fallback frequency to identify quota bottlenecks.

Add accounts

If fallback occurs regularly, add more Google accounts to increase quota.

Use quota thresholds

Set quota thresholds to switch accounts before exhaustion, reducing fallback needs.

Test both families

Regularly test with both Claude and Gemini to ensure fallback works as expected.

Recommended Configuration

For production use with fallback:

~/.config/antigravity-proxy/config.json

{
  "maxAccounts": 20,
  "globalQuotaThreshold": 0.10,
  "accountSelection": {
    "strategy": "hybrid",
    "quota": {
      "lowThreshold": 0.15,
      "criticalThreshold": 0.05
    }
  }
}

Start the proxy:

FALLBACK=true npm start

Fallback API Reference

The fallback configuration is defined in src/fallback-config.js and src/constants.js.

getFallbackModel()

Get the fallback model for a given primary model:

import { getFallbackModel } from './fallback-config.js';

const fallback = getFallbackModel('claude-opus-4-6-thinking');
// Returns: 'gemini-3.1-pro-high'

const noFallback = getFallbackModel('unknown-model');
// Returns: null

hasFallback()

Check if a model has a fallback configured:

import { hasFallback } from './fallback-config.js';

const hasF = hasFallback('claude-sonnet-4-5-thinking');
// Returns: true

const noF = hasFallback('custom-model');
// Returns: false

MODEL_FALLBACK_MAP

Direct access to the fallback mapping:

import { MODEL_FALLBACK_MAP } from './constants.js';

console.log(MODEL_FALLBACK_MAP);
// {
//   'gemini-3.1-pro-high': 'claude-opus-4-6-thinking',
//   'claude-opus-4-6-thinking': 'gemini-3.1-pro-high',
//   ...
// }
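For readers who want to see how the two helpers relate to the map, here is a minimal self-contained sketch that reproduces the documented return values (null for an unknown model, a boolean from hasFallback). It is an assumption about the shape of the code, not a copy of src/fallback-config.js; the map is inlined from the table on this page.

```javascript
// Inlined copy of the documented mapping, so the sketch runs on its own.
const MODEL_FALLBACK_MAP = {
  'gemini-3.1-pro-high': 'claude-opus-4-6-thinking',
  'gemini-3.1-pro-low': 'claude-sonnet-4-5',
  'gemini-3-flash': 'claude-sonnet-4-5-thinking',
  'claude-opus-4-6-thinking': 'gemini-3.1-pro-high',
  'claude-sonnet-4-5-thinking': 'gemini-3-flash',
  'claude-sonnet-4-5': 'gemini-3-flash',
};

// Returns the mapped fallback model, or null when none is configured.
function getFallbackModel(model) {
  // Own-property check avoids matching inherited keys such as "toString".
  return Object.hasOwn(MODEL_FALLBACK_MAP, model) ? MODEL_FALLBACK_MAP[model] : null;
}

// True when the model has a fallback entry.
function hasFallback(model) {
  return getFallbackModel(model) !== null;
}

console.log(getFallbackModel('claude-opus-4-6-thinking')); // gemini-3.1-pro-high
console.log(hasFallback('custom-model')); // false
```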

Troubleshooting

Fallback not triggering

Check that:

Fallback is enabled (--fallback or FALLBACK=true)
The requested model has a fallback mapping
All accounts are actually exhausted (check /account-limits)
Developer mode is enabled to see debug logs

Thinking signatures lost after fallback

This is expected behavior when switching between Claude and Gemini.

The proxy automatically cleans up incompatible signatures
Check logs for “signature cleanup” warnings
Use the same model family (all Claude or all Gemini) to preserve signatures

Still getting quota errors with fallback enabled

Possible causes:

Both primary and fallback models are exhausted
Fallback is disabled on recursive calls
Account selection strategy is excluding all accounts

Solution: Add more accounts or wait for quota to reset.

Advanced: Custom Fallback Mappings

You can modify the fallback mappings by editing src/constants.js:

src/constants.js

export const MODEL_FALLBACK_MAP = {
  'gemini-3.1-pro-high': 'claude-opus-4-6-thinking',
  'gemini-3.1-pro-low': 'claude-sonnet-4-5',
  'gemini-3-flash': 'claude-sonnet-4-5-thinking',
  'claude-opus-4-6-thinking': 'gemini-3.1-pro-high',
  'claude-sonnet-4-5-thinking': 'gemini-3-flash',
  'claude-sonnet-4-5': 'gemini-3-flash',
  // Add custom mappings here
  'custom-model-1': 'custom-model-2'
};

Custom mappings require restarting the proxy. Test thoroughly before deploying to production.
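Before restarting with an edited map, a quick sanity check can catch obvious mistakes. The sketch below is a hypothetical validator (validateFallbackMap is not part of the proxy) that flags entries whose fallback is missing or points back at the same model; it does not verify that a model name is actually served by the proxy.

```javascript
// Hypothetical validator: returns a list of human-readable problems.
function validateFallbackMap(map) {
  const problems = [];
  for (const [primary, fallback] of Object.entries(map)) {
    if (typeof fallback !== 'string' || fallback.length === 0) {
      problems.push(`${primary}: fallback must be a non-empty string`);
    } else if (fallback === primary) {
      // A self-mapping would just retry the model that is already exhausted.
      problems.push(`${primary}: maps to itself`);
    }
  }
  return problems;
}

const candidateMap = {
  'gemini-3.1-pro-high': 'claude-opus-4-6-thinking',
  'custom-model-1': 'custom-model-2',
  'custom-model-3': 'custom-model-3', // deliberate mistake for the demo
};

console.log(validateFallbackMap(candidateMap));
```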
