# Model Fallback Strategy

> Automatic failover to alternate models when quota is exhausted

Model fallback enables graceful degradation when all accounts exhaust quota for a specific model. The proxy automatically switches to an alternate model with similar capabilities.

## What is Model Fallback?

When enabled, the proxy monitors quota exhaustion across all accounts. If every account runs out of quota for a requested model, the proxy automatically falls back to a pre-configured alternate model instead of returning an error.

Fallback is **disabled by default**. Enable it with the `--fallback` flag or the `FALLBACK=true` environment variable.

### Why Use Fallback?

* Keep your workflow running even when quota is exhausted for your preferred model.
* No manual intervention is needed: the proxy handles failover automatically.
* Fallback models are chosen to match the original model's capabilities (thinking, performance tier).
* The proxy logs when fallback occurs so you can monitor quota usage.

## Enabling Fallback

Enable fallback when starting the proxy.

With the CLI flag:

```bash
npm start -- --fallback
```

With the environment variable:

```bash
FALLBACK=true npm start
```

In development mode:

```bash
npm run dev -- --fallback
```

Fallback is **disabled on recursive calls** to prevent infinite fallback chains.

## Fallback Mappings

The proxy uses pre-configured fallback mappings between models. Thinking models fall back to other thinking models to preserve reasoning capabilities.

### Claude → Gemini Fallback

* `claude-opus-4-6-thinking` falls back to **gemini-3.1-pro-high**. Both are high-capability thinking models optimized for complex reasoning.
* `claude-sonnet-4-5-thinking` falls back to **gemini-3-flash**. Both are balanced thinking models with good speed and capability.
* `claude-sonnet-4-5` falls back to **gemini-3-flash**. Fast, general-purpose models without extended thinking.
### Gemini → Claude Fallback

* `gemini-3.1-pro-high` falls back to **claude-opus-4-6-thinking**. High-performance thinking models for demanding tasks.
* `gemini-3.1-pro-low` falls back to **claude-sonnet-4-5**. Balanced models for general coding tasks.
* `gemini-3-flash` falls back to **claude-sonnet-4-5-thinking**. Fast thinking models optimized for iteration speed.

### Fallback Map Table

| Primary Model                | Fallback Model               |
| ---------------------------- | ---------------------------- |
| `gemini-3.1-pro-high`        | `claude-opus-4-6-thinking`   |
| `gemini-3.1-pro-low`         | `claude-sonnet-4-5`          |
| `gemini-3-flash`             | `claude-sonnet-4-5-thinking` |
| `claude-opus-4-6-thinking`   | `gemini-3.1-pro-high`        |
| `claude-sonnet-4-5-thinking` | `gemini-3-flash`             |
| `claude-sonnet-4-5`          | `gemini-3-flash`             |

## How Fallback Works

1. A request comes in for a specific model (e.g., `claude-opus-4-6-thinking`).
2. The proxy checks all accounts for available quota on that model.
3. If all accounts are exhausted or rate-limited for the requested model:
   * **Without fallback**: Return error to client
   * **With fallback**: Check if a fallback model exists
4. If a fallback model exists, the proxy:
   1. Logs the fallback event
   2. Retrieves the fallback model from the mapping
   3. Retries the request with the fallback model
   4. Returns the response to the client

### Example Scenario

```mermaid
graph LR
    A[Request: claude-opus-4-6-thinking] --> B{Quota Available?}
    B -->|Yes| C[Use claude-opus-4-6-thinking]
    B -->|No| D{Fallback Enabled?}
    D -->|No| E[Return Error]
    D -->|Yes| F[Use gemini-3.1-pro-high]
    F --> G[Return Response]
```

## Use Cases

### Continuous Development

```bash
# Scenario: Heavy Claude Usage
# Your team exhausts Claude Opus quota during peak hours
# Fallback automatically switches to Gemini 3.1 Pro High
# Development continues without interruption
```

```bash
# Scenario: Multi-Region Teams
# Asia-Pacific team uses up Gemini quota overnight
# European team starts work and automatically falls back to Claude
# No coordination needed between teams
```

### Testing & CI/CD

```bash
# CI Pipeline
# Run tests with fallback enabled
FALLBACK=true npm test

# Tests continue even if quota is exhausted
# No failed builds due to rate limits
```

```bash
# Load Testing
# Stress test with multiple models
FALLBACK=true npm start

# Automatically distributes load across model families
```

### Production Resilience

```bash
# Production Deployment
# Enable fallback for high-availability setups
FALLBACK=true PORT=8080 npm start

# Ensure service continuity during quota exhaustion
```

## Monitoring Fallback Events

The proxy logs all fallback events for monitoring and debugging.

### Log Output

```log
[INFO] All accounts exhausted for claude-opus-4-6-thinking
[INFO] Falling back to gemini-3.1-pro-high
[SUCCESS] Fallback request completed successfully
```

### Checking Fallback Usage

In the Web Console:

1. Navigate to **Logs**
2. Filter for "fallback" events
3. Review which models triggered fallback
4. Monitor quota recovery times

Enable **Developer Mode** in Settings to see detailed fallback metrics and health scores.
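To track how often each model falls back, the log lines shown under Log Output could be tallied with a small script. The following is a minimal sketch, not part of the proxy: `countFallbacks` is a hypothetical helper, and it assumes the log format matches the sample above exactly and that the "exhausted" and "Falling back" lines of one event are not interleaved with those of another request.

```javascript
// Hypothetical helper (not part of the proxy): tallies fallback events in log text.
// Assumes the "[INFO] All accounts exhausted for <model>" and
// "[INFO] Falling back to <model>" formats from the sample log output.
function countFallbacks(logText) {
  const counts = new Map();
  let pendingSource = null; // model named by the most recent "exhausted" line

  for (const line of logText.split('\n')) {
    const exhausted = line.match(/All accounts exhausted for (\S+)/);
    if (exhausted) {
      pendingSource = exhausted[1];
      continue;
    }
    const fallback = line.match(/Falling back to (\S+)/);
    if (fallback && pendingSource) {
      const key = `${pendingSource} -> ${fallback[1]}`;
      counts.set(key, (counts.get(key) ?? 0) + 1);
      pendingSource = null;
    }
  }
  return counts;
}

const sample = [
  '[INFO] All accounts exhausted for claude-opus-4-6-thinking',
  '[INFO] Falling back to gemini-3.1-pro-high',
  '[SUCCESS] Fallback request completed successfully',
].join('\n');

console.log(countFallbacks(sample));
```

A pair that shows up frequently in the result points at the model whose quota is the bottleneck.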
## Limitations

When falling back between Claude and Gemini, **thinking signatures may be incompatible** and will be dropped automatically.

* Claude uses the `signature` field
* Gemini uses the `thoughtSignature` field
* The proxy detects incompatibility and cleans up signatures
* Some conversation context may be lost during fallback

If the fallback model is also exhausted, the proxy returns an error instead of trying another fallback.

**Example**:

```
Request: claude-opus-4-6-thinking (exhausted)
  → Fallback: gemini-3.1-pro-high (also exhausted)
  → Return: Error (no secondary fallback)
```

Fallback models have similar but not identical capabilities:

* Response quality may vary slightly
* Speed/latency characteristics differ
* Context window limits may differ

Fallback uses quota from the alternate model's pool. If you frequently fall back, you may exhaust both model families.

**Best practice**: Monitor quota usage and add accounts if fallback occurs frequently.

## Best Practices

* Track fallback frequency to identify quota bottlenecks.
* If fallback occurs regularly, add more Google accounts to increase quota.
* Set quota thresholds to switch accounts before exhaustion, reducing fallback needs.
* Regularly test with both Claude and Gemini to ensure fallback works as expected.

### Recommended Configuration

For production use with fallback, in `~/.config/antigravity-proxy/config.json`:

```json
{
  "maxAccounts": 20,
  "globalQuotaThreshold": 0.10,
  "accountSelection": {
    "strategy": "hybrid",
    "quota": {
      "lowThreshold": 0.15,
      "criticalThreshold": 0.05
    }
  }
}
```

Start the proxy:

```bash
FALLBACK=true npm start
```

## Fallback API Reference

The fallback configuration is defined in `src/fallback-config.js` and `src/constants.js`.
### getFallbackModel()

Get the fallback model for a given primary model:

```javascript
import { getFallbackModel } from './fallback-config.js';

const fallback = getFallbackModel('claude-opus-4-6-thinking');
// Returns: 'gemini-3.1-pro-high'

const noFallback = getFallbackModel('unknown-model');
// Returns: null
```

### hasFallback()

Check if a model has a fallback configured:

```javascript
import { hasFallback } from './fallback-config.js';

const hasF = hasFallback('claude-sonnet-4-5-thinking');
// Returns: true

const noF = hasFallback('custom-model');
// Returns: false
```

### MODEL\_FALLBACK\_MAP

Direct access to the fallback mapping:

```javascript
import { MODEL_FALLBACK_MAP } from './constants.js';

console.log(MODEL_FALLBACK_MAP);
// {
//   'gemini-3.1-pro-high': 'claude-opus-4-6-thinking',
//   'claude-opus-4-6-thinking': 'gemini-3.1-pro-high',
//   ...
// }
```

## Troubleshooting

If fallback does not trigger, check that:

1. Fallback is enabled (`--fallback` or `FALLBACK=true`)
2. The requested model has a fallback mapping
3. All accounts are actually exhausted (check `/account-limits`)
4. Developer mode is enabled to see debug logs

If thinking signatures are lost after a fallback, this is expected behavior when switching between Claude and Gemini.

* The proxy automatically cleans up incompatible signatures
* Check logs for "signature cleanup" warnings
* Use the same model family (all Claude or all Gemini) to preserve signatures

If requests still fail with fallback enabled, possible causes are:

1. Both primary and fallback models are exhausted
2. Fallback is disabled on recursive calls
3. Account selection strategy is excluding all accounts

**Solution**: Add more accounts or wait for quota to reset.
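The single-hop behavior behind the first two causes (no secondary fallback, and fallback disabled on the retry) can be illustrated with a short sketch. This is an illustration under assumptions, not the proxy's actual code: `FALLBACK_MAP`, `QuotaExhaustedError`, `requestWithFallback`, and the `send` callback are all hypothetical names, and the map is a two-entry subset of the documented one.

```javascript
// Illustrative stand-in for the documented mapping (subset only).
const FALLBACK_MAP = {
  'claude-opus-4-6-thinking': 'gemini-3.1-pro-high',
  'gemini-3.1-pro-high': 'claude-opus-4-6-thinking',
};

// Hypothetical error type meaning "every account is out of quota for this model".
class QuotaExhaustedError extends Error {}

// `send` is a hypothetical async function: (model) => response.
async function requestWithFallback(model, send, { fallbackEnabled = false } = {}) {
  try {
    return await send(model);
  } catch (err) {
    const fallback = FALLBACK_MAP[model] ?? null;
    // Only quota exhaustion, with fallback enabled and a mapping present, triggers a retry.
    if (!(err instanceof QuotaExhaustedError) || !fallbackEnabled || !fallback) {
      throw err;
    }
    // Retry once with fallback disabled, so a second exhaustion surfaces as an
    // error instead of bouncing between the two models forever.
    return requestWithFallback(fallback, send, { fallbackEnabled: false });
  }
}

// Demo: both models are exhausted, so the caller sees the fallback model's error.
const alwaysExhausted = async (model) => {
  throw new QuotaExhaustedError(`quota exhausted for ${model}`);
};

requestWithFallback('claude-opus-4-6-thinking', alwaysExhausted, { fallbackEnabled: true })
  .catch((err) => console.log(err.message)); // prints "quota exhausted for gemini-3.1-pro-high"
```

Because the two models in this subset map to each other, allowing fallback on the retry would loop indefinitely once both are exhausted, which is why the sketch turns it off for the second attempt.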
## Advanced: Custom Fallback Mappings

You can modify the fallback mappings by editing `src/constants.js`:

```javascript
export const MODEL_FALLBACK_MAP = {
  'gemini-3.1-pro-high': 'claude-opus-4-6-thinking',
  'gemini-3.1-pro-low': 'claude-sonnet-4-5',
  'gemini-3-flash': 'claude-sonnet-4-5-thinking',
  'claude-opus-4-6-thinking': 'gemini-3.1-pro-high',
  'claude-sonnet-4-5-thinking': 'gemini-3-flash',
  'claude-sonnet-4-5': 'gemini-3-flash',
  // Add custom mappings here
  'custom-model-1': 'custom-model-2'
};
```

Custom mappings require restarting the proxy. Test thoroughly before deploying to production.
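Before restarting with an edited map, a quick sanity check can catch obvious mistakes. The sketch below is one possible check, not something the proxy ships: `validateFallbackMap` is a hypothetical helper that only flags entries whose fallback is empty, not a string, or identical to the primary model.

```javascript
// Hypothetical sanity check for a fallback map (not part of the proxy).
// Returns a list of human-readable problems; an empty list means no issue was found.
function validateFallbackMap(map) {
  const problems = [];
  for (const [primary, fallback] of Object.entries(map)) {
    if (typeof fallback !== 'string' || fallback.length === 0) {
      problems.push(`${primary}: fallback must be a non-empty string`);
    } else if (fallback === primary) {
      problems.push(`${primary}: maps to itself`);
    }
  }
  return problems;
}

const exampleMap = {
  'gemini-3.1-pro-high': 'claude-opus-4-6-thinking',
  'custom-model-1': 'custom-model-1', // deliberate mistake for the demo
};

console.log(validateFallbackMap(exampleMap));
// [ 'custom-model-1: maps to itself' ]
```

The check does not verify that a fallback model name is one the proxy can actually serve, so a manual test request against each custom mapping is still worthwhile.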