Antigravity Claude Proxy provides access to both Claude and Gemini models through a unified Anthropic-compatible API.

Claude Models

Claude models are accessed through the proxy using Anthropic’s API format. All Claude models support extended thinking capabilities.

claude-opus-4-6-thinking

Most Capable Model: Claude Opus 4.6 with extended thinking capabilities. Best for complex reasoning tasks and challenging problems.
  • Extended thinking output
  • Highest capability tier
  • Best for multi-step reasoning

claude-sonnet-4-5-thinking

Balanced Performance: Claude Sonnet 4.5 with extended thinking. Excellent balance of speed and capability.
  • Extended thinking output
  • Fast response times
  • Ideal for general coding tasks

claude-sonnet-4-5

Standard Model: Claude Sonnet 4.5 without thinking output. Fastest response times for straightforward tasks.
  • No thinking blocks
  • Maximum speed
  • Best for simple operations

Claude Thinking Models

Claude thinking models include an internal reasoning process before generating their final response:
  • signature field: Claude uses the signature field on thinking blocks for multi-turn conversations
  • Thinking blocks: Extended reasoning is included in the response as separate content blocks
  • Cache support: Thinking signatures are cached for conversation continuity
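
As an illustration of the points above, here is a minimal sketch of separating thinking blocks from the final text in an Anthropic-format response. The `splitThinking` helper and the sample content are hypothetical; only the `thinking` block type and its `signature` field follow the Anthropic message format.

```javascript
// Hypothetical helper: split an Anthropic-format content array into
// thinking blocks (which carry a signature) and the visible text.
function splitThinking(content) {
  const thinking = content.filter((block) => block.type === 'thinking');
  const text = content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
  return { thinking, text };
}

// Illustrative content array, not real proxy output.
const sampleContent = [
  { type: 'thinking', thinking: 'Compare both options...', signature: 'sig-abc' },
  { type: 'text', text: 'Option B is faster.' },
];

const { thinking, text } = splitThinking(sampleContent);
console.log(thinking[0].signature); // sig-abc
console.log(text);                  // Option B is faster.
```

In a multi-turn conversation, the thinking blocks would typically be sent back unchanged, signature included, so that the next turn can be validated.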

Gemini Models

Gemini models provide high-performance alternatives with Google’s latest AI technology. All Gemini models version 3+ include thinking capabilities.

gemini-3.1-pro-high

High Performance Tier: Gemini 3.1 Pro High with advanced thinking. Best for demanding workloads.
  • Extended thinking via thoughtSignature
  • High quota allocation
  • Best for production use

gemini-3.1-pro-low

Balanced Tier: Gemini 3.1 Pro Low with thinking support. Good balance of performance and quota.
  • Extended thinking support
  • Moderate quota allocation
  • General purpose use

gemini-3-flash

Fast Responses: Gemini 3 Flash with thinking. Optimized for speed and efficiency.
  • Quick response times
  • Extended thinking included
  • Ideal for rapid iteration

Gemini Thinking Models

Gemini models (version 3 and higher) support thinking capabilities:
  • thoughtSignature field: Gemini uses the thoughtSignature field on functionCall parts
  • Automatic detection: Models with “thinking” in the name or version 3+ are treated as thinking models
  • Signature caching: thoughtSignature values are cached for 2 hours
  • Fallback handling: If Claude Code strips the signature, the proxy restores it from cache or uses a sentinel value
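
The caching and fallback behavior described above can be sketched roughly as follows. This is not the proxy's actual code: the cache key (a tool call ID), the function names, and the sentinel string are all assumptions made for illustration. Only the 2-hour lifetime comes from the text.

```javascript
// 2-hour TTL as stated above; everything else here is hypothetical.
const SIGNATURE_TTL_MS = 2 * 60 * 60 * 1000;
const SENTINEL_SIGNATURE = 'sentinel-placeholder'; // assumed value, not the proxy's real sentinel

const signatureCache = new Map();

// Remember the thoughtSignature that came back with a functionCall part.
function cacheSignature(toolCallId, thoughtSignature, now = Date.now()) {
  signatureCache.set(toolCallId, { thoughtSignature, storedAt: now });
}

// Prefer the signature sent by the client; otherwise fall back to a
// non-expired cached copy, and finally to the sentinel.
function resolveSignature(toolCallId, incoming, now = Date.now()) {
  if (incoming) return incoming;
  const entry = signatureCache.get(toolCallId);
  if (entry && now - entry.storedAt < SIGNATURE_TTL_MS) {
    return entry.thoughtSignature;
  }
  signatureCache.delete(toolCallId); // expired or never cached
  return SENTINEL_SIGNATURE;
}

cacheSignature('call_1', 'sig-xyz', 0);
console.log(resolveSignature('call_1', undefined, 1000));                 // sig-xyz
console.log(resolveSignature('call_1', undefined, SIGNATURE_TTL_MS + 1)); // sentinel-placeholder
```

The `now` parameter exists only so that expiry can be exercised without waiting two hours.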

Model Selection in Claude Code

To configure the Claude Code CLI to use specific models, edit ~/.claude/settings.json. Claude Code reads environment variables from the env key:
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:8080",
    "ANTHROPIC_AUTH_TOKEN": "test",
    "ANTHROPIC_MODEL": "claude-opus-4-6-thinking",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "claude-opus-4-6-thinking",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-sonnet-4-5-thinking",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-sonnet-4-5"
  }
}
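
Outside Claude Code, a model can also be selected per request. The sketch below builds such a request under a few assumptions: that the proxy serves the Anthropic Messages route at `/v1/messages`, that it accepts the placeholder token from the settings, and that a bearer `authorization` header is used. The `buildMessagesRequest` helper is hypothetical.

```javascript
// Assumption: the proxy exposes the Anthropic Messages route at /v1/messages
// on the same base URL used in the settings file.
function buildMessagesRequest(baseUrl, model, prompt) {
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/v1/messages`,
    init: {
      method: 'POST',
      headers: {
        'content-type': 'application/json',
        'anthropic-version': '2023-06-01',
        authorization: 'Bearer test', // placeholder token, as in the settings
      },
      body: JSON.stringify({
        model, // any model ID listed on this page
        max_tokens: 1024,
        messages: [{ role: 'user', content: prompt }],
      }),
    },
  };
}

const { url, init } = buildMessagesRequest(
  'http://localhost:8080',
  'claude-sonnet-4-5',
  'Hello'
);
console.log(url); // http://localhost:8080/v1/messages
// With the proxy running (Node 18+): const response = await fetch(url, init);
```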

Model Naming Conventions

The proxy uses consistent naming patterns for model identification:

Thinking vs Non-Thinking

Pattern                                              | Type           | Example
Contains thinking                                    | Thinking model | claude-sonnet-4-5-thinking
Version 3+ (Gemini)                                  | Thinking model | gemini-3-flash
No thinking in name (Gemini: version 2.x or older)   | Standard model | claude-sonnet-4-5
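
The naming rules above could be checked with something like the following sketch. The `isThinkingModel` name and the assumption that Gemini IDs start with `gemini-` followed by the major version are illustrative, not taken from the proxy's source.

```javascript
// Hypothetical check: a name containing "thinking", or a Gemini model
// whose major version is 3 or higher, counts as a thinking model.
function isThinkingModel(modelName) {
  const name = modelName.toLowerCase();
  if (name.includes('thinking')) return true;
  // Assumes IDs shaped like "gemini-<major>[.<minor>]-..."
  const match = name.match(/^gemini-(\d+)/);
  return match !== null && Number(match[1]) >= 3;
}

console.log(isThinkingModel('claude-sonnet-4-5-thinking')); // true
console.log(isThinkingModel('gemini-3-flash'));             // true
console.log(isThinkingModel('gemini-3.1-pro-high'));        // true
console.log(isThinkingModel('claude-sonnet-4-5'));          // false
```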

Model Family Detection

The proxy automatically detects model families:
  • Claude family: Model name contains claude
  • Gemini family: Model name contains gemini
// Example model family detection
getModelFamily('claude-opus-4-6-thinking')  // Returns 'claude'
getModelFamily('gemini-3-flash')            // Returns 'gemini'
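
For reference, a substring-based detector consistent with the rules above might look like this. It is a sketch only; the `'unknown'` fallback for unrecognized names is an assumption, since the page does not say what the proxy returns in that case.

```javascript
// Hypothetical implementation of the substring rule described above.
function getModelFamily(modelName) {
  const name = String(modelName).toLowerCase();
  if (name.includes('claude')) return 'claude';
  if (name.includes('gemini')) return 'gemini';
  return 'unknown'; // assumed fallback, not documented behavior
}

console.log(getModelFamily('claude-opus-4-6-thinking')); // claude
console.log(getModelFamily('gemini-3-flash'));           // gemini
console.log(getModelFamily('some-other-model'));         // unknown
```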

Context Window Limits

Gemini models have a maximum output token limit of 16,384 tokens. The proxy automatically enforces this limit.
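
Enforcing the limit amounts to clamping the requested `max_tokens`. The sketch below shows one way this might be done; the `clampMaxTokens` helper and its behavior for requests without `max_tokens` are assumptions.

```javascript
const GEMINI_MAX_OUTPUT_TOKENS = 16384; // limit stated above

// Hypothetical helper: cap max_tokens for Gemini models and leave
// other models (and requests without max_tokens) untouched.
function clampMaxTokens(request) {
  const isGemini = request.model.toLowerCase().includes('gemini');
  if (!isGemini || request.max_tokens === undefined) return request;
  return {
    ...request,
    max_tokens: Math.min(request.max_tokens, GEMINI_MAX_OUTPUT_TOKENS),
  };
}

console.log(clampMaxTokens({ model: 'gemini-3-flash', max_tokens: 32000 }).max_tokens);    // 16384
console.log(clampMaxTokens({ model: 'claude-sonnet-4-5', max_tokens: 32000 }).max_tokens); // 32000
```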

Quota and Rate Limits

Each model has different quota allocations based on your subscription tier:
  • Ultra tier: Highest quota limits across all models
  • Pro tier: Moderate quota limits
  • Free tier: Basic quota limits
Check your current quota in the Web Console under Models or via the /account-limits API endpoint.
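
A quota check against the endpoint could be scripted as below. This sketch assumes the proxy listens on `http://localhost:8080` and that `/account-limits` returns JSON; the response shape is not described on this page, so it is returned as-is. The helper names are hypothetical.

```javascript
// Build the endpoint URL from the proxy's base URL.
function accountLimitsUrl(baseUrl) {
  return new URL('/account-limits', baseUrl).toString();
}

// Assumes a JSON response; uses the global fetch available in Node 18+.
async function fetchAccountLimits(baseUrl = 'http://localhost:8080') {
  const response = await fetch(accountLimitsUrl(baseUrl));
  if (!response.ok) {
    throw new Error(`Quota request failed with status ${response.status}`);
  }
  return response.json();
}

console.log(accountLimitsUrl('http://localhost:8080')); // http://localhost:8080/account-limits

// Usage (requires a running proxy):
// fetchAccountLimits().then((limits) => console.log(limits));
```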

Cross-Model Conversations

When switching between Claude and Gemini models mid-conversation, thinking signatures may be incompatible and will be dropped automatically.
The proxy handles cross-model scenarios:
  1. Signature validation: Checks if cached signatures match the target model family
  2. Automatic cleanup: Drops incompatible signatures when switching families
  3. Recovery mechanism: Injects synthetic messages to close interrupted tool loops
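
Steps 1 and 2 can be pictured with the sketch below. It assumes the cache records which family produced each signature and that an incompatible thinking block is removed entirely. Both are illustrative choices, and the function and variable names are hypothetical. Step 3 (synthetic messages) is not covered.

```javascript
// Hypothetical: signatureFamilies maps a cached signature to the model
// family that produced it ('claude' or 'gemini').
function dropIncompatibleThinking(messages, targetFamily, signatureFamilies) {
  return messages.map((message) => {
    if (!Array.isArray(message.content)) return message; // plain string content
    const content = message.content.filter((block) => {
      if (block.type !== 'thinking') return true;
      // Keep a thinking block only if its signature matches the target family.
      return signatureFamilies.get(block.signature) === targetFamily;
    });
    return { ...message, content };
  });
}

const signatureFamilies = new Map([['sig-gemini-1', 'gemini']]);
const history = [
  { role: 'user', content: 'Refactor this function.' },
  {
    role: 'assistant',
    content: [
      { type: 'thinking', thinking: '...', signature: 'sig-gemini-1' },
      { type: 'text', text: 'Here is the refactor.' },
    ],
  },
];

const cleaned = dropIncompatibleThinking(history, 'claude', signatureFamilies);
console.log(cleaned[1].content.length); // 1 (only the text block remains)
```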

Best Practices

  • Stick to one family: Use either Claude or Gemini for the entire conversation
  • Use fallback strategically: Enable fallback only when needed to maintain signature compatibility
  • Monitor model switches: Check logs for signature cleanup warnings

Testing Models

The proxy includes test utilities for each model family:
# Test Claude models
npm run test:signatures

# Test Gemini models
npm run test:crossmodel

# Test all models
npm test
Default test models:
  • Claude: claude-sonnet-4-5-thinking
  • Gemini: gemini-3-flash