Proxies for Gemini and the Google AI API

Regional Gemini evaluation across 10 countries, with header-based exit-class routing and the concurrency headroom to run continuous eval fleets.

Updated 23 April 2026

Why proxy Gemini / Google AI API traffic

  1. Regional eval on a provider that routes by geography. Google's inference infrastructure routes APAC traffic through APAC POPs (asia-northeast1, asia-southeast1, etc.) when the client origin is in-region. Measuring whether responses differ between a US-cloud origin and an in-region residential origin is a legitimate methodology question.

  2. Vertex AI multi-region workloads. Teams using Vertex for production inference sometimes want to test how their own pipelines behave when the request appears to originate from a different region than the inference endpoint.

  3. Google-specific content policy. Google applies country-specific content policy (particularly for image generation and content filtering), and the differences show up when evals run from different origins.

  4. Multilingual eval. Gemini handles many languages; pairing each language with its primary origin country (via residential exits) gives each eval a representative origin.
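
As a rough illustration of points 1 and 4, the sketch below pairs each prompt with an in-region residential origin and a US datacenter baseline, so the two responses can be compared later. The language-to-country table, the EvalCase type, and the "datacenter" class value are all assumptions made for this example, not documented SquadProxy values.

```python
from dataclasses import dataclass

# Hypothetical language -> origin-country pairing; adjust to your test plan.
LANGUAGE_ORIGINS = {
    "ja": "JP",
    "ko": "KR",
    "de": "DE",
    "en": "US",
}


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    language: str
    country: str      # value intended for the X-Squad-Country header
    exit_class: str   # value intended for the X-Squad-Class header


def build_cases(prompts_by_language, baseline_country="US"):
    """Pair every prompt with an in-region residential case and a datacenter baseline."""
    cases = []
    for language, prompts in prompts_by_language.items():
        origin = LANGUAGE_ORIGINS[language]
        for prompt in prompts:
            cases.append(EvalCase(prompt, language, origin, "residential"))
            # "datacenter" as a header value is an assumption for this sketch.
            cases.append(EvalCase(prompt, language, baseline_country, "datacenter"))
    return cases
```

Each pair of cases shares a prompt, so any difference between the two responses can be attributed to the origin rather than to the input.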

What this page isn't

  • Not for circumventing Google's restrictions on Gemini availability.
  • Not for bypassing API quota ceilings on a single project.
  • Not for content generation that violates Google's AUP.

Recommended configuration

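import os

import httpx

PROXY = "http://USER:PASS@gateway.squadproxy.com:7777"
# Assumption: the Google AI API key is exported as GEMINI_API_KEY.
API_KEY = os.environ["GEMINI_API_KEY"]

def eval_gemini(prompt: str, country: str, model: str = "gemini-2-0-pro"):
    """Send one prompt to Gemini through a residential exit in `country`."""
    url = f"https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key={API_KEY}"
    return httpx.post(
        url,
        json={"contents": [{"parts": [{"text": prompt}]}]},
        headers={
            "X-Squad-Class": "residential",    # exit class
            "X-Squad-Country": country,        # origin country
            "X-Squad-Session": "per-request",  # session scope
        },
        proxy=PROXY,  # httpx 0.26+; earlier releases spelled this `proxies=`
        timeout=120,
    ).json()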

For Vertex AI endpoints (*.aiplatform.googleapis.com), the same proxy configuration applies; send Vertex's auth headers (typically an OAuth 2.0 bearer token) in place of the API key.
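
A minimal sketch of what the Vertex variant might look like, assuming the regional generateContent REST path and a bearer token obtained separately (for example from `gcloud auth print-access-token`). The vertex_request helper and its parameters are hypothetical names introduced for illustration; only the X-Squad-* headers come from the configuration above.

```python
def vertex_request(project: str, location: str, model: str,
                   access_token: str, country: str):
    """Return (url, headers) for a Vertex AI generateContent call.

    Assumes the regional REST endpoint layout; pass the results to
    httpx.post together with the same proxy setting as above.
    """
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )
    headers = {
        "Authorization": f"Bearer {access_token}",  # replaces the ?key= parameter
        "X-Squad-Class": "residential",
        "X-Squad-Country": country,
        "X-Squad-Session": "per-request",
    }
    return url, headers
```

Keeping the inference location and the proxy country as separate arguments makes it easy to test the mismatch case, where the request appears to originate from a different region than the endpoint.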

Gemini-specific eval notes

  • Image and multimodal eval: Gemini's image understanding and generation are evaluated the same way as text, with the origin anchored to match the test's regional hypothesis.
  • Long-context evaluation: Gemini's long-context windows (1M+ tokens) make individual calls heavier in both payload size and latency, so plan concurrency and timeouts accordingly.
  • Regional model variants: Google occasionally deploys region-specific variants. Running evals from multiple origins surfaces these if they exist.
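
One way to plan concurrency for heavier long-context calls is to cap the number of requests in flight. The sketch below uses an asyncio semaphore for that; run_bounded and its max_in_flight parameter are illustrative names, and `call` stands in for whatever async function actually sends the request.

```python
import asyncio


async def run_bounded(call, prompts, max_in_flight=8):
    """Run `call(prompt)` for every prompt, with at most `max_in_flight` active at once."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def one(prompt):
        # Wait for a free slot before starting the (potentially heavy) call.
        async with semaphore:
            return await call(prompt)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(one(p) for p in prompts))
```

A lower max_in_flight suits 1M-token prompts, where each call holds a session open for longer; the right value depends on your plan's concurrent-session limit and on the API quota of the project being tested.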

Plans that fit

See pricing below. The Team plan covers most continuous Gemini eval work; the Lab plan covers larger APAC-focused fleets where the JP / KR / SG origins are primary.

Pricing — plans sized for Gemini workloads

Every plan includes access to all 5 exit classes across our 10 focus countries — quotas vary by plan. The size you need scales with your eval cadence and concurrency.

Solo

For individual researchers running evaluation scripts and prototype RAG pipelines.

$149/month

or $1,430/year (save 20%)

  • Access to all 5 exit classes · 10 focus countries
  • 50 GB residential · unlimited datacenter
  • 5 static ISP IPs · 5 GB 4G mobile
  • 1 seat · 200 concurrent sessions
  • Python + Node SDK + REST API
  • Per-request metering (not time-based)
  • Email support (24h response, business days)
  • Overage: $3/GB residential · $6/GB mobile

Best for

  • Solo researchers
  • Evaluation scripts
  • Prototype RAG

Team

Most popular

For AI startups and mid-size labs splitting capacity between training and evaluation.

$699/month

or $6,710/year (save 20%)

  • Access to all 5 exit classes · 10 focus countries
  • 500 GB residential · unlimited datacenter
  • 25 static ISP IPs · 25 GB 4G mobile
  • 10 seats ($29/mo per extra seat) · 1,000 concurrent sessions
  • City-level geo-routing + ASN targeting
  • 99.9% uptime SLA
  • Priority Slack support (4h response, business hours)
  • Python + Node SDK + REST API + webhooks
  • Overage: $3/GB residential · $6/GB mobile

Best for

  • AI startups
  • Mid-size labs
  • Model eval teams

Lab

For academic labs, eval consortia, and frontier model companies running sustained workloads.

$2,999/month

or $28,790/year (save 20%)

  • Access to all 5 exit classes · 10 countries on 4 continents
  • 2 TB residential · unlimited datacenter
  • 100 static ISP IPs · 50 GB 4G + 20 GB 5G mobile
  • 50 seats ($19/mo per extra seat) · 3,000 concurrent sessions
  • Dedicated gateway lane (bypasses shared-pool queues on us-east-1 + eu-west-1)
  • 99.95% uptime SLA
  • Dedicated Slack channel (1h response, business hours)
  • Custom BGP prefix on request (additional fees apply)
  • Overage: $2.50/GB residential · $5/GB mobile

Best for

  • Academic labs
  • Large eval consortia
  • Frontier model companies

Enterprise

Custom contracts with dedicated infrastructure, volume pricing, and research-grade SLAs.

Custom pricing

  • Volume pricing from 5 TB/mo residential
  • Dedicated BGP prefix + ASN announcement
  • Unlimited concurrent sessions · unlimited seats
  • 99.99% uptime SLA with financial credits
  • Named Technical Account Manager + 24/7 on-call paging
  • Custom AUP, DPA, on-site deployment option
  • Research / academic discount (30–50% off Team or Lab)
  • Annual contract · wire, ACH, USDC/USDT/BTC settlement

Best for

  • Frontier labs
  • Eval consortia
  • Enterprise AI

All plans include a 14-day refund, a single endpoint with regional failover, HTTP(S) + SOCKS5 on every exit class, access to all 5 exit classes and all 10 focus countries, and Python + Node SDKs. Concurrent sessions = simultaneous TCP sessions through the gateway. Overage warnings fire at 80% and 100% of quota; traffic continues past 100% only if overage billing is enabled on your account.
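
To make the sizing arithmetic concrete, here is a small sketch that estimates monthly cost from residential usage, using the list prices and overage rates quoted above. It assumes overage billing is enabled, treats 2 TB as 2,000 GB, and ignores mobile overage and seat add-ons; the PLANS table and function names are illustrative only.

```python
# Monthly list prices, residential quotas, and overage rates from the plans above.
# Assumption: 2 TB is treated as 2,000 GB.
PLANS = {
    "Solo": {"monthly_usd": 149, "residential_gb": 50, "overage_usd_per_gb": 3.00},
    "Team": {"monthly_usd": 699, "residential_gb": 500, "overage_usd_per_gb": 3.00},
    "Lab": {"monthly_usd": 2999, "residential_gb": 2000, "overage_usd_per_gb": 2.50},
}


def monthly_cost(plan: str, residential_gb_used: float) -> float:
    """Base price plus residential overage (assumes overage billing is enabled)."""
    p = PLANS[plan]
    extra_gb = max(0.0, residential_gb_used - p["residential_gb"])
    return p["monthly_usd"] + extra_gb * p["overage_usd_per_gb"]


def cheapest_plan(residential_gb_used: float) -> str:
    """Name of the plan with the lowest estimated monthly cost at this usage."""
    return min(PLANS, key=lambda name: monthly_cost(name, residential_gb_used))
```

Under these figures, Solo with overage passes the Team base price at roughly 233 GB of residential traffic per month (149 + 3 × (233.3 − 50) ≈ 699), so sustained usage above that level is cheaper on Team.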

Start routing Gemini traffic through SquadProxy

Real ASNs, real edge capacity, and an engineer who answers your Slack the first time.