Glossary
Proxy and AI infrastructure terms
Working definitions written for AI engineers who want precise meanings, not marketing paraphrases. Each entry includes the AI-workload context where the term actually matters.
Proxy infrastructure
ASN targeting
ASN targeting is a proxy-routing feature that lets you select exits announced by a specific Autonomous System Number — typically to match a target website's expectation about which ISP its visitors use.
Datacenter proxy
A datacenter proxy routes your request through an IP announced by a cloud provider or data center operator — AWS, GCP, Azure, or similar. Fast and cheap, but the ASN is trivially classifiable as non-consumer.
ISP proxy
An ISP proxy (sometimes "static residential") is a datacenter-hosted IP address that is announced under a residential ISP's ASN. It combines datacenter speed and reliability with residential ASN classification.
Residential proxy
A residential proxy routes your request through a real home internet connection — an IP address assigned by an ISP to a consumer subscriber. Targets see the request as coming from a normal household, not a cloud server.
Rotating proxy
A rotating proxy assigns a different exit IP to each outgoing request (or at a fixed time interval), distributing the traffic across a pool. It's the default for bulk scraping and the opposite of a sticky session.
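As a rough illustration of per-request rotation, the sketch below cycles through a small pool in round-robin order. The pool contents and the names `EXIT_POOL` and `rotating_exits` are assumptions made up for this example; a real provider may pick exits randomly or by weight rather than in strict order.

```python
from itertools import cycle

# Hypothetical exit pool; addresses are from the TEST-NET-3 documentation range.
EXIT_POOL = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]


def rotating_exits(pool):
    """Return an endless iterator that yields exit IPs in round-robin order."""
    return cycle(pool)


exits = rotating_exits(EXIT_POOL)

# One exit per outgoing request: the fourth request wraps back to the first IP.
assigned = [next(exits) for _ in range(5)]
print(assigned)
```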
Sticky session
A sticky session is a proxy configuration that keeps the same exit IP assigned to your requests for a set duration. It's the opposite of per-request rotation, and it's what multi-turn workflows need to maintain cookies and session state.
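One way to picture the mechanism is a table that maps a session ID to an exit IP with an expiry time. The sketch below is a simplified model under that assumption; the class `StickySessions` and its parameters are hypothetical and do not describe any particular provider's API.

```python
import time


class StickySessions:
    """Map a session ID to one exit IP until its time-to-live expires."""

    def __init__(self, pool, ttl_seconds, clock=time.monotonic):
        self._pool = list(pool)
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the behaviour can be tested
        self._next = 0               # index of the next exit to hand out
        self._assigned = {}          # session_id -> (exit_ip, expires_at)

    def exit_for(self, session_id):
        now = self._clock()
        entry = self._assigned.get(session_id)
        if entry is not None and now < entry[1]:
            return entry[0]          # still sticky: reuse the same exit
        # New or expired session: take the next exit from the pool.
        exit_ip = self._pool[self._next % len(self._pool)]
        self._next += 1
        self._assigned[session_id] = (exit_ip, now + self._ttl)
        return exit_ip


sessions = StickySessions(["203.0.113.10", "203.0.113.11"], ttl_seconds=600)
# Both calls fall inside the 600-second window, so they return the same exit.
print(sessions.exit_for("checkout-flow"), sessions.exit_for("checkout-flow"))
```

Every request in a multi-turn workflow passes the same session ID, so cookies set on the first turn are replayed from the same IP on later turns.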
SquadProxy engineering
Common Crawl
Common Crawl is a non-profit that maintains an open repository of crawled web content. Each monthly snapshot captures ~2-3 billion pages, totalling around 250 TB, published in WARC format on AWS S3. Filtered derivatives of it make up much of the pretraining corpus for many open-weights LLMs.
WARC format
WARC (Web ARChive) is the ISO-standardised file format for storing multiple web resources and their HTTP metadata in one file. Common Crawl and the Internet Archive publish in WARC.
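To make the layout concrete, here is a minimal sketch that parses one uncompressed WARC record: a version line, named header fields, a blank line, then a block of exactly `Content-Length` bytes. The function `parse_warc_record` and the sample record are written for this example only, and the sample omits fields a real record carries (such as `WARC-Record-ID` and `WARC-Date`). Published files are typically gzip-compressed, so in practice a dedicated library such as `warcio` is the more likely choice.

```python
def parse_warc_record(raw: bytes):
    """Split one uncompressed WARC record into (version, headers, payload)."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    version = lines[0]                      # e.g. "WARC/1.0"
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    length = int(headers["Content-Length"])
    payload = rest[:length]                 # the block is exactly Content-Length bytes
    return version, headers, payload


# Hand-built sample record, not taken from a real crawl.
body = b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello"
sample = (
    b"WARC/1.0\r\n"
    b"WARC-Type: response\r\n"
    b"WARC-Target-URI: http://example.com/\r\n"
    b"Content-Length: " + str(len(body)).encode() + b"\r\n"
    b"\r\n" + body + b"\r\n\r\n"
)

version, headers, payload = parse_warc_record(sample)
print(version, headers["WARC-Type"], headers["WARC-Target-URI"], len(payload))
```

The payload of a `response` record is the captured HTTP response itself, status line and headers included, which is the "HTTP metadata" the definition refers to.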
Network detection
HTTP/2 fingerprinting
HTTP/2 fingerprinting is a technique websites use to identify automated traffic by the specific structure of HTTP/2 frames, settings, and priority trees that a client sends. Different clients (browsers, libraries, curl) have distinguishable fingerprints.
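The sketch below shows how such signals can be flattened into one comparable string. It is loosely modelled on the commonly described layout of SETTINGS, WINDOW_UPDATE increment, PRIORITY frames, and pseudo-header order, separated by `|`. The function `h2_fingerprint` and all input values are illustrative assumptions, not measurements from any real client.

```python
def h2_fingerprint(settings, window_update, priorities, pseudo_headers):
    """Build a fingerprint string from the HTTP/2 values a client sent.

    settings:       list of (setting_id, value) pairs, in the order sent
    window_update:  increment from the connection-level WINDOW_UPDATE frame
    priorities:     list of already-formatted PRIORITY frame strings (may be empty)
    pseudo_headers: pseudo-header names in the order sent, e.g. ":method"
    """
    settings_part = ";".join(f"{sid}:{value}" for sid, value in settings)
    priority_part = ",".join(priorities) if priorities else "0"
    # Abbreviate ":method" to "m", ":path" to "p", and so on.
    header_part = ",".join(name[1] for name in pseudo_headers)
    return "|".join([settings_part, str(window_update), priority_part, header_part])


# Two hypothetical clients that differ only in pseudo-header order.
client_a = h2_fingerprint(
    [(1, 65536), (3, 1000), (4, 6291456)], 15663105, [],
    [":method", ":authority", ":scheme", ":path"],
)
client_b = h2_fingerprint(
    [(1, 65536), (3, 1000), (4, 6291456)], 15663105, [],
    [":method", ":path", ":authority", ":scheme"],
)
print(client_a)
print(client_b)
```

Because ordering is preserved rather than normalised, two clients that send identical values in a different order still produce different strings, which is what makes the fingerprint discriminating.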
TLS fingerprinting (JA3)
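JA3 is a method of fingerprinting TLS client implementations based on fields the client sends in its ClientHello during the TLS handshake: the TLS version, cipher suites, extensions, elliptic curves, and elliptic-curve point formats. These values are joined into a string and hashed with MD5. Different clients (browsers, libraries, curl) have distinguishable JA3 signatures.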
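A minimal sketch of the JA3 computation follows, assuming the five ClientHello fields have already been extracted as decimal integers (with GREASE values filtered out beforehand). The function name `ja3` and the input values are made up for illustration and do not correspond to a specific client.

```python
import hashlib


def ja3(version, ciphers, extensions, curves, point_formats):
    """Return (ja3_string, ja3_hash) for already-extracted ClientHello fields."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),        # values inside a field are joined by "-"
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    ja3_string = ",".join(fields)           # the five fields are joined by ","
    ja3_hash = hashlib.md5(ja3_string.encode("ascii")).hexdigest()
    return ja3_string, ja3_hash


# Illustrative values only: 771 is the decimal form of 0x0303 (TLS 1.2).
ja3_string, ja3_hash = ja3(771, [4865, 4866], [0, 10, 11], [29, 23], [0])
print(ja3_string)
print(ja3_hash)

# Reordering the cipher list changes the hash, even with the same ciphers.
_, reordered_hash = ja3(771, [4866, 4865], [0, 10, 11], [29, 23], [0])
print(ja3_hash != reordered_hash)
```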