xAI Grok API

xAI · Ranked #5 of 7 in LLM APIs

77.9/ 100

BStrong

Frontier Grok models with large context and OpenAI/Anthropic-compatible SDKs, self-serve from a single key.

Best for

Frontier models, X-integrated

Visit website Documentation

Overview

xAI's Grok API exposes the same Grok models that power the Grok assistant on X (Twitter) through an OpenAI- and Anthropic-SDK-compatible REST endpoint (api.x.ai). The lineup has evolved rapidly: Grok 4 launched in July 2025 at a premium $3/$15 per million input/output tokens, and by mid-2026 the flagship grok-4.3 sits at $1.25 input / $2.50 output per million (with cached input around $0.20), alongside reasoning, non-reasoning, multi-agent, and a cheaper "grok-build" coding SKU, plus separate Imagine (image/video) and Voice (TTS/STT/realtime) APIs. The standout architectural features are a 1M-token context window on the 4.x models, native tool/function calling, structured outputs, vision, and built-in Live Search / real-time access to X data, which is the genuine differentiator competitors cannot easily replicate.

On raw capability the models are competitive at the frontier: Grok 4 set state-of-the-art on ARC-AGI v2 (15.9%) and posted strong SWE-bench, GPQA (~89%) and Humanity's Last Exam results at launch, and Artificial Analysis still scores Grok 4 around 42 and grok-4.3 (high) at 38 on its Intelligence Index. The trade-off is latency and consistency rather than peak smarts: third-party measurement shows high time-to-first-token on the reasoning variants (14-18s) and moderate output speeds (~40-145 tok/s depending on model/provider), and the cheaper "Fast" variants drop well down the intelligence ranking. The pricing trajectory is aggressively downward, making 4.3 one of the better intelligence-per-dollar options at the frontier.

The main reservations are operational and reputational. The model and pricing catalog churns fast (models get renamed, retired, or moved to legacy/enterprise tiers), which is a real headache for anyone building durable integrations. Developers report confusing/unclear rate limits, occasional sluggishness, and, for the coding-agent use cases, hallucinated API endpoints and method signatures. Reliability is decent but not best-in-class (multi-region status page with a handful of short incidents monthly), and brand/safety controversies around Grok's behavior on X add reputational risk for enterprise buyers. Net: a fast-moving, increasingly price-competitive frontier API with a unique real-time-X-data edge, best suited to teams that value that data access and aggressive pricing over a stable, mature, deeply-documented platform.

How this score is derived

The APIbenchmarks Index is a weighted sum of four dimensions, each scored on an absolute 0–100 reference scale. See the methodology for every mapping.

Dimension	Score	Weight	Contribution
Documentation & DXDocs at docs.x.ai are clean and OpenAI/Anthropic-SDK-compatible with quickstarts, but coverage of newer capabilities (Live Search, structured outputs per-model, rate limits) is uneven and lags the rapid model releases.	80	30%	24.0
ReliabilityMulti-region status page (us-east-1, us-west-2, eu-west-1) shows generally operational service but roughly 4 short incidents in a recent 30-day window (avg ~27 min recovery), and no clearly published uptime SLA for self-serve.	76	25%	19.0
Ecosystem & SDKsStrong reach via OpenAI/Anthropic SDK drop-in compatibility plus availability on OpenRouter, Azure, and tooling like LangChain, though the native first-party SDK story is thinner than OpenAI/Anthropic.	74	25%	18.5
AccessibilitySelf-serve API keys via the xAI console with pay-as-you-go pricing and a data-sharing free-credits path, but unclear rate limits and a churning model catalog raise the friction for new integrators.	82	20%	16.4
APIbenchmarks Index (ABI)			77.9

Table 1. Derivation of the ABI for xAI Grok API. Contribution = score × weight; the index is their sum.

At a glance

Vendor: xAI
Pricing model: Per token
Free tier: No
Official SDKs: 7 languages

Pricing

grok-4.3 (flagship)	$1.25 / $2.50 per 1M	Input / output per million tokens; cached input ~$0.20/1M. 1M-token context.
grok-4.20 reasoning / non-reasoning	$1.25 / $2.50 per 1M	Reasoning and standard variants on the 4.20 series, 1M context.
grok-build-0.1 (coding agent)	$1.00 / $2.00 per 1M	Lower-cost SKU aimed at the Grok Build terminal coding agent, 256k context.
Grok 4 (legacy / launch)	$3.00 / $15.00 per 1M	Original July 2025 flagship pricing; now legacy/enterprise as newer SKUs supersede it.
Grok Imagine API	$0.02–$0.05 / image, $0.05–$0.08 / sec video	Image and video generation pricing.
Grok Voice API	$0.05/min realtime; $15/1M chars TTS; ~$0.10/hr STT	Realtime audio, text-to-speech, and speech-to-text endpoints.

Key features

•1M-token context window (Grok 4.x)
•Live Search / real-time X (Twitter) data access
•Function / tool calling
•Structured outputs (JSON schema)
•Vision / image understanding
•Reasoning and non-reasoning model variants
•Multi-agent model SKU
•Grok Imagine API for image and video generation
•Grok Voice API (realtime audio, TTS, STT)
•OpenAI- and Anthropic-API compatibility

Official SDKs

Python (OpenAI SDK compatible)JavaScript / TypeScript (OpenAI SDK compatible)Anthropic SDK compatibleREST / cURLAvailable via OpenRouterAvailable on Microsoft Azure AI FoundryLangChain integration

Strengths & trade-offs

Strengths

+Built-in Live Search and real-time access to X/Twitter data that competing LLM APIs cannot match
+Large 1M-token context window on the Grok 4.x models
+OpenAI- and Anthropic-SDK compatible, so migration is mostly a base-URL/key swap
+Aggressively falling prices, grok-4.3 at $1.25/$2.50 is strong intelligence-per-dollar at the frontier
+Frontier-level benchmark results (SOTA ARC-AGI v2, strong GPQA/SWE-bench at launch)
+Native tool/function calling, structured outputs, vision, plus separate Imagine and Voice APIs

Trade-offs

–Model and pricing catalog churns fast, models get renamed, retired, or moved to legacy, breaking durable integrations
–High time-to-first-token (~14–18s) on reasoning variants and only moderate output speed
–Developers report unclear/confusing rate limits and occasional sluggish performance
–Coding-agent variants can hallucinate non-existent API endpoints and method signatures
–No clearly published self-serve uptime SLA; a handful of short incidents per month
–Brand and content-safety controversies around Grok on X add enterprise reputational risk

What developers say

Developers praise the speed, real-time X data, and concise code output, but criticize confusing rate limits, fast-churning models/pricing, and coding-agent hallucinations.

“It doesn't over-explain, just gives me the code, used it to refactor a REST API quickly.”

Key figures

Artificial Analysis Intelligence Index (Grok 4)	~42	Artificial Analysis ↗
Artificial Analysis Intelligence Index (grok-4.3 high)	38	Artificial Analysis ↗
Output speed (grok-4.3 high)	139.3 tokens/sec	Artificial Analysis ↗
Time to first token (grok-4.3 high)	18.28 s	Artificial Analysis ↗
ARC-AGI v2 score (Grok 4)	15.9% (SOTA for closed models at launch)	xAI (Grok 4 announcement) ↗
Flagship API price (grok-4.3)	$1.25 in / $2.50 out per 1M tokens	xAI pricing / docs ↗
Recent reliability	~4 incidents / 30 days, ~27 min avg recovery	AIWatch (xAI status tracking) ↗

Compare xAI Grok API head to head

xAI Grok API vs OpenAI API xAI Grok API vs Anthropic Claude API xAI Grok API vs Google Gemini API xAI Grok API vs Mistral La Plateforme xAI Grok API vs Groq xAI Grok API vs DeepSeek API

Sources

Figures last verified 2026-06-27. Spotted an error? corrections@apibenchmarks.com