xAI: Grok Voice TTS 1.0

x-ai/grok-voice-tts-1.0

Grok Voice TTS 1.0 is a text-to-speech model from xAI. It converts text into spoken audio across 20+ languages with automatic language detection, and offers five built-in voices (Eve, Ara, Rex, Sal, Leo) covering a range of tones. Inline speech tags allow control over pauses, emphasis, pitch, speed, and vocal style. Output is available in MP3, WAV, PCM, μ-law, and A-law formats at sample rates from 8 kHz to 48 kHz, with up to 15,000 characters per request.
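Because each request is capped at 15,000 characters, longer texts have to be split before synthesis. The following is a minimal sketch of one way to do that, not part of the OpenRouter or xAI API: `split_for_tts` is a hypothetical helper, and it assumes the limit is counted in the same way as Python's `len()` (Unicode code points), which the page does not confirm.

```python
import re

# Per-request limit stated in the model description.
MAX_CHARS = 15_000


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Hypothetical helper: prefers sentence boundaries and falls back to a
    hard cut only when a single sentence is longer than the limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")

    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single over-long sentence is cut into limit-sized pieces.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own request. Note that a hard cut may land inside an inline speech tag, so texts that rely on tags are safer to split by hand.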

Modalities: text input, audio output
Price: $15 per 1M characters
Context: 15K
Weekly Tokens: 24K
Released: May 15, 2026
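
As a rough illustration of the per-character price listed above, the cost of a request can be estimated from the length of its input. This sketch assumes billing is based on the raw character count of the text as measured by Python's `len()`, including any inline speech tags; the page does not spell out how characters are counted, and `estimate_cost_usd` is a hypothetical helper.

```python
# Listed price: $15 per 1M characters.
PRICE_PER_MILLION_CHARS_USD = 15.0


def estimate_cost_usd(text: str) -> float:
    """Estimate the cost of synthesizing `text` at the listed price.

    Assumption: every character of the input is billed, counted with len().
    """
    return len(text) * PRICE_PER_MILLION_CHARS_USD / 1_000_000


# A maximum-size request of 15,000 characters comes to roughly $0.225.
full_request_cost = estimate_cost_usd("a" * 15_000)
```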

Sample code and API for Grok Voice TTS 1.0

OpenRouter normalizes requests and responses across providers for you.

OpenRouter provides a text-to-speech API that converts text into natural-sounding audio. Send text and a voice selection, and receive raw audio bytes in your chosen format.

The response is a raw audio stream (not JSON). The generation ID is returned in the X-Generation-Id response header for tracking.
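
The request and response flow described above can be sketched with the Python standard library. Treat this as an illustration under stated assumptions rather than the documented API: the endpoint path `/api/v1/audio/speech`, the JSON field names (`model`, `input`, `voice`, `response_format`), and the exact spelling of the voice identifiers are assumptions modeled on OpenAI-style speech endpoints, so check them against the API Reference. Only the model ID, the five voice names, the 15,000-character limit, the raw-bytes response, and the `X-Generation-Id` header come from this page.

```python
import json
import urllib.request

# ASSUMPTION: endpoint path is modeled on OpenAI-style speech APIs.
TTS_URL = "https://openrouter.ai/api/v1/audio/speech"
MODEL = "x-ai/grok-voice-tts-1.0"
# Voice names from the model description; exact identifier casing is an assumption.
VOICES = {"Eve", "Ara", "Rex", "Sal", "Leo"}
MAX_CHARS = 15_000  # per-request limit from the model description


def build_tts_request(text, voice, api_key, audio_format="mp3"):
    """Build (but do not send) a text-to-speech request."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if len(text) > MAX_CHARS:
        raise ValueError(f"text exceeds {MAX_CHARS} characters")
    # ASSUMPTION: these field names are not confirmed by this page.
    payload = {
        "model": MODEL,
        "input": text,
        "voice": voice,
        "response_format": audio_format,
    }
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, voice, api_key, audio_format="mp3"):
    """Send the request and return (audio_bytes, generation_id)."""
    request = build_tts_request(text, voice, api_key, audio_format)
    with urllib.request.urlopen(request, timeout=60) as response:
        # The body is raw audio, not JSON, so read it as bytes.
        audio = response.read()
        generation_id = response.headers.get("X-Generation-Id")
    return audio, generation_id
```

A caller would typically write the returned bytes straight to a file opened in binary mode (for example `speech.mp3`) and log the generation ID for later tracking.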

Using third-party SDKs

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.

See the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.