Context Window
128,000
Tokens
Model Cost Profile
Developer: OpenAI
Pricing updated Mar 11, 2026
OpenAI's GPT-4o Audio model has a 128,000-token context window, which suits long-form audio generation and transcription. Through the API, input costs $2.50 per million tokens and output costs $10.00 per million tokens, so output tokens cost four times as much as input tokens, a difference worth factoring into budgets for high-volume audio processing. The model is aimed at sectors such as media, education, and entertainment, where understanding and generating audio are central to content creation and user engagement.
Input Price / 1M
$2.50
Prompt tokens
Output Price / 1M
$10.00
Completion tokens
Intelligence (MMLU)
Benchmark Pending
Massive Multitask Language Understanding
| Usage Type | Price / 1M Tokens |
|---|---|
| Input (Prompt) | $2.50 |
| Output (Completion) | $10.00 |
Price History
Current Input / 1M
$2.50
Current Output / 1M
$10.00
Estimate monthly spend for OpenAI: GPT-4o Audio based on your workload.
Estimated Monthly Cost
$183
25M input + 12M output tokens
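The estimate above follows directly from the per-million prices listed on this page. A minimal sketch of the arithmetic, assuming simple linear pricing with no discounts, caching, or separate audio-token rates (the function and constant names are illustrative, not part of any OpenAI SDK):

```python
# Prices per 1M tokens, taken from the pricing table on this page.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00


def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Return estimated monthly spend in USD for the given token volumes.

    Assumes cost scales linearly with token count (hypothetical helper).
    """
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Workload from the estimate above: 25M input + 12M output tokens.
cost = monthly_cost(25_000_000, 12_000_000)
print(f"${cost:.2f}")  # 62.50 (input) + 120.00 (output)
```

For this workload the sum is $182.50, which the page displays as $183. Output accounts for roughly two thirds of the total despite being under half the input volume, so trimming completion length is usually the first lever to check.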
Common pricing and benchmark questions for OpenAI: GPT-4o Audio.
OpenAI: GPT-4o Audio input pricing is $2.50 per 1M tokens based on the latest synced provider data.
OpenAI: GPT-4o Audio output pricing is $10.00 per 1M tokens based on the latest synced provider data.
OpenAI: GPT-4o Audio supports a context window of 128,000 tokens.
Use the comparison links on this page to open direct model-vs-model pricing and benchmark pages, then evaluate monthly spend projections for your workload.