Tetrate releases Envoy AI Gateway 1.0 for AI traffic
Wed, 24th Jun 2026 (Today)
Tetrate has released version 1.0 of Envoy AI Gateway, the first production-stable release of the open source project.
Built on the Cloud Native Computing Foundation's Envoy Gateway project, the software is designed to manage AI traffic. Maintainers from Bloomberg, Nutanix and the wider Envoy community contributed to the release.
Envoy AI Gateway grew out of work in the Envoy Gateway community after Bloomberg proposed handling generative AI workloads with the same controls used for conventional internet traffic. The project first appeared in early 2025 and has since been developed through contributions from multiple organisations.
Tetrate said the 1.0 release reflects production use across real workloads. Bloomberg is already running Envoy AI Gateway in production, while Nutanix is taking the software into production and integrating it into Nutanix Agent Gateway and Nutanix Enterprise AI.
Industry backing
The release reflects a broader push by technology companies to build common infrastructure for AI services as businesses seek more control over how models are accessed, monitored and routed. Rather than connecting separately to each model provider, organisations want a single layer between applications and external AI services.
Envoy is already widely used in cloud and internet infrastructure. Created at Lyft in 2015, the proxy has been used in production systems for nearly a decade. Tetrate pointed to its use for AI inference traffic by companies including Databricks, Google, Lyft, Netflix and Spotify.
Tetrate also cited survey data showing that 44% of enterprises either use Envoy in production or are evaluating it for production use. Other organisations named as users or evaluators include AWS, Docker, SAP, Atlassian and LY Corporation.
Version 1.0 introduces a stable v1 application programming interface surface for several core resources, including AIGatewayRoute, AIServiceBackend, BackendSecurityPolicy, GatewayConfig, MCPRoute and MCPRouteSecurityPolicy. The software is intended to give developers and operators a consistent way to define routing, backend services and security policies for AI systems.
New functions
One of the main additions is a unified interface that can sit in front of a range of AI providers. The gateway supports an OpenAI-compatible interface for providers including OpenAI, Anthropic, Google Gemini, Azure OpenAI and AWS Bedrock, as well as other OpenAI-compatible services such as Groq, Together, Mistral, Cohere, DeepSeek and SambaNova.
The release also adds native support for the Model Context Protocol, or MCP, which is becoming a common way for AI agents to connect to tools and external systems. The gateway supports MCP routing, server multiplexing behind a single endpoint, tool filtering, OAuth 2.0 with JWT claim forwarding, fine-grained authorisation using CEL policies and observability at the tool level.
Another feature is token-aware traffic management. This allows rate limiting, budgets and quotas to be set with awareness of the token consumption patterns of AI workloads, including separate cost attribution for input, output, cached and reasoning tokens.
The software also includes centralised management of upstream credentials for different AI providers. According to Tetrate, this covers API key models used by providers such as OpenAI, Anthropic and Azure, as well as Amazon Web Services credential methods.
Observability is another focus of the release. Envoy AI Gateway supports OpenInference distributed tracing and OpenTelemetry GenAI semantic conventions across chat, embeddings, image generation, audio, MCP and reasoning endpoints, according to Tetrate.
Dan Sun, Envoy AI Gateway and KServe Co-founder and Maintainer at Bloomberg, said: "We see the Envoy AI Gateway as a key element toward standardizing how enterprises securely and reliably serve AI workloads. Bloomberg engineers have made hundreds of contributions to this project - in the spirit of our firm's commitment to scalable, open source AI infrastructure that brings vendor neutrality, consistency, observability and control to AI inference at scale."
Varun Talwar, Co-founder and CTO at Tetrate, said: "Envoy has become the foundational layer for internet traffic at the world's most demanding organizations. With v1.0, Envoy AI Gateway brings that same trust to AI workloads. The code in the public repo is the same code running in production at Bloomberg and Tetrate. That level of transparency is rare in open source, and it's what enterprises need as they scale AI."
Nutanix is also contributing bug fixes and development work as it adopts the gateway in its own products.
Debo Dutta, Chief AI Officer at Nutanix, said: "Nutanix is proud to be a maintainer and an active contributor in the Envoy AI Gateway community, helping bring the same transparency and production-grade reliability that powers enterprise Internet traffic to the next wave of AI workloads. We are using the project's capabilities to bring transparent, multiprovider flexibility and production-ready AI infrastructure to our customers."