Microsoft’s Azure Communication Services now supports bidirectional audio streaming with the new Voice Live API, enabling developers to create next-gen AI voice agents. These agents offer real-time, natural conversations with multilingual support, noise suppression, custom branded voices, and enhanced security for seamless customer interactions.

Create Next-Gen Voice Agents with Azure AI Voice Live API
Microsoft just dropped a game-changing update for developers building voice agents. At Microsoft Build 2025, the company announced the general availability of bidirectional audio streaming in the Azure Communication Services Call Automation SDK. This makes real-time, natural conversations powered by speech-to-speech AI much easier to build.
What’s New?
The Call Automation bidirectional streaming API, previously in preview, now supports secure, low-latency audio streaming. It works seamlessly with Azure AI Speech Services’ new Voice Live API (Preview), enabling AI agents to listen and respond in near real-time.
Developers can stream audio from live calls to their own servers, where Large Language Models (LLMs) process the caller's speech and generate spoken responses that are streamed back into the call. Plus, Microsoft added JSON Web Token (JWT) authentication to secure the WebSocket connections that carry this audio, so your voice solutions stay safe.
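To make the streaming flow a bit more concrete, here is a minimal sketch of how a server might unpack audio arriving over the WebSocket and wrap a model's reply to send back into the call. The JSON shapes (`kind`, `audioData`, `stopAudio`, `silent`) and their casing are assumptions modeled on the Call Automation media streaming format, and the helper names `extractPcm`, `buildOutboundAudio`, and `buildStopAudio` are hypothetical, so check the current documentation before relying on them.

```javascript
// Sketch only: the message shapes below are assumptions modeled on the
// Call Automation media streaming JSON format, not a confirmed contract.

// Parse one inbound WebSocket text frame and return the raw PCM bytes,
// or null for metadata, silent packets, and unknown message kinds.
function extractPcm(frameText) {
  const message = JSON.parse(frameText);
  if (message.kind !== "AudioData" || !message.audioData) {
    return null;
  }
  if (message.audioData.silent) {
    return null;
  }
  return Buffer.from(message.audioData.data, "base64");
}

// Wrap PCM bytes produced by the AI model in an outbound frame for the call.
function buildOutboundAudio(pcmBuffer) {
  return JSON.stringify({
    kind: "AudioData",
    audioData: { data: pcmBuffer.toString("base64") },
    stopAudio: null,
  });
}

// Ask the service to stop playing queued audio, e.g. when the caller interrupts.
function buildStopAudio() {
  return JSON.stringify({ kind: "StopAudio", audioData: null, stopAudio: {} });
}
```

In a real server these helpers would sit inside the WebSocket message handler: inbound frames go through `extractPcm` before being forwarded to the model, and the model's audio goes through `buildOutboundAudio` on the way back.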
“Creating voice agents has never been easier, delivering seamless, low-latency, and naturally fluent conversations.”
Major Updates and Features
- Multilingual Support: Build agents that chat in 150+ locales with over 600 realistic voices, perfect for global businesses.
- Noise Suppression & Echo Cancellation: Voice Live API includes built-in audio enhancements, ensuring crystal-clear communication.
- Custom Branded Voices: Create unique voice models that reflect your brand’s personality and connect better with customers.
These updates cater to industries like customer service, education, gaming, and public services, where natural voice interaction is a must-have.
Why This Matters for Developers
Integrating Azure Communication Services with the Voice Live API unlocks powerful AI-driven voice experiences. The system supports real-time speech input/output, advanced audio processing, and customizable voices. This combination lets you build virtual agents that feel human, respond quickly, and handle complex conversations effortlessly.
“Integrating these two technologies allows customers to create innovative solutions with multilingual agents and branded voices.”
Getting Started
Microsoft will soon release SDKs, documentation, and samples to help you dive in. Meanwhile, here’s a quick snippet to enable bidirectional streaming:
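// Options for streaming call audio over a WebSocket in both directions
const mediaStreamingOptions = {
    transportUrl: websocketUrl,    // wss:// endpoint of your audio server
    transportType: "websocket",
    contentType: "audio",
    audioChannelType: "unmixed",   // per-participant audio rather than a single mix
    startMediaStreaming: true,     // begin streaming as soon as the call connects
    enableBidirectional: true,     // allow audio to be sent back into the call
    audioFormat: "Pcm24KMono"      // 24 kHz mono PCM
};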
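The `audioFormat` setting also determines how much data each chunk of audio carries. As a rough sketch, assuming "Pcm24KMono" means 24,000 samples per second, 16-bit samples, and a single channel, the frame size works out as below. The 20 ms frame duration and the helper names `bytesPerFrame` and `splitIntoFrames` are illustrative choices, not values taken from the announcement.

```javascript
// Assumption: "Pcm24KMono" = 24,000 samples/s, 16-bit (2-byte) samples, 1 channel.
const SAMPLE_RATE_HZ = 24000;
const BYTES_PER_SAMPLE = 2;
const CHANNELS = 1;

// Number of bytes of PCM audio in a frame of the given duration.
function bytesPerFrame(frameMs) {
  return (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * frameMs) / 1000;
}

// Split a PCM buffer into fixed-duration frames; the last frame may be shorter.
function splitIntoFrames(pcmBuffer, frameMs) {
  const size = bytesPerFrame(frameMs);
  const frames = [];
  for (let offset = 0; offset < pcmBuffer.length; offset += size) {
    frames.push(pcmBuffer.subarray(offset, offset + size));
  }
  return frames;
}

console.log(bytesPerFrame(20)); // 960
```

Small, fixed-size frames like this keep latency predictable when you relay model output back into the call.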
To connect to the Voice Live API (Preview), configure the real-time client session with noise suppression and echo cancellation enabled, so the model receives clean input audio.
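Here is a hedged sketch of what that session configuration might look like as a `session.update` event. The field names and type values (`input_audio_noise_reduction`, `azure_deep_noise_suppression`, `input_audio_echo_cancellation`, `server_echo_cancellation`, and the `voice` object) are assumptions based on the preview API and may change; `buildSessionUpdate` is a hypothetical helper, and the voice name and instructions are placeholders.

```javascript
// Hypothetical helper: builds a "session.update" event for the Voice Live API (Preview).
// Field names and type values are assumptions drawn from the preview and may change.
function buildSessionUpdate({ voiceName, instructions }) {
  return {
    type: "session.update",
    session: {
      instructions,
      // Built-in audio enhancements mentioned in the announcement
      input_audio_noise_reduction: { type: "azure_deep_noise_suppression" },
      input_audio_echo_cancellation: { type: "server_echo_cancellation" },
      // Voice used for the agent's replies (placeholder name supplied by the caller)
      voice: { name: voiceName, type: "azure-standard" },
    },
  };
}

// Example: serialize the event so it can be sent over the real-time connection.
const sessionUpdate = buildSessionUpdate({
  voiceName: "en-US-AvaNeural",
  instructions: "You are a helpful support agent.",
});
const payload = JSON.stringify(sessionUpdate);
```

You would typically send `payload` once, right after the connection to the Voice Live API opens and before forwarding any call audio.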
Final Thoughts
For tech-savvy developers eager to push voice AI boundaries, this release is a big deal. It simplifies building scalable, secure, and natural-sounding voice agents that can transform customer engagement. Stay tuned for the SDK launch and start imagining the future of voice-powered apps.
From the New blog articles in Microsoft Community Hub