The hyperscalers won AI compute. The opportunity that now sits on top of voice is still yours to take.
TL;DR
- Microsoft Teams Phone has 26 million voice users, about 6% of a 350 million+ Teams base.
- The rest of that base isn’t voice-enabled yet. Switching it on runs through a licensed provider, not a hyperscaler tariff.
- The hyperscalers won compute. Voice stays regulated, identity-bearing and in-country, and that’s where the provider sits.
- CAMARA and Open Gateway now reach 80% of global mobile connections. Voice AI runs on top of them.
Some people seem to think that the AI story in telecoms is settled. Hyperscalers will spend around $602 billion on AI infrastructure in 2026 and expect, once again, to own the majority of the revenue-driving stack. In this story, CSPs keep their heads above water, layer on AI functions they do not own, find a niche and survive another squeeze, all while faithfully delivering a commoditised but crucial layer.
It’s a convenient story, but when it comes to voice, it’s not quite holding up. With over a third of AI interactions already voice-based, a good chunk of customer interactions rely on numbers, identity, licensing, a live network and a growing list of other critical things that are barriers for anyone outside telco. Providers can turn these barriers into solid foundations that keep them ahead of the curve.
Microsoft Teams Phone is a perfect example. The Teams seats that aren’t yet voice-enabled aren’t a line on a hyperscaler tariff; switching them on needs a phone number, an identity layer and a regulator’s licence. Yes, the number of people who need connectivity in their favourite work app is growing, but for voice AI to run seamlessly on top of that connectivity, it has to work in tandem with this infrastructure. And who better to provide that revenue-capturing functionality than the providers themselves?
The hyperscalers won compute. Voice is a different question.
The compute question has been settled. The models, the generative APIs and the inference layer beneath them belong to the likes of Microsoft, Google and OpenAI. Voice doesn’t belong there; it sits on a different layer.
You can see the same shape in how operators are already deploying AI. AT&T, for example, runs more than 400 agents and 71 generative AI solutions on Azure OpenAI and the Microsoft Agent Framework for over 100,000 employees. Vodafone’s AI Concierge for SME customers runs on Google’s Gemini platform, live in Greece and Germany. In each case the hyperscaler brought the model, and the operator brought everything the model needs to reach a real phone: the numbers, the identity, the compliance and the live voice network.
The question worth asking is what the operator gets for owning just the voice enablement, and what it stands to gain by reaching further: into the billing relationship, the brand on the agent, and the ARPU that comes with both.
The voice AI stack, where you already sit and what you should own
The voice AI value chain has four layers. Service providers own the two closest to the call, and a third is now up for grabs.
- At the top sit compute and foundation models, the hyperscalers’ ground.
- Below that are the voice AI platforms and agent runtimes, where, for now at least, most providers partner up rather than build.
- Then comes voice orchestration, identity and network APIs: CAMARA and Open Gateway, STIR/SHAKEN attestation, numbering, lawful intercept, in-country residency.
- And at the foot of the stack is the call itself: SIP, PSTN termination, Operator Connect and Direct Routing.
Those last two layers, orchestration and the call, are the provider’s alone. They’re regulated, locally licensed and need carrier status in every country, so no hyperscaler can replicate them at scale on its own. Every AI agent that picks up a real phone call needs an attested number, a clean connection and in-country compliance underneath it. These layers are highly important, but they no longer drive massive amounts of revenue.
The layer CSPs should now look to move into is the third one up from the call: the voice AI platforms and agent runtimes. There you own the AI that sits on top of the voice orchestration and the infrastructure. First, though, you need to provide the foundational services that everything else is built on, natively inside Microsoft Teams.
Building revenue on Microsoft's platform, on your own ground
Teams is the perfect platform for frictionless access to the 94% of users who are not yet voice-enabled. Dstny Call2Teams Direct Routing and Call2Teams Operator Connect help you plug your voice directly into the Microsoft tenant. The customer keeps their numbers, their compliance posture and their operator relationship, and starts making calls natively in Teams. You keep the billing, the brand, the identity attestation, and the building block you can use to add services that multiply lifetime value.
FAQ
Can service providers really compete with hyperscalers on voice AI?
On compute and foundation models, no: the hyperscalers lead, and that’s the engine everyone uses. On the voice layer above the call, the structural position favours the provider. The hyperscalers will spend around $602 billion on AI infrastructure in 2026 and then partner with licensed operators to actually reach the phone. That’s where identity, numbering and in-country compliance live, and those are provider-native.
Why is only about 6% of Microsoft Teams voice-enabled today?
Microsoft passed 26 million Teams Phone voice users by the end of 2025 against a 350 million+ Teams base. The gap is wide because voice is harder to switch on than chat or meetings: it needs licensed numbers, in-country compliance and a carrier on the other end of every call. That’s also why it’s a runway rather than a ceiling: each un-voiced user is a seat a provider can activate through the Teams Phone partner model.
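The share is simple arithmetic, and a short sketch makes the sensitivity to the base visible. It uses only the two figures quoted above; because the base is given as "350 million+", 350 million is treated here as a floor, so the first result is an upper bound on the share. The function and variable names are illustrative, not taken from any Microsoft source.

```python
def voice_share(voice_users_m: float, total_users_m: float) -> float:
    """Return voice-enabled users as a percentage of the total user base."""
    if total_users_m <= 0:
        raise ValueError("total_users_m must be positive")
    return 100.0 * voice_users_m / total_users_m


VOICE_USERS_M = 26.0  # Teams Phone voice users, in millions (figure quoted above)
BASE_FLOOR_M = 350.0  # lower bound of the "350 million+" Teams base (assumed floor)

# Share of the base that is voice-enabled if the base is exactly at the floor.
share_at_floor = voice_share(VOICE_USERS_M, BASE_FLOOR_M)

# Size of the base, in millions, at which 26 million users would be exactly 6%.
base_for_six_percent = VOICE_USERS_M / 0.06

print(f"Share at the 350M floor: {share_at_floor:.1f}%")
print(f"Base implied by a 6% share: {base_for_six_percent:.0f}M")
```

At the 350 million floor the share works out to roughly 7%, and it falls towards 6% as the base grows beyond that floor. Either way, well over nine in ten Teams users remain un-voiced.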
How do CAMARA and Open Gateway enable voice AI?
GSMA Open Gateway now reaches 80% of global mobile connections, with CAMARA APIs live across dozens of markets. These APIs (number verify, SIM swap, location, device status and quality-on-demand) give a voice AI agent real network context, such as whether a caller’s SIM was swapped two hours ago. That context lives in the network, where the provider sits.
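As a rough illustration of how that network context might feed an agent’s decision, the sketch below flags a caller for extra verification when their SIM changed recently. It does not call CAMARA or Open Gateway: `NetworkContext`, `needs_step_up` and the 24-hour window are hypothetical names and values chosen for this example, and a real integration would fill in the timestamp from the operator’s SIM swap API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class NetworkContext:
    """Hypothetical container for network signals about the caller."""
    sim_last_swapped_at: Optional[datetime]  # None if no swap is on record


def needs_step_up(
    ctx: NetworkContext,
    now: datetime,
    window: timedelta = timedelta(hours=24),  # assumed risk window, not a standard
) -> bool:
    """Return True if the SIM changed within `window` before `now`."""
    if ctx.sim_last_swapped_at is None:
        return False
    # A timestamp slightly in the future (clock skew) also counts as recent.
    return now - ctx.sim_last_swapped_at <= window


now = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
recent = NetworkContext(sim_last_swapped_at=now - timedelta(hours=2))
stale = NetworkContext(sim_last_swapped_at=now - timedelta(days=30))
unknown = NetworkContext(sim_last_swapped_at=None)

print(needs_step_up(recent, now))   # SIM swapped two hours ago
print(needs_step_up(stale, now))    # SIM swapped a month ago
print(needs_step_up(unknown, now))  # no swap on record
```

Passing `now` in explicitly keeps the check deterministic and easy to test. The agent itself would decide what a step-up means in practice, for example an extra verification question before discussing account details.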
What ARPU upside can a provider expect from voice AI?
BCG sizes the voice AI market at €10–12 billion by 2029. A provider captures that ARPU when it owns the agent: white-labelled on its brand, billed by it, and running on its own voice orchestration.
How are operators using AI voice agents in 2026?
AT&T runs more than 400 agents and 71 generative AI solutions for over 100,000 employees on Azure OpenAI and the Microsoft Agent Framework. Vodafone’s AI Concierge runs on Google’s Gemini platform, live in Greece and Germany. KPN is piloting voice-to-voice agentic AI to cut average handling time. In every case the hyperscaler brought the model, and the operator brought the voice the model runs on, which is far more than just terminating the call.
Dstny is an Always-On Communications platform built for Service Providers across Europe. European by Design: voice data processed within EU jurisdiction, by a European company, governed by European law. Want to know how Dstny closes the voice sovereignty gap for your enterprise customers? Dstny Call2Teams