
LiveKit hits $1B valuation as voice AI accelerates
LiveKit’s $100M Series C values it at $1B, fueling real-time voice AI infrastructure behind ChatGPT-style agents as enterprises scale voice use.
TL;DR
LiveKit, the real-time voice/video infrastructure behind ChatGPT Voice Mode, raised $100M in a Series C led by Index Ventures, valuing it at $1B. Used by teams like Tesla and Salesforce, it’s betting 2026 is when voice agents go mainstream as enterprises shift from pilots to always-on calls—forcing builders to nail latency, reliability, and monitoring.
LiveKit becomes a $1B voice-AI infrastructure name
LiveKit has announced a $100 million Series C round that values the company at $1 billion, with Index Ventures leading and Salesforce Ventures and Hanabi Capital joining alongside existing backers Altimeter and Redpoint. This “unicorn moment” matters beyond fundraising headlines because LiveKit sits in the infrastructure layer that makes real-time voice and video experiences feel immediate, stable, and natural for end users.
The company’s momentum is tightly connected to the surge in voice AI, and LiveKit says it is building the core stack required for this new style of computing—one where conversations become the interface rather than clicks, forms, and menus. In the broader ecosystem, LiveKit is also widely associated with OpenAI’s ChatGPT Voice Mode work through its agents framework, which it describes as modeled on that effort and distributed at scale through open-source downloads.
From the perspective of the AI World organisation, the funding announcement is a strong signal that “voice-first” is moving from demos into durable enterprise roadmaps, which is exactly the kind of shift tracked across the AI World organisation events calendar and discussed on stage at the AI World Summit. This is also why the AI World Summit 2025/2026 conversation increasingly focuses on real-time AI systems (latency, reliability, orchestration, and monitoring) rather than model capability benchmarks alone.
Why voice AI is moving into production now
LiveKit argues that voice is the most natural interface people have, and that the industry has reached a point where interacting with computers through speech is becoming practical at scale. In its own timeline, the company notes that since its Series B announcement in April 2025, voice AI expanded from being a feature users noticed inside ChatGPT into thousands of applications spanning financial services, healthcare, retail, customer support, education, and robotics.
The most important transition is who is adopting the technology: LiveKit says large enterprises are actively evaluating and building voice agents to automate workflows, improve customer experiences, and open up new revenue opportunities. While it acknowledges many deployments are still proof-of-concept, it points to examples it describes as operating at real scale, including Salesforce’s Agentforce voice agents for customer support and Tesla using voice AI across sales, support, insurance, and roadside assistance.
LiveKit’s forward-looking claim is clear: it expects 2026 to be the year voice AI is broadly deployed across thousands of use cases worldwide. For the AI World organisation, that expectation lines up with why “production voice” is becoming a core track inside AI conferences by AI World, because the winners in voice won’t just be those with the smartest model output, but those with dependable real-time systems, safe workflows, and measurable business outcomes.
A new computing stack: realtime + stateful
One of the strongest ideas in LiveKit’s announcement is that voice AI apps behave differently from traditional web apps, which were built around HTTP and short, stateless requests that can be handled independently. LiveKit’s view is that voice breaks that assumption because a spoken conversation can run for minutes—or even hours—while the agent continuously listens, thinks, responds, and maintains context across the entire session.
That single difference changes almost everything “under the hood”: LiveKit says you can’t build, test, deploy, run, or monitor voice AI apps the same way you do conventional web services, because the full stack must be rebuilt for realtime and stateful interactions with “human-native” interfaces. In other words, voice AI is not simply an add-on UI layer; it is an architectural shift where reliability depends on orchestration, networking, and observability as much as it depends on the LLM.
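The contrast can be made concrete with a minimal sketch. This is not LiveKit's API; it is an illustrative toy showing the key architectural difference: where a stateless HTTP handler forgets everything between requests, a voice session object lives for the whole conversation and accumulates context across turns.

```python
from dataclasses import dataclass, field

@dataclass
class VoiceSession:
    # Context kept for the entire conversation, unlike a stateless HTTP handler
    history: list = field(default_factory=list)

    def handle_turn(self, utterance: str) -> str:
        # "listen": record the user's speech (an STT result in a real agent)
        self.history.append(("user", utterance))
        # "think": a real agent would run STT -> LLM -> TTS here;
        # this toy just counts how many user turns it has heard so far
        user_turns = sum(1 for role, _ in self.history if role == "user")
        reply = f"That is turn {user_turns} of our conversation."
        # "respond": the agent's reply also stays in the session context
        self.history.append(("agent", reply))
        return reply

session = VoiceSession()
session.handle_turn("Hi")
print(session.handle_turn("Book a demo"))  # -> "That is turn 2 of our conversation."
```

A real deployment adds streaming audio, interruption handling, and model calls, but the shape is the same: one long-lived object per call rather than one short-lived handler per request.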
This framing is especially relevant for the AI World Summit audience because many enterprise leaders are asking the same practical questions: how to keep conversations smooth under load, how to prevent latency spikes, how to handle interruption and turn-taking, and how to ensure the system stays safe and compliant while acting like a live representative. LiveKit’s announcement effectively reframes “voice AI” as a systems problem, which makes it a natural fit for deep-dive sessions and case studies at AI World organisation events.
Building, testing, deploying, and observing voice agents
LiveKit describes an “agent development lifecycle” with stages that look familiar—build, test/evaluate, deploy/run, and observe—but it emphasizes that each stage needs voice-specific tooling. On the build side, it positions voice agents as having both frontend and backend requirements, and says it offers client SDKs across platforms plus an agents framework that provides programmatic orchestration, model integrations, and conversational dynamics like turn detection and interruptions. It also highlights a path for faster experimentation by pointing to its Agent Builder for teams that want to start from templates and iterate without beginning entirely in code.
For testing and evaluation, LiveKit stresses that AI outputs are stochastic, so traditional deterministic assertions aren’t enough, and systems should be tested statistically in a way that resembles evaluating human performance. It says LiveKit Agents supports unit tests and can be connected with OpenTelemetry traces, and it references simulation-style testing—running many conversations across different prompts, languages, and voice settings—via partnerships with Bluejay, Hamming, and Roark.
Where voice becomes especially demanding is deploy and run: LiveKit argues that voice agents face unpredictable session lengths and sudden spikes in concurrent demand, so capacity planning, connection management, load balancing, and failover require a different mindset than classic web apps. It says it launched serverless agents to make deployment more turnkey, and it describes building a global network of data centers optimized for routing voice and video with low latency, claiming that the network handles billions of calls per year across web, mobile, and phone-call experiences.
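Why capacity planning changes so sharply can be shown with a back-of-the-envelope calculation using Little's law (concurrency = arrival rate × average session duration). The numbers below are illustrative assumptions, not LiveKit figures:

```python
def peak_concurrency(calls_per_second: float, avg_call_seconds: float) -> float:
    # Little's law: in-flight sessions = arrival rate x average duration
    return calls_per_second * avg_call_seconds

# Same arrival rate, wildly different concurrency:
web = peak_concurrency(calls_per_second=50, avg_call_seconds=0.1)    # ~100 ms request
voice = peak_concurrency(calls_per_second=50, avg_call_seconds=360)  # ~6 min call
print(f"web: {web:.0f} concurrent, voice: {voice:.0f} concurrent")
```

At the same request rate, 100 ms web requests mean about 5 connections in flight, while 6-minute calls mean 18,000 live sessions to hold open, balance, and fail over, which is why session-length-aware capacity planning matters.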
It also points to telephony partnerships designed to link directly to the PSTN, positioned as a way to reduce latency for phone-based voice agent conversations. Once a call is live, LiveKit notes that multiple models can be chained in each conversational turn—speech-to-text, turn/interrupt detection, the LLM, and text-to-speech—and it flags a real operational risk: model providers run in different regions, can backlog, and can experience outages that make conversations choppy and unreliable. To address that orchestration problem, it presents LiveKit Inference as an abstraction that routes inference across model providers and also mentions hosting models across its own data centers so inference can be colocated with deployed agents.
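The routing idea behind an inference abstraction can be sketched as a simple failover loop. This is not LiveKit Inference's actual interface; the provider functions and the `route()` helper are hypothetical, showing only the pattern: try providers in priority order and fall through when one is down or backlogged.

```python
class ProviderDown(Exception):
    """Raised when a model provider is unavailable or backlogged."""

def flaky_provider(prompt: str) -> str:
    raise ProviderDown("regional outage")  # simulate a provider outage

def healthy_provider(prompt: str) -> str:
    return f"response to: {prompt}"

def route(prompt: str, providers) -> str:
    # Try each provider in priority order; fail over on provider errors
    last_err = None
    for provider in providers:
        try:
            return provider(prompt)  # first healthy provider wins
        except ProviderDown as err:
            last_err = err  # remember the failure, try the next provider
    raise RuntimeError("all providers unavailable") from last_err

print(route("hello", [flaky_provider, healthy_provider]))  # -> "response to: hello"
```

A production router would also weigh latency, region, and queue depth per provider, but the core contract is the same: the conversation keeps flowing even when one provider does not.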
Finally, on observability, LiveKit says voice agents historically lacked a “Datadog-equivalent” that lets teams understand what happened on a live call, how the agent interpreted inputs, how long responses took, and whether the agent followed the intended workflow. It argues that session replays, traces, time-aligned transcripts, and error logs become essential feedback loops so developers can improve agent behavior and iterate.
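What such a feedback loop might capture can be sketched as a time-aligned trace. This is an illustrative data shape, not a real LiveKit feature: each turn records who spoke, what was said, and how long the model pipeline took, which is the raw material for session replay and latency debugging.

```python
import json

def log_event(trace, t_ms, role, text, pipeline_ms=0):
    # One entry per turn: session-relative time, speaker, transcript,
    # and how long the STT -> LLM -> TTS pipeline took to produce it
    trace.append({"t_ms": t_ms, "role": role, "text": text, "pipeline_ms": pipeline_ms})

trace = []
log_event(trace, 0, "user", "Where is my order?")
log_event(trace, 480, "agent", "It ships tomorrow.", pipeline_ms=480)

# Flag turns where the response pipeline exceeded a latency budget
slow_turns = [e for e in trace if e["pipeline_ms"] > 300]
print(f"{len(slow_turns)} slow turn(s):", json.dumps(slow_turns))
```

From records like these, a dashboard can reconstruct the call timeline, surface latency spikes, and verify the agent followed the intended workflow.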
From the AI World organisation standpoint, this lifecycle framing is highly actionable for enterprise leaders attending the AI World Summit because it breaks “voice AI” into operational checkpoints that can be governed, measured, and improved—exactly the kind of maturity model that makes production deployments safer and more successful. That is also why AI conferences by AI World are increasingly expected to host not just visionary keynotes but practitioner sessions on reliability, evaluation, and observability.
What this signals for AI World Summit 2025/2026
LiveKit positions voice as one of the biggest paradigm shifts in computing and suggests the earliest large-scale adoption will happen in places where voice is already the interface—phone calls, cars, and smart speakers—before voice-native apps become the default as models improve in turn-taking, tool use, and reliability. It describes its mission as building the development stack and runtime between foundation models and end-user applications, with the goal of making voice AI as easy to build and scale as the web.
For the AI World organisation, this is an important narrative to carry into AI World Summit programming because it connects three themes that business leaders care about: customer experience, operational efficiency, and new revenue opportunities—all delivered through a conversational interface. The practical implication for AI World organisation events is that “voice” should be treated as a full product and infrastructure track, not a minor feature discussion, especially as 2026 deployment expectations rise.
If the AI World Summit 2025/2026 agenda is built around what is truly changing inside enterprises, then LiveKit’s Series C story fits neatly: it highlights that the next wave of AI adoption will be shaped as much by real-time infrastructure, testing discipline, and observability as by model intelligence. In that context, the AI World organisation can position upcoming sessions and networking as the place where builders, operators, and business owners align on how to ship voice agents responsibly—making the AI World Summit and AI conferences by AI World relevant to both technology and leadership audiences.