
The $11B Bet That Voice Will Replace Everything | Mati Staniszewski x Nikhil Kamath | WTF Online

Can voice replace the screen? ElevenLabs co-founder Mati Staniszewski joins Nikhil Kamath on WTF Online to discuss the $11B bet on voice AI, the future of human-computer interaction, and the hardware challenge of making voice agents our primary digital interface.

, and Jax

March 13, 2026, 1:09 AM


Of all the frontiers artificial intelligence is opening, voice may have the most potential to change how people interact with computers. Mati Staniszewski, co-founder of ElevenLabs, recently joined Nikhil Kamath on the WTF Online podcast to discuss why he sees voice as the next major shift in how we navigate the digital world, and why it might eventually replace traditional screen-based interfaces.

Key Takeaways

Voice as the Next Interface: Voice is positioned to become the primary way humans interact with technology, moving computing into the background and allowing for more immersive, natural experiences.
The Hardware Challenge: While foundational AI models are maturing rapidly, the "holy grail" remains finding the right form factor (smart headphones, pendants, or other wearables) to make voice agents useful everywhere.
Domain-Specific AI: Entrepreneurs are finding the most success by combining powerful audio models with deep domain expertise in industries like automotive, healthcare, and e-commerce.
Authenticity and Trust: As AI-generated content grows, building platforms that prioritize human verification and authentic interaction will be critical to breaking through the digital noise.

The Shift Toward Voice-Native Experiences

For decades, human-computer interaction has been dominated by keyboards and touchscreens. Staniszewski argues that this is fundamentally unnatural. We are hard-wired to communicate through speech, yet our technology forces us to stop, look down at a screen, and manually input information.

Building the "Jarvis" of Tomorrow

The goal for many in the AI field is to move technology into the background. Imagine a device that understands tone, emotion, and context, providing real-time assistance without the friction of a display. Staniszewski notes that we are approaching "Iron Man" levels of AI, where a voice assistant could manage our schedules, translate foreign languages in real time, and act as a reliable repository of our personal and professional knowledge.

"The most exciting part is: could you have the technology kind of fold into the background, the phone goes back into the pocket, and you kind of immerse yourself in the world around you?"

Essential Components for Voice Dominance

To reach the level of mass adoption where voice replaces the smartphone, three distinct barriers must be cleared. First, the foundational technology must reach human-level quality; the AI must understand interruptions, display appropriate intonation, and possess high intelligence. Second, knowledge access is paramount—the assistant must have a memory of past conversations and deep integration with personal or corporate data.

Finally, there is the form factor challenge. While the phone is currently the default, Staniszewski believes we are moving toward a multi-device future. Smart headphones, discreet wearables like pendants, or even future neural interfaces will likely work in concert to provide a seamless, ever-present voice experience.

The Entrepreneurial Opportunity

For aspiring founders, Staniszewski cautions against trying to compete directly with massive foundational model companies. Instead, the greatest value lies in the "agentic layer." This involves taking existing models and applying them to specific, traditional industries that have historically lagged in innovation.

Vertical Integration

The automotive, healthcare, and e-commerce sectors offer fertile ground. By building a voice agent that specifically understands the technical jargon of a hospital or the inventory logistics of a car manufacturer, entrepreneurs can build defensible moats that general-purpose models cannot easily replicate. The value lies not just in the voice generation itself, but in the integrations and trust built within those specific workflows.

Authenticity and the Future of Social Connection

The conversation also pivoted to the current state of social media, which many believe has become trapped in an algorithmic cycle of negativity. Staniszewski and Kamath discussed the possibility of building platforms that prioritize genuine connection and curiosity over knee-jerk emotional reactions.

"I don't think there is a place or an ecosystem today where you can have that conversation, so we hope to kind of create it."

The future of social media may not look like the static, text-and-image timelines of today. Instead, it could become an interactive companion where voice plays a central role. By incorporating AI that summarizes interesting developments and facilitates nuanced, multilingual discourse, the next generation of social platforms could foster deeper human connections rather than driving polarization through engagement-baiting algorithms.

Conclusion

The promise of voice technology is not merely about convenience; it is about reclaiming the natural rhythm of human communication. Whether it involves re-imagining how we learn, how we conduct business, or how we relate to one another, the transition toward voice-first interfaces appears inevitable. As foundational models continue to improve, the entrepreneurs who succeed will be those who bridge the gap between abstract AI capabilities and the practical needs of users, ultimately humanizing the technology we use every day.
