Voice AI platform Phonic gets backing from Lux

The quality of AI-generated voices is good enough for things like creating audiobooks and podcasts, having articles read aloud to you, and basic customer support. But many businesses don’t think AI voice tech is quite reliable enough to deploy.

That’s why two MIT grads, Moin Nadeem and Nikhil Murthy, founded Phonic, a company offering an end-to-end voice stack to increase synthetic voice reliability while decreasing latency.

Nadeem and Murthy met at MIT and have known each other for more than seven years. When the duo started building Phonic last year, they felt there weren’t many companies crafting complete voice tech solutions.

“Voice AI is at a place where you tie up different parts, such as automatic voice recognition [and] text-to-speech, and [then integrate] intelligence,” Murthy told TechCrunch. “However, when we talked to actual customers, we found that there is a lack of [solutions] that [are] reliable at scale.”

Nadeem, who previously worked at MosaicML, a company Databricks acquired for $1.3 billion in 2023, said that many companies building in the voice AI space (e.g., Vapi, Rounded) are creating workflows that piece together separate AI models.

Phonic takes a different approach: It trains its models in-house end-to-end. Murthy said that there are a few advantages to this.

“Owning the models allows us to deeply integrate some […] reliability pieces into the [models themselves],” he said. “If you don’t own that layer […] you’re just tying disparate pieces that don’t really fit seamlessly.”

Murthy added that Phonic’s method also allows the company to host and run models cost-efficiently. He claims that Phonic trains its models on a range of recordings, including recordings of accented and muffled speech, to make the models highly robust.

Phonic is currently working with a limited set of partners, including companies in the insurance and healthcare spaces, but plans to launch its product broadly in a few months. Soon, prospective clients will be able to try out Phonic’s tech from its website, Nadeem said.

Phonic has raised $4 million in a seed round led by Lux Capital with participation from Replit co-founder Amjad Masad, Hugging Face co-founder Clem Delangue, Applied Intuition co-founder Qasar Younis, and Modal Labs founder Erik Bernhardsson.

Grace Isford, a partner at Lux Capital, said that the company’s in-house way of training models was appealing to the investment firm.

“We think both Moin and Nikhil are incredible technologists,” she said. “They founded [a] machine learning club at MIT. And they have worked on training models for a while now. Plus, their approach of combining diffusion and proprietary models in the voice AI sector is novel.”
