By Paula Parisi, April 25, 2023
Auto-GPT, an open-source app that uses OpenAI’s text-generating models, is currently generating a great deal of social media attention. The program can act somewhat autonomously in that it creates its own feedback loop, asking itself a series of questions to help build a more nuanced and complete response to a text prompt. In short, information that would take a user multiple prompts to draw out of ChatGPT could be obtained with a single request to Auto-GPT, which could independently explore a subject before returning a comprehensive response.
Continue reading Auto-GPT Generates Social Sizzle, Ushers in Era of AI Agents
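The self-questioning feedback loop described in this item can be illustrated with a minimal sketch. This is not Auto-GPT's actual code: `run_agent`, `ask_model`, and `fake_model` are hypothetical names, and the model call is replaced by a stand-in function so the example runs without any external service.

```python
from typing import Callable, List


def run_agent(goal: str, ask_model: Callable[[str], str], max_steps: int = 3) -> str:
    """Hypothetical self-prompting loop: one user request, several internal prompts."""
    notes: List[str] = []
    question = f"What is the first thing to find out about: {goal}?"
    for _ in range(max_steps):
        # Answer the current question and keep the answer as a note.
        notes.append(ask_model(question))
        # Ask the model itself which question should come next.
        question = ask_model(f"Notes so far: {notes}. What should be asked next about: {goal}?")
    # Fold everything gathered into a single final response.
    return ask_model(f"Write a complete answer for '{goal}' using: {notes}")


def fake_model(prompt: str) -> str:
    # Stand-in for a real text-generation call (an assumption for illustration).
    return f"[model reply to a {len(prompt)}-character prompt]"


if __name__ == "__main__":
    print(run_agent("history of text-to-speech", fake_model, max_steps=2))
```

With `max_steps=2`, the single user request triggers five model calls: two answers, two follow-up questions, and one final summary.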
By Debra Kaufman, May 22, 2020
Facebook introduced an AI text-to-speech (TTS) system that produces a second of audio in 500 milliseconds. According to Facebook, the system, paired with a new approach to data collection, made it possible to create a British-accented voice in six months, versus more than a year for other voices. The TTS system is now used in Facebook’s Portal smart display brand. It can be hosted in real time on ordinary processors and is also available as a service for other apps, including Facebook’s VR.
Continue reading Facebook Reveals New AI-Powered Text-to-Speech System
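The speed figure in this item (one second of audio in 500 milliseconds) can be expressed as a real-time factor, a commonly used speech-synthesis metric that the item itself does not name. The sketch below is only a worked calculation; the function name is an assumption chosen for illustration.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Time spent synthesizing divided by the duration of the audio produced."""
    return synthesis_seconds / audio_seconds


# Figures from the item: 500 milliseconds of compute per 1 second of audio.
rtf = real_time_factor(synthesis_seconds=0.5, audio_seconds=1.0)
print(rtf)        # 0.5
print(rtf < 1.0)  # True
```

A factor below 1.0 means speech is generated faster than it plays back, which is what makes real-time hosting possible.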
By Debra Kaufman, October 2, 2019
Both IBM and Google recently advanced the development of text-to-speech (TTS) systems that create high-quality digital speech. OpenAI found that, since 2012, the compute power needed to train the largest AI models has grown more than 300,000-fold. IBM created a much less compute-intensive model for speech synthesis, stating that it can synthesize speech in real time and adapt to new speaking styles with little data. Google and Imperial College London created a generative adversarial network (GAN) that produces high-quality synthetic speech.
Continue reading Google and IBM Create Advanced Text-to-Speech Systems
By Debra Kaufman, November 27, 2018
Amazon is training Alexa to speak like a newscaster, a feature that will roll out in a few weeks. The new speaking style is based on Amazon’s neural text-to-speech (NTTS) developments. The new voice style doesn’t sound human, but it does stress words as a TV or radio announcer would. Before creating this voice, Amazon conducted a survey showing that users prefer this newscaster style when listening to articles. The new voice is also an example of “the next generation of speech synthesis,” based on machine learning.
Continue reading New Alexa Speaking Style Created by Neural Text-to-Speech
By Debra Kaufman, November 7, 2016
Adobe Research and Princeton University are collaborating on software that acts like Photoshop for audio, including the ability to add words not found in the original audio file. Adobe developer Zeyu Jin, who spoke at the Adobe MAX conference, described the would-be product, codenamed Project VoCo, as a “sneak peek.” Project VoCo is intended to be an audio editing application with more typical speech editing and noise cancellation features, but the Photoshop-like tool also raises potential ethical issues regarding the use of doctored audio clips.
Continue reading Adobe Project VoCo Audio Editor Offers Photoshop-Like Tools