By
Paula Parisi, September 12, 2024
Amazon is aiming to speed up production of its Audible audiobooks by inviting a small group of narrators to clone their voices using generative artificial intelligence. The U.S. beta test will roll out later this year, according to Amazon, which announced the move on Audible’s creator marketplace. “There is a vast catalog of books that does not yet exist in audio and as we explore ways to bring more books to life on Audible, we’re committed to thoughtfully balancing the interests of authors, narrators, publishers, and listeners,” Amazon explains. Continue reading Amazon Is Inviting Audible Narrators to Create AI Voice Clones
By
Paula Parisi, September 10, 2024
YouTube is introducing AI detection tools designed to allow people to learn when their face and/or voice are copied and used in third-party videos. As part of the effort, YouTube’s existing Content ID program that protects copyrighted music will expand to include more broad-based voice simulation detection technology. The new tools aim to protect “people from a variety of industries — from creators and actors to musicians and athletes,” according to the company. The Google-owned platform is also coming up with a way to address unauthorized use of its content for training AI models. Continue reading YouTube Adding Tools to Protect Against Unauthorized AI Use
By
Paula Parisi, September 10, 2024
During the 10th annual Roblox Developers Conference (RDC 2024) in San Jose, the gaming platform announced it is opening to global currencies in addition to its own Robux, which generates billions in virtual transactions each year. Starting later this year, a small test bed of developers will be able to charge real money for paid-access games, with a program expected to open “to all eligible creators by mid-2025.” The massively multiplayer online platform that lets users build online game worlds also discussed a project to develop its own AI foundation model to power generative 3D creation on the platform. Continue reading Roblox Adds Real Currency, Teases Its Coming Generative AI
By
Paula Parisi, September 4, 2024
An AI-powered coding app called Cursor is building a fanbase, with everyone from hobbyists to engineers subscribing to the service. The platform reportedly has 30,000 paying customers, among them employees at OpenAI, Midjourney and Perplexity. Referred to as “the ChatGPT of coding,” Cursor uses popular models including GPT-4o and Claude 3.5 Sonnet to automate building apps and other coding tasks. Cursor was launched by two-year-old startup Anysphere, which has raised more than $60 million in Series A funding led by Andreessen Horowitz and Thrive Capital. Continue reading New AI Coding App Cursor Gains Following and $60M in Funds
By
Paula Parisi, August 30, 2024
Nvidia has had another impressive quarter. Record revenue of $30 billion in Q2 was up 122 percent from a year ago, while data center revenue of $26.3 billion marked a 154 percent increase from the same period in 2023. The performance was seen by many as an assurance of AI’s staying power, although others raised concern that if the AI companies buying chips do not start generating profits soon, the sugar high of the two-year AI boom could precede a crash. Nvidia took the occasion to tout its next-generation Blackwell chips, reassuring investors that a mid-production “tweak” would not delay release. Continue reading AI Boom Continues to Drive Strong Nvidia Revenue and Profit
By
Paula Parisi, August 30, 2024
Google is giving Gemini Advanced, Enterprise and Business subscribers the ability to create personalized AI assistants, which the company calls “Gems.” “Create your own personal AI experts on any topic you want,” the Alphabet company says. The search giant is also reintroducing Gemini’s image generation capabilities with its latest Imagen 3 model, which will be available to everyone. Gemini, which is Google’s ChatGPT competitor, will again have the ability to generate images of people, something Google disabled in February after controversy over some of the images. The company announced it has implemented new guardrails. Continue reading Gemini Gets Custom Gems AI Assistants and Adds Imagen 3
By
Paula Parisi, August 29, 2024
In a move toward increased transparency, San Francisco-based AI startup Anthropic has published the system prompts for three of its most recent large language models: Claude 3 Opus, Claude 3.5 Sonnet and Claude 3 Haiku. The information is now available on the web and in the Claude iOS and Android apps. The prompts are instruction sets that reveal what the models can and cannot do. Anthropic says it will regularly update the information, emphasizing that evolving system prompts do not affect the API. Examples of Claude’s prompts include “Claude cannot open URLs, links, or videos” and, when dealing with images, “avoid identifying or naming any humans.” Continue reading Anthropic Publishes Claude Prompts, Sharing How AI ‘Thinks’
By
Paula Parisi, August 29, 2024
New York-based ElevenLabs is going global with its generative AI text-to-speech reader app, which can narrate writings in 32 languages with thousands of voices from which to choose. The audio startup promises “high quality, human-like” AI voices that are “emotionally and contextually aware,” adapting delivery of written cues “to achieve a high emotional range.” ElevenLabs has focused on “creative workflow,” with a voice isolator and audio effects generator tools. Its catalog includes the voices of celebrities Judy Garland, Laurence Olivier, James Dean and Burt Reynolds. Custom models for translation and voiceover work using contemporary actors are a future possibility. Continue reading ElevenLabs Reader App Is Available Globally in 32 Languages
By
Paula Parisi, August 28, 2024
Adobe, OpenAI and Microsoft are among the major firms backing a California bill that would require tech companies to label AI-generated content with watermarks embedded in the metadata. Such data is easily accessible via browser for material circulated on the Internet, and the initiative would likely involve a campaign to educate the general public on how to find it. The proposed law encompasses video and audio as well as images. The three companies currently supporting the bill initially opposed it, using terms like “unworkable” and “overly burdensome.” Continue reading Bill Mandating GenAI Watermarks Gains Support in California
By
Paula Parisi, August 27, 2024
Dropbox has purchased Reclaim.ai, a scheduling tool that uses artificial intelligence to boost productivity and is popular with Google Calendar users. The privately held Reclaim announced the deal in a blog post that claims a global user base of over 43,000 companies and more than 320,000 people. Launched in 2019, Reclaim counts Index Ventures and Calendly among its investors and has raised more than $9.5 million to date. File-sharing app Dropbox has been publicly traded since 2018 and has a current market cap of $7.92 billion. Financial terms of the deal have yet to be disclosed. Continue reading Dropbox Acquires Productivity and Scheduling App Reclaim.ai
By
Paula Parisi, August 23, 2024
SAG-AFTRA announced it is teaming with online talent marketplace Narrativ to provide the guild’s 160,000 members with the option of working with the New York-based AI startup to license their voice replicas for use in digital audio advertising. The deal would make it easy for voice actors to be considered for replicant work and get compensated, according to SAG-AFTRA, which emphasizes that performers will control the particulars, including whether to make their voices available, brand approval and fees. Narrativ also represents visual likenesses, but the SAG-AFTRA announcement is limited to voice work. Continue reading SAG-AFTRA Strikes a Deal with Narrativ for AI Voice Replicas
By
Paula Parisi, August 23, 2024
D-ID, a platform that uses AI to generate digital humans, has announced the general availability of D-ID Video Translate. The tool lets businesses and content creators automatically re-voice videos in multiple languages, “cloning the speaker’s voice and adapting their lip movements from a single upload.” D-ID is making the Video Translate tool, which accommodates 30 different languages, free to D-ID subscribers for a limited time, available through the D-ID Studio or the company’s API. Languages include Arabic, Mandarin, Japanese, Hindi and Ukrainian, in addition to Spanish, German, French and Italian. Users can simultaneously translate content using bulk translation. Continue reading D-ID Employs AI to Translate Videos into Multiple Languages
By
Paula Parisi, August 22, 2024
Google DeepMind has made its latest AI image generator, Imagen 3, free for use in the U.S. via the company’s ImageFX platform. Imagen 3 will be available in multiple versions, “each optimized for different types of tasks, from generating quick sketches to high-resolution images.” Google announced Imagen 3 at Google I/O in May, and in June made it available to enterprise users through Vertex. Using simplified natural language text input rather than “complex prompt engineering,” Google says Imagen 3 generates high-quality images in a range of styles, from photorealistic, painterly and textured to whimsically cartoony. Continue reading Google DeepMind Releases Imagen 3 for Free to U.S. Users
By
Paula Parisi, August 20, 2024
ByteDance has debuted a text-to-video mobile app in its native China that is available on the company’s TikTok equivalent there, Douyin. The app is called Jimeng AI, and there is speculation that it will be coming to North America and Europe soon via TikTok or ByteDance’s CapCut editing tool, possibly beating competing U.S. technologies like OpenAI’s Sora to market. Jimeng (translation: “dream”) uses text prompts to generate short videos. For now, its responsiveness is limited to prompts written in Chinese. In addition to entertainment, the app is described as applicable to education, marketing and other purposes. Continue reading ByteDance Intros Jimeng AI Text-to-Video Generator in China
By
Paula Parisi, August 19, 2024
Grok-2 and Grok-2 mini, the latest generative chatbots from Elon Musk’s xAI, create images with seemingly few guardrails. Early pictures of notable personalities such as Bill Gates, Donald Trump and Kamala Harris in questionable or compromising settings may not appear photorealistic to a trained eye, but they are still described in many cases as quite realistic. Powered by the FLUX.1 AI model from Black Forest Labs, Grok-2 and Grok-2 mini are available in beta on the X social platform for Premium and Premium+ subscribers and will be coming to xAI’s enterprise API later this month, according to the company. Continue reading xAI’s Grok-2 Generates Realistic Images with Few Guardrails