By Paula Parisi, May 29, 2024
Music startup Suno, which leverages ChatGPT tech with the goal of emulating that app’s success in music, has raised $125 million in Series B funding, resulting in a valuation of $500 million. Founded by Harvard physics PhD turned tech entrepreneur Mikey Shulman, the company is being called “a rising star” in the realm of generative AI. Suno lets people generate original songs by using text prompts or lyrics, with the AI supplying the melodies and harmonies for fully-formed compositions. “We started Suno to build a future where anyone can make music,” according to the company. Continue reading AI Startup Suno Raises Funds to ‘Democratize Music Creation’
By Paula Parisi, May 28, 2024
Meta Platforms has unveiled its first natively multimodal model, Chameleon, which observers say can make it competitive with frontier model firms. Although Chameleon is not yet released, Meta says internal research indicates it outperforms the company’s own Llama 2 in text-only tasks and “matches or exceeds the performance of much larger models” including Google’s Gemini Pro and OpenAI’s GPT-4V in a mixed-modal generation evaluation “where either the prompt or outputs contain mixed sequences of both images and text.” In addition, Meta calls Chameleon’s image generation “non-trivial,” noting that’s “all in a single model.” Continue reading Meta Advances Multimodal Model Architecture with Chameleon
By Paula Parisi, May 16, 2024
Google has infused search with more Gemini AI, adding expanded AI Overviews and more planning and research capabilities. “Ask whatever’s on your mind or whatever you need to get done — from researching to planning to brainstorming — and Google will take care of the legwork,” culling from “a knowledge base of billions of facts about people, places and things,” explained Google and Alphabet CEO Sundar Pichai at the Google I/O developer conference. AI Overviews will roll out to all U.S. users this week. Coming soon are customizable AI Overview options that can simplify language or add more detail. Continue reading Google Ups AI Quotient with Search-Optimized Gemini Model
Meta Platforms announced an expanded collection of generative AI features, tools and services for advertisers and businesses. The enhanced AI features include full image and text generation, text overlay capabilities, and image expansion for Reels and the Feed in Facebook and Instagram. The updated tools will be available via Meta Ads Manager through Advantage+ creative. According to Meta: “Our goal is to help you at every step of your journey, whether that’s improving ad performance by helping you develop creative variations, automating certain parts of the ad creation process, or increasing your credibility and engagement through Meta Verified.” Continue reading Meta Launches Enhanced Generative AI Tools for Advertisers
By ETCentric Staff, April 24, 2024
Adobe plans to add generative AI capabilities to its Premiere Pro editing platform and is exploring the update with third-party AI technologies including OpenAI’s Sora, as well as models from Runway and Pika Labs, making it easier “to draw on the strengths of different models” within everyday workflows, according to Adobe. Editors will gain the ability to generate and add objects into scenes or shots, remove unwanted elements with a click, and even extend frames and footage length. The company is also developing a video model for its own Firefly AI for video and audio work in Premiere Pro. Continue reading Adobe Considers Sora, Pika and Runway AI for Premiere Pro
By ETCentric Staff, April 18, 2024
Airchat is the latest app to take tech leaders in Silicon Valley by storm. Described as a “combination of voice notes and Twitter,” Airchat lets you follow other users and scroll through posts — adding replies, likes and shares — but the twist is the content is generated through audio recordings the app then transcribes. Airchat ranked 27th on the App Store’s social networking chart, even though users must be invited to join. Launched last year by Naval Ravikant, founder of AngelList, and erstwhile Tinder product exec Brian Norgard, Airchat was just relaunched on iOS and Android. Continue reading Audio-First Social Platform Airchat Has Successful Relaunch
By ETCentric Staff, April 11, 2024
Google is moving its most powerful artificial intelligence model, Gemini 1.5 Pro, into public preview for developers and Google Cloud customers. Gemini 1.5 Pro includes what Google claims is a breakthrough in long context understanding, with the ability to process up to 1 million tokens of information, “opening up new possibilities for enterprises to create, discover and build using AI.” Gemini’s multimodal capabilities allow it to process audio, video, text, code and more, which, when combined with long context, “enables enterprises to do things that just weren’t possible with AI before,” according to Google. Continue reading Google Offers Public Preview of Gemini Pro for Cloud Clients
By ETCentric Staff, April 5, 2024
OpenAI has updated the editor for DALL-E, the artificial intelligence image generator that is part of the ChatGPT premium tiers. The update, based on the DALL-E 3 model, makes it easier for users to adjust their generated images. Shortly after DALL-E 3’s September 2023 debut, OpenAI integrated it into ChatGPT, enabling paid subscribers to generate images from text or image prompts. The new DALL-E editor interface lets users edit images “by selecting an area of the image to edit and describing your changes in chat.” Desired changes can also be prompted “in the conversation panel,” without using the selection tool, according to OpenAI. Continue reading OpenAI Integrates New Image Editor for DALL-E into ChatGPT
By ETCentric Staff, March 21, 2024
Deepgram’s new Aura software turns text into generative audio with a “human-like voice.” The 9-year-old voice recognition company has raised nearly $86 million to date on the strength of its Voice AI platform. Aura is an extremely low-latency text-to-speech voice AI that can be used for voice AI agents, the company says. Paired with Deepgram’s Nova-2 speech-to-text API, developers can use it to “easily (and quickly) exchange real-time information between humans and LLMs to build responsive, high-throughput AI agents and conversational AI applications,” according to Deepgram. Continue reading Deepgram’s Speech Portfolio Now Includes Human-Like Aura
By ETCentric Staff, March 19, 2024
Apple researchers have gone public with new multimodal methods for training large language models using both text and images. The results are said to enable AI systems that are more powerful and flexible, which could have significant ramifications for future Apple products. These new models, which Apple calls MM1, support up to 30 billion parameters. The researchers identify multimodal large language models (MLLMs) as “the next frontier in foundation models,” which exceed the performance of LLMs and “excel at tasks like image captioning, visual question answering and natural language inference.” Continue reading Apple Unveils Progress in Multimodal Large Language Models
By ETCentric Staff, March 19, 2024
Elon Musk’s xAI has released its Grok chatbot and open-sourced part of the underlying Grok-1 model architecture for any developer or entrepreneur to use for purposes including commercial applications. Musk unveiled Grok in November and announced that it would be publicly released this month. The chatbot itself is available to X social premium members, who can ask the cheeky AI questions and get answers with a snarky attitude inspired by “The Hitchhiker’s Guide to the Galaxy” sci-fi novel. The training for Grok’s foundation LLM is said to include X social posts. Continue reading Grok-1 Architecture Open-Sourced for General Release by xAI
By ETCentric Staff, March 8, 2024
London-based AI video startup Haiper has emerged from stealth mode with $13.8 million in seed funding and a platform that generates up to two seconds of HD video from text prompts or images. Founded by alumni from Google DeepMind, TikTok and various academic research labs, Haiper is built around a bespoke foundation model that aims to serve the needs of the creative community while the company pursues a path to artificial general intelligence (AGI). Haiper is offering a free trial of what is currently a web-based user interface similar to offerings from Runway and Pika. Continue reading AI Video Startup Haiper Announces Funding and Plans for AGI
By Paula Parisi, December 18, 2023
Snapchat+ is rolling out new artificial intelligence features that let subscribers use text prompts to create generative AI images to share with friends. In addition, the Dreams feature, which creates generative AI selfies, is now able to add your friends to those photos. Snapchat+ subscribers get one pack of 8 Dreams per month as part of their $3.99 monthly fee. An onscreen button labeled “AI” lets subscribers access the AI image generator to choose from a menu of prompts (including “sunny day at the beach” and “planet made of cheese”) or they can enter their own descriptions. Continue reading GenAI Lets Snapchat+ Subscribers Create and Share Images
By Paula Parisi, November 15, 2023
Threads, the Twitter competitor launched in July by Meta Platforms to record-breaking numbers, has added features that make it easier for users to separate their Threads feeds from Instagram and Facebook. Users can now delete their Threads accounts separately from Instagram, something that was previously impossible and confounded users. Because those signing up for Threads were required to do so from either an existing or a new Instagram account, the two were entwined. Instagram/Threads CEO Adam Mosseri also announced that propagation of Threads posts to Instagram and Facebook can now be turned off, to keep discussions separate. Continue reading Threads Lets Users Delete Accounts Separate from Instagram
By Paula Parisi, November 15, 2023
Meta Platforms-owned instant messaging and VoIP service WhatsApp has updated its Voice Chat feature for mobile so it can now host group calls of up to 128 participants. Voice chats allow WhatsApp users to instantly talk live with members of a group chat while still being able to message within the group. The new feature, which is being compared to a Discord server, is being rolled out globally. The idea is for Voice Chat to be less disruptive than group calling, which rings all group members. Voice chats can be started quietly, with an in-chat bubble that users tap to join. The updated version will have end-to-end encryption by default. Continue reading Meta’s WhatsApp Launches Voice Chat for Up to 128 People