Veo 2 Is Unveiled Weeks After Google Debuted Veo in Preview

Attempting to stay ahead of OpenAI in the generative video race, Google announced Veo 2, which it says can output clips of two minutes-plus at 4K resolution (4096 x 2160 pixels). Competitor Sora can generate video of up to 20 seconds at 1080p. However, TechCrunch says Veo 2’s supremacy is “theoretical” since it is currently available only through Google Labs’ experimental VideoFX platform, which is limited to videos of up to eight seconds at 720p. VideoFX is also waitlisted, but Google says it will expand access this week (with no comment on raising the cap). Continue reading Veo 2 Is Unveiled Weeks After Google Debuted Veo in Preview

Ray-Ban Meta Gets Live AI, RT Language Translation, Shazam

Meta has added new features to its Ray-Ban Meta smart glasses in time for the holidays via a firmware update that makes them “the gift that keeps on giving,” per Meta marketing. “Live AI” adds computer vision, letting Meta AI see and record what you see “and converse with you more naturally than ever before.” Along with Live AI, Live Translation is available for Meta Early Access members: Spanish, French or Italian speech is translated into English (or vice versa) in real time through the glasses’ open-ear speakers. In addition, Shazam support has been added for users interested in easily identifying songs. Continue reading Ray-Ban Meta Gets Live AI, RT Language Translation, Shazam

Twelve Labs Creating AI That Can Search and Analyze Video

Twelve Labs has raised $30 million in funding for its efforts to train video-analyzing models. The San Francisco-based company has received strategic investments from notable enterprise infrastructure providers Databricks and SK Telecom as well as Snowflake Ventures and HubSpot Ventures. Twelve Labs targets customers using video across a variety of fields including media and entertainment, professional sports leagues, content creators and business users. The funding coincides with the release of Twelve Labs’ new video foundation model, Marengo 2.7, which applies a multi-vector approach to video understanding. Continue reading Twelve Labs Creating AI That Can Search and Analyze Video

Pika 2.0 Video Generator Adds Character Integration, Objects

Pika Labs has updated its generative video model to Pika 2.0, adding more user control and customizability, the company says. Improvements include better “text alignment,” making it easier for the AI to follow through on intricate prompts. Enhanced motion rendering is said to deliver more “naturalistic movement” and better physics, including greater believability in transformations that tend toward the surreal, which has typically been a challenge for genAI tools. The biggest change may be “Scene Ingredients,” which lets users add their own images when building Pika-generated videos. Continue reading Pika 2.0 Video Generator Adds Character Integration, Objects

Grok-2 Chatbot Is Now Available Free to All Users of X Social

Elon Musk’s xAI has been rolling out an updated Grok-2 model that is now available free to all users of the X social platform. Prior to last week, the “unfiltered” chatbot — which debuted in November 2023 — was available only by paid subscription. Now Grok is coming to X’s masses, though those on the free tier can ask the chatbot only 10 questions every two hours, while Premium and Premium+ users will “get higher usage limits and will be the first to access any new capabilities.” There is also now a Grok button on X that aims to encourage exploration. Continue reading Grok-2 Chatbot Is Now Available Free to All Users of X Social

Google Releases Gemini 2.0 in Shift Toward Agentic Era of AI

Google has introduced Gemini 2.0, the latest version of its multimodal AI model, signaling a shift toward what the company is calling “the agentic era.” The upgraded model promises not only to outperform previous iterations on standard benchmarks but also introduces more proactive, or agentic, functions. The company announced that “Project Astra,” its experimental assistant, would receive updates that allow it to use Google Search, Lens, and Maps, and that “Project Mariner,” a Chrome extension, would enable Gemini 2.0 to navigate a user’s web browser to complete tasks autonomously. Continue reading Google Releases Gemini 2.0 in Shift Toward Agentic Era of AI

OpenAI Releases Sora, Adding It to ChatGPT Plus, Pro Plans

Ten months after its preview, OpenAI has officially released a Sora video model called Sora Turbo. Described as “hyperrealistic,” Sora Turbo generates clips of 10 to 20 seconds from text or image inputs. It outputs video in widescreen, vertical or square aspect ratios at resolutions from 480p to 1080p. The new product is being made available to ChatGPT Plus and Pro subscribers ($20 and $200 per month, respectively) but is not yet included with ChatGPT Team, Enterprise, or Edu plans, or available to minors. The company explains that Sora videos contain C2PA metadata indicating that they were generated by AI. Continue reading OpenAI Releases Sora, Adding It to ChatGPT Plus, Pro Plans

Meta’s Llama 3.3 Delivers More Processing for Less Compute

Meta Platforms has packed more artificial intelligence into a smaller package with Llama 3.3, which the company released last week. The open-source large language model (LLM) “improves core performance at a significantly lower cost, making it even more accessible to the entire open-source community,” Meta VP of Generative AI Ahmad Al-Dahle wrote on X. The 70-billion-parameter, text-only Llama 3.3 is said to perform on par with the 405-billion-parameter model from Meta’s Llama 3.1 release in July while requiring less computing power, significantly lowering its operational costs. Continue reading Meta’s Llama 3.3 Delivers More Processing for Less Compute

Perplexity Expands Publisher Program Despite AI Controversy

Perplexity is expanding the publisher program associated with its AI-powered search engine. Added to the list of participants who will share ad revenue and access performance data are Adweek, LA Times, Mexico News Daily, The Independent, Germany’s Stern, the World Encyclopedia and about 10 other media brands. They join existing partners including Time, Fortune and Der Spiegel. Emphasizing its ongoing investment in publishers, Perplexity named Jessica Chan, formerly with LinkedIn and its content partner program, as head of publisher partnerships. News of Perplexity’s content deals appears to be generating mixed feelings in newsrooms. Continue reading Perplexity Expands Publisher Program Despite AI Controversy

OpenAI Announces $200 Monthly Subscription for ChatGPT Pro

OpenAI has launched ChatGPT Pro, a $200 per month subscription plan that provides unlimited access to the full version of o1, its new large reasoning model, and all other OpenAI models. The toolkit includes o1-mini, GPT-4o and Advanced Voice. It also includes the new o1 pro mode, “a version of o1 that uses more compute to think harder and provide even better answers to the hardest problems,” OpenAI explains, describing the high-end subscription plan as a path to “research-grade intelligence” for scientists, engineers, enterprises, academics and others who use AI to accelerate productivity. Continue reading OpenAI Announces $200 Monthly Subscription for ChatGPT Pro

Amazon Dives into Generative AI with Nova Foundation Models

After years of focusing on AI infrastructure, Amazon is plunging into the frontier model business with the Nova series. The new family of generative AI models includes the text-to-text model Amazon Nova Micro, Amazon Nova Lite for fast, mobile-friendly apps and, at the upper echelon, the multimodal Amazon Nova Pro and Amazon Nova Premier for processing text, images and video. Amazon, which is heavily invested in production via Amazon Studios and MGM, has also launched two specialty models focused on “studio quality” output — Amazon Nova Canvas for images and Amazon Nova Reel for video. Continue reading Amazon Dives into Generative AI with Nova Foundation Models

Hume AI Introduces Voice Control and Claude Interoperability

Artificial voice startup Hume AI has had a busy Q4, introducing Voice Control, a no-code artificial speech interface that gives users control over 10 voice dimensions ranging from “assertiveness” to “buoyancy” and “nasality.” The company also debuted an interface that “creates emotionally intelligent voice interactions” with Anthropic’s foundation model Claude, which has prompted one observer to ponder whether keyboards will become a thing of the past when it comes to controlling computers. Both advances expand on Hume’s work with its own foundation model, Empathic Voice Interface 2 (EVI 2), which adds emotional timbre to AI voices. Continue reading Hume AI Introduces Voice Control and Claude Interoperability

Qwen with Questions: Alibaba Previews New Reasoning Model

Alibaba Cloud has released the latest entry in its growing Qwen family of large language models. The new Qwen with Questions (QwQ) is an open-source competitor to OpenAI’s o1 reasoning model. As with competing large reasoning models (LRMs), QwQ can correct its own mistakes, relying on extra compute cycles during inference to assess its responses, making it well suited for reasoning tasks like math and coding. Described as an “experimental research model,” this preview version of QwQ has 32 billion parameters and a 32,000-token context window, leading to speculation that a more powerful iteration is in the offing. Continue reading Qwen with Questions: Alibaba Previews New Reasoning Model

Luma AI Upgrades Its Video Generator and Adds Image Model

Anticipating what one outlet calls “the likely imminent release of OpenAI’s Sora,” generative AI video competitors are compelled to step up their game. Luma AI has released a major upgrade to its Dream Machine, speeding its already quick video generation and enabling a chat function for natural language prompts, so users can talk to it much as they would with OpenAI’s ChatGPT. In addition to the new interface, Dream Machine is going mobile and adding a new foundation image model, Luma AI Photon, which “has been purpose built to advance the power and capabilities of Dream Machine,” according to the company. Continue reading Luma AI Upgrades Its Video Generator and Adds Image Model

Lightricks LTX Video Model Impresses with Speed and Motion

Lightricks has released an AI model called LTX Video (LTXV) that it says generates five seconds of 768 x 512 resolution video (121 frames) in just four seconds, outputting a clip in less time than it takes to watch it. The model can run on consumer-grade hardware and is open source, positioning Lightricks as a mass market challenger to firms like Adobe, OpenAI, Google and their proprietary systems. “It’s time for an open-sourced video model that the global academic and developer community can build on and help shape the future of AI video,” Lightricks co-founder and CEO Zeev Farbman said. Continue reading Lightricks LTX Video Model Impresses with Speed and Motion
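A quick back-of-envelope check of those claims, using only the figures reported above (this is an illustrative calculation, not Lightricks’ code):

```python
# Figures from the announcement: a 121-frame clip spanning five seconds
# of video, generated in roughly four seconds of wall-clock time.
frames = 121
clip_seconds = 5
generation_seconds = 4

fps = frames / clip_seconds                      # effective playback frame rate
frames_per_gen_second = frames / generation_seconds

print(f"playback rate: {fps:.1f} fps")           # ~24 fps, standard for film-style video
print(f"generation throughput: {frames_per_gen_second:.2f} frames/s")
print("faster than real time:", generation_seconds < clip_seconds)
```

The numbers bear out the headline claim: at roughly 24 fps playback, producing 121 frames in four seconds means LTXV emits frames faster than a viewer consumes them.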