Hume AI Introduces Voice Control and Claude Interoperability

Artificial voice startup Hume AI has had a busy Q4, introducing Voice Control, a no-code artificial speech interface that gives users control over 10 voice dimensions ranging from “assertiveness” to “buoyancy” and “nasality.” The company also debuted an interface that “creates emotionally intelligent voice interactions” with Anthropic’s foundation model Claude, prompting one observer to wonder whether keyboards will become a thing of the past when it comes to controlling computers. Both advances expand on Hume’s work with its own foundation model, Empathic Voice Interface 2 (EVI 2), which adds emotional timbre to AI voices.

Couchbase Capella AI Helps Deploy Agents, Models, Services

Couchbase, the publicly traded data platform for developers, has launched Capella AI Services with the aim of simplifying the process of developing and deploying agentic AI apps for enterprise clients. Capella AI joins the company’s flagship Couchbase Capella cloud data platform. AI offerings include model hosting, automated vectorization, unstructured data preprocessing and AI agent catalog services. Couchbase’s goal is to “allow organizations to prototype, build, test and deploy AI agents” while giving developers control over data across the development lifecycle, including secure handling of data sent to large language models running outside the organization.

Lightricks LTX Video Model Impresses with Speed and Motion

Lightricks has released an AI model called LTX Video (LTXV) that it says generates five seconds of 768 x 512 resolution video (121 frames) in just four seconds, producing the clip in less time than it takes to watch. The model can run on consumer-grade hardware and is open source, positioning Lightricks as a mass-market challenger to firms like Adobe, OpenAI and Google and their proprietary systems. “It’s time for an open-sourced video model that the global academic and developer community can build on and help shape the future of AI video,” Lightricks co-founder and CEO Zeev Farbman said.
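The "faster than playback" claim follows directly from the figures quoted above, which a quick arithmetic check makes concrete:

```python
# Arithmetic check on the figures quoted for LTX Video (LTXV):
# 121 frames spanning five seconds of footage, generated in four seconds.
frames = 121
clip_seconds = 5
gen_seconds = 4

playback_fps = frames / clip_seconds   # ~24.2 fps, a standard video frame rate
gen_fps = frames / gen_seconds         # ~30.25 frames generated per second

# Generation outpaces playback, i.e. the model runs faster than real time.
assert gen_fps > playback_fps
print(round(playback_fps, 1), round(gen_fps, 1))  # 24.2 30.2
```

So the quoted frame count implies a roughly 24 fps clip generated at roughly 30 frames per second, consistent with the company's speed claim.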

Anthropic Protocol Intends to Standardize AI Data Integration

Anthropic is releasing what it hopes will become a new standard in data integration for AI. Called the Model Context Protocol (MCP), it aims to eliminate the custom integration code currently written each time a company’s data is connected to a model. The open-source MCP tool could become a universal way to link data sources to AI, with models querying databases directly. MCP is “a new standard for connecting AI assistants to the systems where data lives, including content repositories, business tools, and development environments,” according to Anthropic.
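MCP is built on JSON-RPC 2.0, so "querying a database directly" boils down to a client listing a server's tools and then invoking one. The sketch below shows that message shape; the `tools/list` and `tools/call` method names follow the published MCP spec, while the `query_db` tool and its SQL argument are hypothetical examples, not part of the protocol:

```python
import json

# Minimal sketch of the JSON-RPC 2.0 messages MCP is built on.
# "tools/list" and "tools/call" are MCP methods; "query_db" is a
# hypothetical tool a database-backed MCP server might expose.
def make_request(req_id, method, params=None):
    msg = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)

# 1) Ask the server what tools it exposes.
list_tools = make_request(1, "tools/list")

# 2) Invoke one of them with arguments.
call_tool = make_request(2, "tools/call", {
    "name": "query_db",  # hypothetical tool name
    "arguments": {"sql": "SELECT count(*) FROM orders"},
})

print(list_tools)
print(call_tool)
```

The point of standardizing on this envelope is that any MCP-aware assistant can drive any MCP server the same way, which is what removes the per-integration custom code.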

Roblox Plans an Ad Revamp as ‘Clip It’ Passes 1 Billion Views

Clip It, the Roblox social media contender in the TikTok space, has crossed the one billion views threshold, according to the company. Roblox reportedly plans to leverage that achievement by launching an ad-supported product for Clip It that merges aspects of its custom-branded Roblox game spaces with programmatic advertising. Clip It offers an endless feed of short-form video, with video ads peppered amid the content, drawing comparisons to TikTok. It currently serves the same programmatic ads as the Roblox mothership, but that is about to change thanks to its growing popularity.

Nvidia’s AI Blueprint Develops Agents to Analyze Visual Data

Nvidia’s growing AI arsenal now includes video search and summarization tool AI Blueprint, which helps developers build visual AI agents that analyze video and image content. The agents can answer user questions, generate summaries and even enable alerts for specific scenarios. The new feature is part of Metropolis, Nvidia’s developer toolkit for building computer vision applications using generative AI. Globally, enterprises and public organizations increasingly rely on visual information. Cameras, IoT sensors and autonomous vehicles are generating visual data at high rates, and visual agents can help monitor and make sense of that stream.
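To make the "alerts for specific scenarios" idea concrete, here is a toy, vendor-neutral sketch: it scans per-frame detection labels (the kind of output a visual agent might emit) and flags frames where a scenario of interest appears. The labels and rule are hypothetical illustrations, not Nvidia's API:

```python
# Illustrative only: a toy rule over per-frame detection labels,
# flagging timestamps where all labels in a scenario co-occur.
# Labels and scenario are hypothetical, not part of AI Blueprint.
def scan_for_alerts(frames, scenario):
    """frames: list of (timestamp, set of labels); scenario: required label set."""
    return [ts for ts, labels in frames if scenario <= labels]

frames = [
    (0.0, {"forklift"}),
    (1.5, {"forklift", "person"}),  # person near forklift: scenario matches
    (3.0, {"person"}),
]

print(scan_for_alerts(frames, {"forklift", "person"}))  # [1.5]
```

A production visual agent would of course derive those labels from model inference over video, but the alerting step reduces to exactly this kind of rule evaluation over detections.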

Microsoft, Amazon Jockey for Lead Among AI Code Assistants

Microsoft is previewing GitHub Copilot for Azure in an ambitious expansion of its AI app development toolkit that some say could fundamentally change how developers build software for the AI era. The premise is that switching from one tool to another, as developers often do, should be seamless rather than disruptive, functioning as a sort of real-time language translation and integration system for code. To fend off the move by Microsoft, AWS announced it is making its Q Developer AI code assistant available as an inline chat add-on accessible from IDEs like JetBrains and Microsoft’s own Visual Studio.

OpenAI Showcases Latest Updates for Voice, Picture and More

OpenAI unveiled major updates at its DevDay conference with the focus largely on making AI more accessible, efficient and affordable. Included were four innovations: Vision Fine-Tuning in the API, Model Distillation, Prompt Caching and the public beta of Realtime API. The approach underscores OpenAI’s effort to empower its developer ecosystem even as it continues to compete for end-users in the enterprise space. The Realtime API gives developers the option of building “nearly real-time” speech-to-speech app experiences, selecting from among six OpenAI voices. Vision Fine-Tuning for GPT-4o enables customization of the model’s visual understanding of images and text.
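Of the four features, prompt caching is the easiest to reason about from first principles: requests that share a long prefix (a system prompt, tool definitions) should not pay full price for that prefix twice. The sketch below is a vendor-agnostic illustration of that idea; the hash-based cache key and the character-count "billing" are invented for the example, not OpenAI's actual mechanism or pricing:

```python
# Vendor-agnostic sketch of the prompt-caching idea: the first request
# pays for the whole prompt, later requests sharing the same prefix pay
# only for the uncached suffix. The cache key and cost model here are
# illustrative, not OpenAI's API.
from hashlib import sha256

cache = {}

def billed_chars(prompt, prefix_len):
    key = sha256(prompt[:prefix_len].encode()).hexdigest()
    if key in cache:
        return len(prompt) - prefix_len  # prefix already cached
    cache[key] = True
    return len(prompt)                   # first call pays full price

system = "You are a helpful assistant. " * 20  # long shared prefix
first = billed_chars(system + "Question one?", len(system))
second = billed_chars(system + "Question two?", len(system))

assert second < first
print(first, second)
```

Real prompt caching operates on tokens inside the model serving stack rather than on raw strings, but the economics it creates for developers are the same: keep the stable part of the prompt at the front.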

Meta Unveils New Open-Source Multimodal Model Llama 3.2

Meta’s Llama 3.2 release includes two new multimodal LLMs, one with 11 billion parameters and one with 90 billion — considered small- and medium-sized — and two lightweight, text-only models (1B and 3B) that fit onto edge and mobile devices. Included are pre-trained and instruction-tuned versions. In addition to text, the multimodal models can interpret images, supporting apps that require visual understanding. Meta says the models are free and open source. Alongside them, the company is releasing “the first official Llama Stack distributions,” enabling “turnkey deployment” with integrated safety.

Snap Targets Developers with $99 per Month AR Spectacles

Snap is rolling out its fifth generation of Spectacles — standalone AR glasses that enable use of Lenses to “experience the world together with friends.” The firm is also launching a Spectacles Developer Program, and at a rental fee of $99 per month, that’s who the devices are aimed at, for now. Spectacles are powered by Snap OS, optimized to leverage people’s natural responses to interacting with their environment. They work seamlessly with mobile devices, turning smartphones into custom game controllers with Lenses. There’s even a Spectator Mode, “so friends without Spectacles can follow along, mirror your phone screen, and more.”

OpenAI Previews New LLMs Capable of Complex Reasoning

OpenAI is previewing a new series of AI models that can reason through and correct complex coding mistakes, providing a more efficient solution for developers. The new series, OpenAI o1, comprises models “designed to spend more time thinking before they respond, much like a person would,” and as a result able to “solve harder problems than previous models in science, coding, and math,” OpenAI claims, noting that “through training, they learn to refine their thinking process, try different strategies, and recognize their mistakes.” The first model in the series is being released in preview in OpenAI’s popular ChatGPT and in the company’s API.

Roblox Adds Real Currency, Teases Its Coming Generative AI

During the 10th annual Roblox Developers Conference (RDC 2024) in San Jose, the gaming platform announced it is opening to global currencies in addition to its own Robux, which generates billions in virtual transactions each year. Starting later this year, a small test group of developers will be able to charge real money for access to their paid games, with the program expected to open “to all eligible creators by mid-2025.” The massively multiplayer online platform that lets users build online game worlds also discussed a project to develop its own AI foundation model to power generative 3D creation on the platform.

Sony and Startale Labs Launch Soneium Blockchain for Web3

Last year Sony entered into a joint venture with Singapore-based startup Startale Labs. Now the first fruits of that collaboration have arrived with the launch of Soneium, an Ethereum layer 2 blockchain, from Startale and Sony Block Solutions Labs. The platform is initially available to developers, with plans for an eventual public launch. The goal is “to create new services by leveraging the various businesses and IP within the Sony Group so that Soneium becomes an infrastructure that everyone can use on a daily basis,” according to Sony.

OpenAI Pushes GPT-4o Customization with Free Token Offer

OpenAI announced its newest model, GPT-4o, can now be customized. The company said that the ability to fine-tune the multimodal GPT-4o has been “one of the most requested features from developers.” Customization can move the model toward a more specific structure and tone of responses or allow it to follow instruction sets geared toward individual use cases. Developers can now train on custom datasets, aiming for better performance at a lower cost. The ChatGPT maker is rolling out the welcome mat by offering 1 million training tokens per day “for free for every organization” through September 23.
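Custom datasets for OpenAI chat-model fine-tuning take a simple JSONL form: one training example per line, each a list of chat messages ending with the assistant reply the model should learn. The format below matches OpenAI's documented chat fine-tuning data shape; the support-bot content is a hypothetical example:

```python
import json

# One fine-tuning example per JSONL line; each example is a message
# list ending with the target assistant reply. Content is hypothetical.
examples = [
    {"messages": [
        {"role": "system", "content": "You answer in a formal tone."},
        {"role": "user", "content": "Where is my order?"},
        {"role": "assistant", "content": "Your order is currently in transit."},
    ]},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Basic sanity check: every line parses and ends with an assistant turn.
for line in open("train.jsonl"):
    msgs = json.loads(line)["messages"]
    assert msgs[-1]["role"] == "assistant"
```

The resulting file is what gets uploaded for a fine-tuning job; the free daily token allowance is then consumed as those examples are tokenized during training.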

Nvidia Debuts New Products to Accelerate Adoption of GenAI

After 50 years of SIGGRAPH, the conference has come full circle, from high-tech for PhDs to AI for everyone. That was Nvidia founder and CEO Jensen Huang’s message in back-to-back keynote sessions, including a Q&A with Meta CEO Mark Zuckerberg. Huang touted Universal Scene Description (OpenUSD), discussing developments aiming to speed adoption of the universal 3D data interchange framework for use in everything from robotics to the creation of “highly accurate virtual worlds for the next evolution of AI.” As Zuckerberg’s interlocutor, Huang prompted the Facebook founder to share a vision of AI’s personalization of social media.