By Paula Parisi, April 28, 2025
News from Adobe MAX London 2025 ranged from new Firefly image models to a refreshed web app that includes third-party image generators, an AI agent that automates Photoshop, an updated Firefly mobile app coming soon to iOS and Android, and the general release of the Firefly Video model. The latest release of Firefly “unifies AI-powered tools for image, video, audio, and vector generation into a single, cohesive platform and introduces many new capabilities,” according to Adobe, which says that since its debut nearly two years ago, creatives have used Firefly to generate more than 22 billion assets worldwide. Continue reading Adobe Unveils Two New Image Models and Array of Products
By Paula Parisi, April 28, 2025
Nvidia has released NeMo microservices into general availability with version 25.4, repositioning the offering from a modular toolkit for creating custom generative AI models to a platform for building AI agents at scale. As AI agents become an in-demand commodity, Nvidia is leveraging the fact that NeMo’s capabilities seem purpose-built to help them grow and thrive. Built around the Kubernetes open-source container management system, NeMo microservices are offered as “an end-to-end developer platform for creating state-of-the-art agentic AI systems,” according to Nvidia. Continue reading Nvidia Positions Its NeMo Microservices for AI Agent-Building
By Paula Parisi, April 18, 2025
OpenAI has released two new AI models that use images as part of their reasoning process, “thinking with images.” OpenAI o3 and o4-mini “are the smartest models we’ve released to date, representing a step change in ChatGPT’s capabilities for everyone from curious users to advanced researchers,” the company says. The new entries in the “o” series also have agentic capabilities and can independently “use and combine every tool within ChatGPT, including searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and even generating images.” Continue reading OpenAI Introduces New Models That Can Reason with Images
By Paula Parisi, April 18, 2025
Agentic AI company Moveworks has opened an AI Agent Marketplace that launches with more than 100 pre-built agents, enabling users to discover, install, and deploy AI assistants that automate business processes. Agentic AI is booming, as businesses seek to offload tasks from human workers to software. To support that, new companies and existing ones have started providing pre-built agents that are more convenient than building them from scratch. “What once took weeks to build can now be installed and deployed in mere minutes,” Moveworks says, touting its library offerings. Continue reading Moveworks Joins Competition in Offering Enterprise AI Agents
By Paula Parisi, April 17, 2025
As enterprises rely more heavily on AI integration to compile research and summarize things like meetings and email threads, the need for contextual search has become increasingly important. AI startup Cohere has released Embed 4 to make the task easier. Embed 4 is a multimodal embedding model that transforms text, images and mixed data (like PDFs, slides or tables) into numerical representations (or “embeddings”) for tasks including semantic search, retrieval-augmented generation (RAG) and classification. Supporting over 100 languages, Embed 4 has an extremely large context window of up to 128,000 tokens. Continue reading Cohere’s Multimodal Embed Model Organizes Enterprise Data
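Embedding-based search works by comparing vectors with a similarity metric such as cosine similarity. Below is a minimal sketch of that idea using tiny hand-made vectors in place of real Embed 4 output (which would require a call to Cohere’s API); the document names and numbers are invented for illustration:

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for model output; a real system would call
# the embedding model to vectorize both the documents and the query.
docs = {
    "q2 revenue slides": [0.9, 0.1, 0.0],
    "kubernetes runbook": [0.1, 0.9, 0.1],
    "sales forecast pdf": [0.6, 0.4, 0.2],
}
query = [0.85, 0.15, 0.05]  # pretend embedding of "quarterly financials"

# Rank documents by similarity to the query embedding.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked[0])  # → q2 revenue slides
```

The same ranking step underlies both semantic search and the retrieval stage of RAG: the nearest vectors are the documents handed to the model as context.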
By Paula Parisi, April 16, 2025
OpenAI has launched a new series of multimodal models dubbed GPT-4.1 that represent what the company says is a leap in small model performance, including longer context windows and improvements in coding and instruction following. Geared to developers and available exclusively via API (not through ChatGPT), the 4.1 series comes in three variations: the flagship GPT‑4.1, GPT‑4.1 mini, and GPT‑4.1 nano, OpenAI’s first nano model. Unlike Web-connected models, which use retrieval-augmented generation (RAG) to access up-to-date information, these are static-knowledge models. Continue reading OpenAI’s Affordable GPT-4.1 Models Place Focus on Coding
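The static-knowledge distinction comes down to what the prompt contains: a RAG pipeline retrieves fresh documents and prepends them as context, while a static model sees only the bare question and answers from its training data. A toy sketch of that difference, with a naive word-overlap retriever and an invented two-document corpus (not OpenAI’s actual pipeline):

```python
def retrieve(query, corpus, k=1):
    """Score each document by word overlap with the query; return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda doc: len(q_words & set(doc.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, corpus=None):
    """With a corpus, build a RAG-style prompt; without one, a static prompt."""
    if corpus is None:
        return f"Question: {query}"
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

corpus = [
    "GPT-4.1 ships in flagship, mini, and nano variants.",
    "The models are available via API rather than ChatGPT.",
]
print(build_prompt("Which variants of GPT-4.1 ship?", corpus))
```

Real systems replace the word-overlap scorer with embedding similarity, but the shape of the prompt, retrieved context followed by the question, is the same.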
By Paula Parisi, April 14, 2025
Google has debuted a new accelerator chip, Ironwood, a tensor processing unit designed specifically for inference — the process by which a trained AI model generates predictions from new input. Ironwood will power Google Cloud’s AI Hypercomputer, which runs the company’s Gemini models and is gearing up for the next generation of artificial intelligence workloads. Google’s TPUs fill a role similar to the accelerator GPUs sold by Nvidia, but unlike general-purpose GPUs they are purpose-built for AI, geared toward speeding neural network tasks and mathematical operations. Google says that when deployed at scale, Ironwood is more than 24 times more powerful than the world’s fastest supercomputer. Continue reading Google Ironwood TPU is Made for Inference and ‘Thinking’ AI
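Concretely, inference is the forward pass of an already-trained model: fixed weights are applied to new input to produce a prediction, while training is the separate step that learns those weights. A toy sketch with a hand-written linear model (the weights are invented, not from any real model):

```python
# Pretend these weights were learned earlier during training; inference
# never changes them, it only applies them to new input.
weights = [0.5, -0.2, 0.1]
bias = 0.3

def predict(features):
    """One forward pass: weighted sum of the inputs plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias

print(predict([1.0, 2.0, 3.0]))  # 0.5 - 0.4 + 0.3 + 0.3 = 0.7
```

Inference chips like Ironwood accelerate exactly this kind of arithmetic, billions of multiply-accumulate operations per prediction, without the gradient bookkeeping that training requires.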
By Paula Parisi, April 11, 2025
Google has turned its Firebase backend-as-a-service (BaaS) platform into a full-stack AI workspace called Firebase Studio that builds custom apps in a browser-based environment. Available to anyone with a Google account during its preview phase, Firebase Studio will be useful to beginners and pros alike, Google says, with Gemini-powered AI agents that can automate the process of building, launching and monitoring mobile and web apps and related infrastructure. Firebase Studio “includes everything developers need to create and publish production-quality AI apps quickly, all in one place,” the company announced at Google Cloud Next 2025. Continue reading Google Firebase Now Full-Stack App Developer in a Browser
By Paula Parisi, April 11, 2025
Google’s Gemini coding assistant has gained agentic capabilities, available as part of Gemini in Android Studio, a subscription service for businesses designed to make app development for the Android ecosystem easier and more secure. This agent-centric “AI-powered cloud for developers and operators” is designed to infuse AI into all stages of application development, laying the groundwork for more rapid software creation cycles. The service is available to those who subscribe to Gemini Code Assist Standard or Enterprise editions. The new offering was unveiled at the Google Cloud Next 2025 developer conference in Las Vegas. Continue reading Google Pushes Gemini in Android Studio for App Developers
By Paula Parisi, April 8, 2025
Sentient, a year-old non-profit backed by Peter Thiel’s Founders Fund, has released Open Deep Search (ODS), an open-source framework that leverages existing LLMs to enhance search and reasoning capabilities. Essentially a system of custom plugins and tools, ODS works with DeepSeek’s open-source R1 model as well as proprietary systems like OpenAI’s GPT-4o and Anthropic’s Claude to deliver advanced search functionality. That modular aspect is in fact ODS’s main innovation, its creators say, claiming it beats Perplexity and OpenAI’s GPT-4o Search Preview on benchmarks for accuracy and transparency. Continue reading Non-Profit Sentient Launches New ‘Open Deep Search’ Model
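The modular design described above, interchangeable LLM backends behind a single search interface, can be pictured as a simple plugin registry. The sketch below is hypothetical: the backend names and stub functions are invented for illustration and do not reflect ODS’s actual interfaces.

```python
# Hypothetical plugin registry: any backend implementing answer(query)
# can be swapped in behind one common search entry point.
BACKENDS = {}

def register(name):
    """Decorator that records a backend function under a name."""
    def wrap(fn):
        BACKENDS[name] = fn
        return fn
    return wrap

@register("stub-r1")
def stub_r1(query):
    # A real backend would call an open model such as DeepSeek R1.
    return f"[stub-r1] answer to: {query}"

@register("stub-gpt4o")
def stub_gpt4o(query):
    # A real backend would call a proprietary model's API.
    return f"[stub-gpt4o] answer to: {query}"

def open_search(query, backend="stub-r1"):
    """Route the query to whichever registered backend is selected."""
    return BACKENDS[backend](query)

print(open_search("latest TPU news", backend="stub-gpt4o"))
```

Keeping the search logic independent of any one model is what lets a framework like this pair the same tooling with open or proprietary LLMs.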
By Paula Parisi, April 2, 2025
Amazon is formally rolling out its new Nova family of foundation models. First teased at the re:Invent conference hosted by AWS, the new multimodal series saw details begin leaking out this month. As part of the move, Amazon is diving into the agentic AI business with a new model called Nova Act, now in research preview. Nova Act is designed to control Web browser actions and independently tackle simple tasks. A Nova Act SDK is also being made available so developers can customize their own agents using the general-purpose Nova. The company is pushing for agents to help streamline business productivity. Continue reading Amazon’s Nova Model Series Includes Nova Act for AI Agents
By Paula Parisi, March 27, 2025
Google has released what it calls its most intelligent AI model yet, Gemini 2.5. The first 2.5 model release, an experimental version of Gemini 2.5 Pro, is a next-gen reasoning model that Google says outperformed OpenAI o3-mini and Claude 3.7 Sonnet from Anthropic on common benchmarks “by meaningful margins.” Gemini 2.5 models “are thinking models, capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy,” according to Google. The new model comes just three months after Google released Gemini 2.0 with reasoning and agentic capabilities. Continue reading Google Debuts Next-Gen Reasoning Models with Gemini 2.5
By Paula Parisi, March 27, 2025
Microsoft is debuting a suite of security agents for Copilot that will take over repetitive and rote tasks burdening cybersecurity teams. This next evolution of Security Copilot with AI agents is designed to autonomously assist in critical areas such as phishing, data security, and identity management. “The relentless pace and complexity of cyberattacks have surpassed human capacity and establishing AI agents is a necessity for modern security,” notes the company. Microsoft Threat Intelligence is processing 84 trillion signals per day, indicating exponential growth in cyberattacks, including 7,000 password attacks per second, the company says. Continue reading Microsoft Is Combating Security Threats with Copilot Agents
By Paula Parisi, March 25, 2025
Anthropic’s Claude can now search the Internet in real time, allowing it to provide timely, relevant responses that are also more accurate than what the chatbot previously offered, according to the company. Claude incorporates direct citations for its Web-retrieved material, so users can fact-check its sources. “Instead of finding search results yourself, Claude processes and delivers relevant sources in a conversational format,” the company explains. While this is not exactly groundbreaking — ChatGPT, Grok 3, Copilot, Perplexity and Gemini all have real-time Web retrieval and most include citations — Claude takes a slightly different approach. Continue reading Real-Time Web Access Informs Claude 3.7 Sonnet Responses
By Paula Parisi, March 24, 2025
OpenAI has debuted three new models for transcription and voice generation — gpt-4o-transcribe, gpt-4o-mini-transcribe and gpt-4o-mini-tts. The text-to-speech and speech-to-text AI models are designed to help developers create AI agents with highly customizable voices. OpenAI claims these models will power natural and responsive voice agents, moving AI out of the text-based communications stage and into intuitive spoken conversations. The suite outperforms existing solutions in accuracy and reliability, OpenAI says, especially with “accents, noisy environments, and varying speech speeds,” making them well-suited for customer call centers and meeting notes. Continue reading OpenAI Pushes Conversational Agents with Three New Models