By Paula Parisi, April 8, 2025
Meta Platforms has released its first Llama 4 models, a multimodal trio that ranges from the foundational Behemoth to tiny Scout, with Maverick in between. With 16 experts and only 17B active parameters (the number used per task), Llama Scout is “more powerful than all previous generation Llama models, while fitting in a single Nvidia H100 GPU,” according to Meta. Maverick, with 17B active parameters and 128 experts, is touted as beating GPT-4o and Gemini 2.0 Flash across various benchmarks, “while achieving comparable results to the new DeepSeek v3 on reasoning and coding with less than half the active parameters.” Continue reading Meta Unveils Multimodal Llama 4 Models, Previews Behemoth
By Paula Parisi, March 28, 2025
China’s Ant Group is using local semiconductors to train AI at a cost that is 20 percent less than companies typically spend, according to reports. Ant used domestic chips — from companies including Alibaba, an investor in Ant, and Huawei — to launch a unique Mixture of Experts (MoE) training approach that produced results commensurate to training with Nvidia H800 chips. Ant is the latest Chinese company to focus on low cost training, joining a competition triggered by DeepSeek, which in January announced it could build AI comparable to the models released by U.S. companies like OpenAI, Anthropic and Google for billions less. Continue reading Ant Group Stacks Chips to Reduce Development Costs for AI
By Paula Parisi, March 28, 2025
Alibaba Cloud has released Qwen2.5-Omni-7B, a new AI model the company claims is efficient enough to run on edge devices like mobile phones and laptops. Boasting a relatively light 7-billion-parameter footprint, Qwen2.5-Omni-7B understands text, images, audio and video and generates real-time responses in text and natural speech. Alibaba says its combination of compact size and multimodal capabilities is “unique,” offering “the perfect foundation for developing agile, cost-effective AI agents that deliver tangible value, especially intelligent voice applications.” One example would be using a phone’s camera to help a vision-impaired person navigate their environment. Continue reading Alibaba’s Powerful Multimodal Qwen Model Is Built for Mobile
By Paula Parisi, March 27, 2025
Google has released what it calls its most intelligent AI model yet, Gemini 2.5. The first 2.5 model release, an experimental version of Gemini 2.5 Pro, is a next-gen reasoning model that Google says outperformed OpenAI o3-mini and Claude 3.7 Sonnet from Anthropic on common benchmarks “by meaningful margins.” Gemini 2.5 models “are thinking models, capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy,” according to Google. The new model comes just three months after Google released Gemini 2.0 with reasoning and agentic capabilities. Continue reading Google Debuts Next-Gen Reasoning Models with Gemini 2.5
By Paula Parisi, March 17, 2025
Cerebras Systems was founded 10 years ago on the belief that there would be a shortage of processors powerful enough to drive enterprise AI computing at scale. Its solution, the Cerebras Wafer-Scale Engine, is integrated into Cerebras’ CS-3 systems, which will power six new data centers launching this year that the company says will make it “the world’s number one provider of high-speed inference and the largest domestic high speed inference cloud.” Cerebras notes the new facilities will collectively serve over 40 million Llama 70B tokens per second to clients that now include Hugging Face and financial intelligence firm AlphaSense. Continue reading Cerebras Is Moving into Mainstream with New AI Data Centers
By Paula Parisi, March 12, 2025
Taiwan’s Foxconn, the contract manufacturer that assembles Apple’s iPhones, has built its own AI. Called FoxBrain, the company says the large language model was trained in just four weeks with help from Nvidia, using 120 of that company’s H100 chips. FoxBrain has reasoning and mathematical skills and can analyze data and generate code. Initially built for in-house use, Foxconn says it intends to open source the model and hopes it will become a collaborative tool for its partners and enable advancements in manufacturing techniques and supply-chain management. Continue reading Foxconn AI Trained in Four Weeks, Suggesting Industry Shift
By Paula Parisi, February 28, 2025
Nvidia delivered stellar earnings again, with profit up 80 percent to $22.09 billion for fiscal Q4, the period that ended January 26, 2025. Record quarterly revenue hit $39.3 billion, a 12 percent uptick from Q3 and a 78 percent increase year-over-year, driven in part by sales of the company’s Blackwell AI chips. The results rebut predictions that the leading-edge chipmaker would suffer due to a recent wave of Chinese AI models created using fewer and largely older chips. That trend rocked Nvidia stock over the past quarter, but the Silicon Valley-based company managed to maintain momentum. Continue reading New Blackwell AI Chip Helps Boost Nvidia to Record Quarter
By Paula Parisi, February 28, 2025
Alibaba has open-sourced its Wan 2.1 video- and image-generating AI models, heating up an already competitive space. The Wan 2.1 family, which has four models, is said to produce “highly realistic” images and videos from text and images. The company has since December been previewing a new reasoning model, QwQ-Max, indicating it will be open-sourced when fully released. The move comes after another Chinese AI company, DeepSeek, released its R1 reasoning model for free download and use, triggering demand for more open-source artificial intelligence. Continue reading Highly Realistic Alibaba GenVid Models Are Available for Free
By Paula Parisi, February 26, 2025
Anthropic has released a new frontier model, Claude 3.7 Sonnet, described as the industry’s first “hybrid AI reasoning model.” The new Claude is different in that it can either respond to questions in real time or “think” about a problem for a prolonged period of time, basically as long as a user would like. Users can choose between “near-instant responses or extended, step-by-step thinking that is made visible to the user” by selecting the appropriate “reasoning” capability for Claude, Anthropic says. Along with the new model, Anthropic is also debuting a command line tool for agentic coding, Claude Code. Continue reading Anthropic Introduces a New Claude Hybrid Reasoning Model
By Paula Parisi, February 24, 2025
Barely two weeks after the launch of its OmniHuman-1 AI model, ByteDance has released Goku, a new artificial intelligence designed to create photorealistic video featuring humanoid actors. Goku uses text prompts to create, among other things, realistic product videos without the need for human actors. That capability is a boon for ByteDance’s social media unit, TikTok. Goku is open source, trained on a large dataset of roughly 36 million video-text pairs and 160 million image-text pairs. Goku’s debut is being received as more bad news for OpenAI in the form of added competition, but as a positive step for global enterprise. Continue reading ByteDance’s Goku Video Model Is Latest in Chinese AI Streak
By Paula Parisi, February 12, 2025
OpenAI is getting close to finalizing its first custom chip design, according to an exclusive report from Reuters that emphasizes the Microsoft-backed AI giant’s goal of reducing its dependency on Nvidia chips. The blueprint for the first-generation OpenAI chip could be finalized as soon as the next few months and sent to Taiwan’s TSMC for fabrication, which will take about six months — “unless OpenAI pays substantially more for expedited manufacturing” — according to the report. Even by usual standards, the training-focused chip is already on a fast track to deployment. Continue reading OpenAI In-House Chip Could Be Ready for Testing This Year
By Paula Parisi, February 10, 2025
Amazon is predicting more than $100 billion in capital expenditure for AI in 2025. The majority of that will be invested in the AWS cloud division, according to Amazon President and CEO Andy Jassy, indicating Big Tech is not planning to back down on AI. Amazon’s Q4 profit hit $20 billion, an 88 percent increase over the same period in 2023, and full year profit was $59.2 billion, a 94 percent increase, on revenue of $638 billion, an 11 percent rise. On an earnings call, Jassy said the $26.3 billion in Q4 2024 capex spending “is reasonably representative” of what the company can be expected to spend on an annualized basis this year. Continue reading AWS Cloud Computing Generates Half of Amazon’s Q4 Profits
By Paula Parisi, February 7, 2025
Google has initiated a flurry of AI activity following the recent collection of Chinese AI releases. The Alphabet company has launched an experimental version of a new flagship AI model, Gemini 2.0 Pro. Its premier model for coding and complex questions is now available in Google AI Studio, Vertex AI and the Gemini Advanced app. The company has also made its general-purpose “workhorse” model, Gemini 2.0 Flash, available in general release via the Gemini API in AI Studio and Vertex. This follows last week’s announcement that Gemini 2.0 Flash is powering the Gemini app for desktop and mobile. Continue reading Google Adds Gemini Flash Thinking to Search, Maps and More
By Paula Parisi, February 5, 2025
Most people know Hugging Face as a resource-sharing community, but it also builds open-source applications and tools for machine learning. Its recent release of vision-language models small enough to run on smartphones while outperforming competitors that rely on massive data centers is being hailed as “a remarkable breakthrough in AI.” The new models — SmolVLM-256M and SmolVLM-500M — are optimized for “constrained devices” with less than around 1GB of RAM, making them ideal for mobile devices including laptops and also convenient for those interested in processing large amounts of data cheaply and with a low-energy footprint. Continue reading Hugging Face Has Developed Tiny Yet Powerful Vision Models
By Paula Parisi, February 3, 2025
An internecine AI battle has erupted between Alibaba and DeepSeek. Days after DeepSeek dominated several news cycles with its affordable DeepSeek-R1 reasoning model and the multimodal Janus-Pro-7B, Alibaba released its latest LLM, Qwen 2.5-Max, available via API from Alibaba Cloud. As with DeepSeek, Alibaba is looking beyond its domestic borders, but the fact that a public-facing AI battle is heating up between Chinese companies indicates the People’s Republic isn’t going to quietly cede the AI race to the U.S. Alibaba claims Qwen 2.5-Max outperforms models from DeepSeek, Meta and OpenAI. Continue reading Alibaba Plans to Take On AI Competitors with Qwen2.5-Max