LLM Archives - ETCentric

OpenAI’s Affordable GPT-4.1 Models Place Focus on Coding

By Paula Parisi
April 16, 2025

OpenAI has launched a new series of multimodal models dubbed GPT-4.1 that represent what the company says is a leap in small model performance, including longer context windows and improvements in coding and instruction following. Geared to developers and available exclusively via API (not through ChatGPT), the 4.1 series comes in three variations: the flagship GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, OpenAI’s first nano model. Unlike Web-connected models, which use “retrieval-augmented generation” (RAG) and can access up-to-date information, they are static knowledge models.

Google Ironwood TPU is Made for Inference and ‘Thinking’ AI

By Paula Parisi
April 14, 2025

Google has debuted a new accelerator chip, Ironwood, a tensor processing unit designed specifically for inference, the stage at which a trained AI model generates predictions and responses. Ironwood will power Google Cloud’s AI Hypercomputer, which runs the company’s Gemini models and is gearing up for the next generation of artificial intelligence workloads. Google’s TPUs are similar to the accelerator GPUs sold by Nvidia, but unlike GPUs, they’re designed specifically for AI and geared toward speeding neural network tasks and mathematical operations. Google says when deployed at scale Ironwood is more than 24 times more powerful than the world’s fastest supercomputer.

Non-Profit Sentient Launches New ‘Open Deep Search’ Model

By Paula Parisi
April 8, 2025

Sentient, a year-old non-profit backed by Peter Thiel’s Founders Fund, has released Open Deep Search (ODS), an open-source framework that leverages existing LLMs to enhance search and reasoning capabilities. Essentially a system of custom plugins and tools, ODS works with DeepSeek’s open-source R1 model as well as proprietary systems like OpenAI’s GPT-4o and Anthropic’s Claude to deliver advanced search functionality. That modular aspect is in fact ODS’s main innovation, its creators say, claiming it beats Perplexity and OpenAI’s GPT-4o Search Preview on benchmarks for accuracy and transparency.

Elon Musk Announces xAI Corporation Will Purchase X Social

By Rob Scott
March 31, 2025

Just prior to the start of the weekend, Elon Musk announced that his artificial intelligence company xAI is acquiring his social media platform X (formerly Twitter) “in an all-stock transaction,” valuing xAI at $80 billion and X at $33 billion ($45 billion less $12 billion in debt). The merger has the potential to create a powerful GenAI-powered content platform. The billionaire purchased Twitter in late 2022 for $44 billion, following months of legal skirmishes. According to Musk, X currently touts more than 600 million active users, while “xAI has rapidly become one of the leading AI labs in the world, building models and data centers at unprecedented speed and scale.”

Ant Group Stacks Chips to Reduce Development Costs for AI

By Paula Parisi
March 28, 2025

China’s Ant Group is using local semiconductors to train AI at a cost that is 20 percent less than companies typically spend, according to reports. Ant used domestic chips, from companies including Alibaba (an investor in Ant) and Huawei, to launch a unique Mixture of Experts (MoE) training approach that produced results commensurate with training on Nvidia H800 chips. Ant is the latest Chinese company to focus on low-cost training, joining a competition triggered by DeepSeek, which in January announced it could build AI comparable to the models released by U.S. companies like OpenAI, Anthropic and Google for billions less.

Baidu Releases New LLMs that Undercut Competition’s Price

By Paula Parisi
March 18, 2025

Baidu has launched two new AI systems, the native multimodal foundation model Ernie 4.5 and deep-thinking reasoning model Ernie X1. The latter supports features like generative imaging, advanced search and webpage content comprehension. Baidu is touting Ernie X1 as comparable in performance to another Chinese model, DeepSeek-R1, but says it is half the price. Both Baidu models are available to the public, including individual users, through the Ernie website. Baidu, the dominant search engine in China, says its new models mark a milestone in both reasoning and multimodal AI, “offering advanced capabilities at a more accessible price point.”

Foxconn AI Trained in Four Weeks, Suggesting Industry Shift

By Paula Parisi
March 12, 2025

Taiwan’s Foxconn, the contract manufacturer that assembles Apple’s iPhones, has built its own AI. The company says the large language model, called FoxBrain, was trained in just four weeks with help from Nvidia, using 120 of that company’s H100 chips. FoxBrain has reasoning and mathematical skills and can analyze data and generate code. Foxconn says it intends to open source the model, which was initially built for in-house use, and hopes it will become a collaborative tool for its partners and enable advancements in manufacturing techniques and supply-chain management.

Alibaba Says Qwen Reasoning Model on Par with DeepSeek

By Paula Parisi
March 10, 2025

Alibaba is making AI news again, releasing another Qwen reasoning model, QwQ-32B, which was trained and scaled using reinforcement learning (RL). The Qwen team says scaling RL “has the potential to enhance model performance beyond conventional pretraining and post-training methods.” QwQ-32B, a 32 billion parameter model, “achieves performance comparable to DeepSeek-R1, which boasts 671 billion parameters (with 37 billion activated),” Alibaba claims. While parameters refer to the total set of adjustable weights and biases in the model’s neural network, “activated” parameters are a subset used for a specific inference task, like generating a response.
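
To make the total-versus-activated distinction concrete, here is a minimal arithmetic sketch in Python. The 671 billion and 37 billion figures come from the Alibaba quote above. The toy layer sizes and both function names are hypothetical, chosen only for illustration, and do not describe DeepSeek-R1's or QwQ-32B's actual architecture.

```python
def activated_fraction(total_params_b, activated_params_b):
    """Share of a model's parameters used for a single inference step."""
    if total_params_b <= 0 or not 0 <= activated_params_b <= total_params_b:
        raise ValueError("expected 0 <= activated <= total and total > 0")
    return activated_params_b / total_params_b


def moe_param_counts(shared, per_expert, num_experts, experts_per_token):
    """Toy Mixture of Experts accounting (hypothetical sizes).

    Every expert counts toward the total, but each token is routed
    to only `experts_per_token` of them, so only those are activated.
    """
    total = shared + per_expert * num_experts
    activated = shared + per_expert * experts_per_token
    return total, activated


if __name__ == "__main__":
    # Figures quoted in the article, in billions of parameters.
    share = activated_fraction(671, 37)
    print(f"Activated share of DeepSeek-R1: {share:.1%}")

    # Hypothetical toy layer: 64 experts, 4 used per token.
    total, activated = moe_param_counts(10, 5, 64, 4)
    print(f"Toy layer: {activated} of {total} parameter units activated")
```

By this arithmetic, the quoted figures imply that roughly 5.5 percent of DeepSeek-R1's parameters are activated for a given response, which is why a dense 32 billion parameter model can be a reasonable point of comparison.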

Salesforce Brings Gemini to Agentforce in $2.5B Google Deal

By Paula Parisi
March 3, 2025

In an expansion of their existing strategic partnership, Salesforce and Google have entered into a seven-year, $2.5 billion deal that will allow Salesforce customers to build Agentforce agents using Gemini and to deploy Salesforce on Google Cloud. The companies plan to more tightly integrate connections between platforms like Salesforce Service Cloud and Google Cloud’s Customer Engagement Suite, as well as Slack and Google Workspace, “empowering AI agents and service representatives with unified data access, streamlined workflows, and advanced AI capabilities, regardless of platform,” the companies said.

Anthropic Introduces a New Claude Hybrid Reasoning Model

By Paula Parisi
February 26, 2025

Anthropic has released a new frontier model, Claude 3.7 Sonnet, described as the industry’s first “hybrid AI reasoning model.” The new Claude is different in that it can either respond to questions in real time or “think” about a problem for a prolonged period of time, basically as long as a user would like. Users can choose between “near-instant responses or extended, step-by-step thinking that is made visible to the user” by selecting the appropriate “reasoning” capability for Claude, Anthropic says. Along with the new model, Anthropic is also debuting a command line tool for agentic coding, Claude Code.

YouTube Shorts Updates Dream Screen with Google Veo 2 AI

By Paula Parisi
February 19, 2025

YouTube Shorts has upgraded its Dream Screen AI background generator to incorporate Google DeepMind’s latest video model, Veo 2, which will also generate standalone video clips that users can post to Shorts. “Need a specific scene but don’t have the right footage? Want to turn your imagination into reality and tell a unique story? Simply use a text prompt to generate a video clip that fits perfectly into your narrative, or create a whole new world,” coaxes YouTube, which seems to be trying out “Dream Screen” branding as an umbrella for its genAI efforts.

xAI Launches Grok 3 as Standalone and for X Premium+ Subs

By Paula Parisi
February 19, 2025

Elon Musk’s xAI has released its latest AI model Grok 3, which the company is describing as the “smartest AI on Earth.” It includes reasoning capabilities and a new web analysis tool called DeepSearch that returns results “within seconds” and can refine specific sources, according to xAI. Grok 3 was trained with 200,000 Nvidia GPUs, resulting in improved response times and processing power. Future capabilities will include Voice Mode for conversational interaction and audio-to-text conversion. Access to Grok 3 is limited to X Premium+ subscribers or via a SuperGrok plan (that does not include X social features).

Gemini Recalls Previous Chats to Provide Helpful Responses

By Rob Scott
February 18, 2025

Google announced last week that its Gemini AI chatbot now offers the ability to provide responses based on earlier conversations. It can also summarize a previous chat and recall information the user has shared in other threads. “Whether you’re asking a question about something you’ve already discussed, or asking Gemini to summarize a previous conversation, Gemini now uses information from relevant chats to craft a response,” according to Google. The new feature is rolling out via Google’s $20-per-month One AI Premium Plan to start and will be available to Google Workspace Business and Enterprise customers in the coming weeks.

Reasoning Model Competes with Advanced AI at a Lower Cost

By Paula Parisi
February 10, 2025

Model training continues to hit new lows in terms of cost, a phenomenon known as the commoditization of AI that has rocked Wall Street. An AI reasoning model created for under $50 in cloud compute credits is reportedly performing comparably to established reasoning models such as OpenAI o1 and DeepSeek-R1 on tests of math and coding aptitude. Called s1-32B, it was created by researchers at Stanford and the University of Washington by customizing Alibaba’s Qwen2.5-32B-Instruct, feeding it 1,000 prompts with responses sourced from Google’s new Gemini 2.0 Flash Thinking Experimental reasoning model.

Google Adds Gemini Flash Thinking to Search, Maps and More

By Paula Parisi
February 7, 2025

Google has initiated a flurry of AI activity following the recent collection of Chinese AI releases. The Alphabet company has launched an experimental version of a new flagship AI model, Gemini 2.0 Pro. Its premier model for coding and complex questions is now available in Google AI Studio, Vertex AI and the Gemini Advanced app. The company has also made its general-purpose “workhorse” model, Gemini 2.0 Flash, available in general release via the Gemini API in AI Studio and Vertex. This follows last week’s announcement that Gemini 2.0 Flash is powering the Gemini app for desktop and mobile.