By Paula Parisi, March 28, 2025
China’s Ant Group is using local semiconductors to train AI at a cost 20 percent lower than companies typically spend, according to reports. Ant used domestic chips — from companies including Alibaba, an investor in Ant, and Huawei — in a Mixture of Experts (MoE) training approach that produced results comparable to training with Nvidia H800 chips. Ant is the latest Chinese company to focus on low-cost training, joining a competition triggered by DeepSeek, which in January announced it could build AI comparable to models released by U.S. companies like OpenAI, Anthropic and Google for billions of dollars less. Continue reading Ant Group Stacks Chips to Reduce Development Costs for AI
By Paula Parisi, February 3, 2025
An internecine AI battle has erupted between Alibaba and DeepSeek. Days after DeepSeek dominated several news cycles with its affordable DeepSeek-R1 reasoning model and the multimodal Janus-Pro-7B, Alibaba released its latest LLM, Qwen2.5-Max, available via API from Alibaba Cloud. As with DeepSeek, Alibaba is looking beyond its domestic borders, but the fact that a public-facing AI battle is heating up between Chinese companies indicates the People’s Republic isn’t going to quietly cede the AI race to the U.S. Alibaba claims Qwen2.5-Max outperforms models from DeepSeek, Meta and OpenAI. Continue reading Alibaba Plans to Take On AI Competitors with Qwen2.5-Max
By Paula Parisi, October 2, 2024
AI startup Liquid, founded by alums of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), has released its first models. Called Liquid Foundation Models, or LFMs, the multimodal family approaches “intelligence” differently from the pre-trained transformer models that dominate the field. Instead, the LFMs take a path of “first principles,” which MIT describes as “the same way engineers build engines, cars, and airplanes,” explaining that the models are large neural networks with computational units “steeped in theories of dynamic systems, signal processing and numerical linear algebra.” Continue reading MIT Spinoff Liquid Eschews GPTs for Its Fluid Approach to AI
By ETCentric Staff, March 29, 2024
Databricks, a San Francisco-based company focused on cloud data and artificial intelligence, has released a generative AI model called DBRX that it says sets new standards for performance and efficiency in the open-source category. The mixture-of-experts (MoE) architecture contains 132 billion parameters and was pre-trained on 12T tokens of text and code data. Databricks says DBRX provides the open community and enterprises that want to build their own LLMs with capabilities previously limited to closed-model APIs. Databricks claims DBRX outperforms other open models, including Llama 2-70B and Mixtral, on certain benchmarks. Continue reading Databricks DBRX Model Offers High Performance at Low Cost
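The mixture-of-experts idea behind both DBRX and Ant Group's training approach can be illustrated with a toy sketch: a learned router scores a set of expert feed-forward networks per token, and only the top-k experts actually run, so compute cost grows with k rather than with the total parameter count. This is a generic, minimal illustration in NumPy (all layer sizes, initializations, and class names here are invented for the example, not details of DBRX or Ant's models):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class ToyMoELayer:
    """Toy mixture-of-experts layer: a router scores all experts per
    token, but only the top-k experts' feed-forward weights are applied."""

    def __init__(self, d_model, d_ff, n_experts, top_k):
        self.top_k = top_k
        # Router maps each token to one logit per expert.
        self.router = rng.standard_normal((d_model, n_experts)) * 0.02
        # Each expert is a small two-layer feed-forward network.
        self.w_in = rng.standard_normal((n_experts, d_model, d_ff)) * 0.02
        self.w_out = rng.standard_normal((n_experts, d_ff, d_model)) * 0.02

    def __call__(self, x):  # x: (tokens, d_model)
        probs = softmax(x @ self.router)              # (tokens, n_experts)
        top = np.argsort(-probs, axis=-1)[:, :self.top_k]
        out = np.zeros_like(x)
        for t in range(x.shape[0]):
            # Renormalize the selected experts' gate weights to sum to 1,
            # then run only those experts on this token.
            w = probs[t, top[t]]
            w = w / w.sum()
            for k, e in enumerate(top[t]):
                h = np.maximum(x[t] @ self.w_in[e], 0.0)  # expert FFN (ReLU)
                out[t] += w[k] * (h @ self.w_out[e])
        return out

layer = ToyMoELayer(d_model=8, d_ff=16, n_experts=4, top_k=2)
y = layer(rng.standard_normal((5, 8)))
print(y.shape)  # (5, 8)
```

With 4 experts and top_k=2, each token touches only half the expert parameters per forward pass, which is the sense in which MoE models like the 132B-parameter DBRX can be cheaper to run than a dense model of the same total size.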