Model Training Archives

OpenAI Reportedly Has Prototype for Its Own Social Network

By Paula Parisi
April 17, 2025

OpenAI is working to build a social network that will compete against Elon Musk’s X and Meta’s Instagram, reports say. Though still in the early stages, the project revolves around an internal prototype said to involve a social feed that leverages ChatGPT’s image generator. It’s unclear whether an OpenAI social app would be standalone or integrated with ChatGPT, but either way it would most likely heighten the competition between rivals Musk and OpenAI CEO Sam Altman, who recently fended off an unsolicited offer by Musk to purchase his company for $97.4 billion.

Researchers Debut Preview of DeepCoder Reasoning Model

By Paula Parisi
April 15, 2025

A new open-source code reasoning model called DeepCoder-14B-Preview has hit the market. Built atop DeepSeek-R1 and Qwen2.5 using reinforcement learning (RL), it aims to provide more flexibility by combining high-performance code generation with reasoning capabilities for real-world applications. Its performance is said to be comparable to OpenAI’s o3-mini, “but with a smaller footprint,” say its developers, the research-driven AI companies Together AI and Agentica. “We democratize the recipe for training a small model into a strong competitive coder,” explains Together AI.

Deep Cogito Is Out of Stealth with Hybrid Reasoning Models

By Paula Parisi
April 10, 2025

San Francisco-based AI startup Deep Cogito has released five AI models in preview, making them available under an open-source license agreement. The models come in sizes 3B, 8B, 14B, 32B and 70B, with plans to release 109B, 400B and 671B versions in the weeks and months ahead. As for the current models, “each outperforms the best available open models of the same size, including counterparts from Meta, DeepSeek and Alibaba, across most standard benchmarks,” Deep Cogito claims, noting that the 70B model in particular “outperforms the newly released Llama 4 109B MoE model.”

OpenAI Delivers Native GPT-4o Image Generator to ChatGPT

By Paula Parisi
March 27, 2025

OpenAI has activated the multimodal image generation capabilities of GPT-4o, making them available to ChatGPT users on the Plus, Pro, Team and Free tiers. GPT-4o replaces DALL-E 3 as the default image generator for the popular chatbot. The model’s accuracy with text, understanding of symbols and precision with prompts, combined with multimodal capabilities that allow it to take cues from visual material, have transformed its image capabilities from largely unpredictable to “consistent and context-aware,” resulting in “a practical tool with precision and power,” claims OpenAI.

With Hotshot Purchase, xAI to Bring Generative Video to Grok

By Paula Parisi
March 19, 2025

Elon Musk’s xAI has acquired generative video startup Hotshot to bring motion imaging to Grok 3. Released in February, Grok 3 adds Deep Search and Thinking and improves on its predecessor’s still imaging capabilities, but lacks generative video, a much-requested feature that could make Grok a freestanding competitor to OpenAI’s individual offerings: ChatGPT for text, Sora for video, and DALL-E for images. “Cool AI video coming soon!” was Musk’s comment on Hotshot’s acquisition announcement on the networking platform. Hotshot can generate clips of up to 10 seconds at 1280×720 pixels.

OpenAI and Google Press for Relief on Copyright, State Laws

By Paula Parisi
March 17, 2025

OpenAI is urging the Trump Administration to declare AI training fair use, seeking unfettered access to copyrighted material for the purpose of educating models. The company is also asking for relief from state AI rules and more permissive AI export rules in a response to President Trump’s call for a U.S. “AI Action Plan.” The deadline to submit responses to the National Science Foundation and Office of Science & Technology Policy (OSTP) request for information (RFI) regarding the plan was Saturday. Google also publicized its response, which largely echoed OpenAI’s points.

Meta Tests New AI Accelerator Chip Designed with Broadcom

By Paula Parisi
March 13, 2025

Meta Platforms has reportedly begun “a small deployment” of its first in-house chip designed for AI training. The accelerator chip is engineered around the open-standard RISC-V architecture. TSMC produced the working samples now being tested. The goal is to create purpose-specific chips that are more efficient than Nvidia’s general purpose GPUs, enjoying the cost-savings that would come with wide use and reducing reliance on outside chip suppliers in a tight market. If the tests go well, Meta plans to scale up production for expanded use by 2026. Details of the new chip’s specifications remain unknown at this time.

Foxconn AI Trained in Four Weeks, Suggesting Industry Shift

By Paula Parisi
March 12, 2025

Taiwan’s Foxconn, the contract manufacturer that assembles Apple’s iPhones, has built its own AI. The company says the large language model, called FoxBrain, was trained in just four weeks with help from Nvidia, using 120 of that company’s H100 chips. FoxBrain has reasoning and mathematical skills and can analyze data and generate code. Foxconn says it intends to open source the model, which was initially built for in-house use, and hopes it will become a collaborative tool for its partners and enable advancements in manufacturing techniques and supply-chain management.

Pinterest AI Labeling Policy Unveiled as Q4 Earnings Top $1B

By Paula Parisi
March 12, 2025

Popular social media platform Pinterest is now labeling generative AI content. The app, which earned a reputation as fertile ground for design inspiration related to hand-crafted goods and human artistry, has recently been plagued by an onslaught of “AI slop,” something its regular users have been complaining of on Reddit and to Pinterest directly. The GenAI content was often used to redirect people to spammy sites, according to a recent report. Pinterest’s labeling news coincides with an earnings report of $1.15 billion in Q4 revenue, marking an 18 percent increase year-over-year.

Google Updates AI Search and Intros Gemini Text Embedding

By Paula Parisi
March 10, 2025

Google has added Gemini Embedding to its Gemini developer API. This new experimental model for text translates words, phrases and other text inputs into numerical representations, otherwise known as embeddings, which capture their semantic meaning. Embeddings are used in a wide range of applications including document retrieval and classification, potentially reducing costs and improving latency. Google is also testing an expansion of its AI Overviews search feature as part of a Gemini 2.0 update. Called AI Mode, it helps explain complex topics by generating search results that use advanced reasoning and thinking capabilities.

Alibaba Says Qwen Reasoning Model on Par with DeepSeek

By Paula Parisi
March 10, 2025

Alibaba is making AI news again, releasing another Qwen reasoning model, QwQ-32B, which was trained and scaled using reinforcement learning (RL). The Qwen team says it “has the potential to enhance model performance beyond conventional pretraining and post-training methods.” QwQ-32B, a 32 billion parameter model, “achieves performance comparable to DeepSeek-R1, which boasts 671 billion parameters (with 37 billion activated),” Alibaba claims. While parameters refer to the total set of adjustable weights and biases in the model’s neural network, “activated” parameters are a subset used for a specific inference task, like generating a response.

Adobe Firefly Video Now in Public Beta Starting at $10 Month

By Paula Parisi
February 14, 2025

Adobe’s Firefly video is now in public beta as part of Firefly AI, now multi-modal with video, image and vector generation. Available for $10 per month for Firefly Standard or $30 per month for Firefly Pro, the Firefly app offers additional tiers for premium video and audio features, providing a degree of customization based on project needs. Adobe continues to position Firefly as “the only generative AI model that is IP-friendly and commercially safe,” offering the option of contractual IP indemnification to protect against infringement lawsuits “in the unlikely event of a claim involving a Firefly output.”

ByteDance’s AI Model Can Generate Video from Single Image

By Paula Parisi
February 6, 2025

ByteDance has developed a generative model that can use a single photo to generate photorealistic video of humans in motion. Called OmniHuman-1, the multimodal system supports various visual and audio styles and can generate people doing things like singing, dancing, speaking and moving in a natural fashion. ByteDance says its new technology clears hurdles that hinder existing human-video generators, such as short play times and over-reliance on high-quality training data. The diffusion transformer-based OmniHuman addresses those challenges by mixing motion-related conditions into the training phase, a solution ByteDance researchers claim is new.

Alibaba Plans to Take On AI Competitors with Qwen2.5-Max

By Paula Parisi
February 3, 2025

An internecine AI battle has erupted between Alibaba and DeepSeek. Days after DeepSeek dominated several news cycles with its affordable DeepSeek-R1 reasoning model and the multimodal Janus-Pro-7B, Alibaba released its latest LLM, Qwen2.5-Max, available via API from Alibaba Cloud. As with DeepSeek, Alibaba is looking beyond its domestic borders, but the fact that a public-facing AI battle is heating up between Chinese companies indicates the People’s Republic isn’t going to quietly cede the AI race to the U.S. Alibaba claims Qwen2.5-Max outperforms models from DeepSeek, Meta and OpenAI.

Chinese AI Startup DeepSeek Disrupting the U.S. Tech Sector

By Paula Parisi
January 28, 2025

Hangzhou-based AI firm DeepSeek is roiling the U.S. tech sector and upending financial markets. The startup has managed to become competitive with Silicon Valley’s deep learning firms despite U.S. sanctions that prevent Chinese technology companies from buying premium chips. DeepSeek has made it into the global top 10 in terms of model performance, and as of this week had the top-ranked free AI assistant at the Apple App Store. DeepSeek’s new R1 model has drawn attention for using less computing power than competing systems, while performing comparably, despite having been developed using older Nvidia chips.