By Paula Parisi, August 15, 2024
VFusion3D is the latest AI model unveiled by Meta Platforms, which developed it in conjunction with the University of Oxford. The powerful model, which uses single-perspective images or text prompts to generate high-quality 3D objects, is being hailed as a breakthrough in scalable 3D AI that can potentially transform sectors including VR, gaming and digital design. The platform tackles the challenge of scarce 3D training data in a world teeming with 2D images and text descriptions. The VFusion3D approach leverages what the developers call “a novel method for building scalable 3D generative models utilizing pre-trained video diffusion models.” Continue reading Meta, Oxford Advance 3D Object Generation with VFusion3D
By Paula Parisi, August 6, 2024
A new generative AI startup called Black Forest Labs has hit the scene, debuting with a suite of text-to-image models branded FLUX.1. Based in Germany, Black Forest was founded by some of the researchers involved in developing Stable Diffusion and has raised $31 million in funding from principal investor Andreessen Horowitz and angels including CAA co-founder and former talent agent Michael Ovitz. The FLUX.1 suite focuses on “image detail, prompt adherence, style diversity and scene complexity,” the company says of its three initial variants: FLUX.1 [pro], FLUX.1 [dev] and FLUX.1 [schnell]. Continue reading Black Forest Labs Announces Suite of Text-to-Image Models
By Paula Parisi, July 31, 2024
After 50 years of SIGGRAPH, the conference has come full circle, from high-tech for PhDs to AI for everyone. That was Nvidia founder and CEO Jensen Huang’s message in back-to-back keynote sessions, including a Q&A with Meta CEO Mark Zuckerberg. Huang touted Universal Scene Description (OpenUSD), discussing developments aiming to speed adoption of the universal 3D data interchange framework for use in everything from robotics to the creation of “highly accurate virtual worlds for the next evolution of AI.” As Zuckerberg’s interlocutor, he prompted the Facebook founder to share a vision of AI’s personalization of social media. Continue reading Nvidia Debuts New Products to Accelerate Adoption of GenAI
By Paula Parisi, July 29, 2024
Stability AI has unveiled an experimental new model, Stable Video 4D, which generates photorealistic 3D video. Building on what it created with Stable Video Diffusion, released in November, the latest model can take moving image data of an object and render it from multiple angles, generating up to eight different perspectives. Stable Video 4D can generate five frames across eight views in about 40 seconds using a single inference, according to the company, which says the model has “future applications in game development, video editing, and virtual reality.” Users begin by uploading a single video and specifying desired 3D camera poses. Continue reading Stable Video 4D Adds Time Dimension to Generative Imagery
By Paula Parisi, July 24, 2024
Nvidia and French startup Mistral AI are jointly releasing a new language model called Mistral NeMo 12B that brings enterprise AI capabilities to the desktop without the need for major cloud resources. Developers can easily customize and deploy the new LLM for applications supporting chatbots, multilingual tasks, coding and summarization, according to Nvidia. “NeMo 12B offers a large context window of up to 128k tokens,” explains Mistral, adding that “its reasoning, world knowledge, and coding accuracy are state-of-the-art in its size category.” Available under the Apache 2.0 license, it is easy to implement as a drop-in replacement for Mistral 7B. Continue reading Mistral, Nvidia Bring Enterprise AI to Desktop with NeMo 12B
By Paula Parisi, July 3, 2024
Apple has released a public demo of the 4M AI model it developed in collaboration with the Swiss Federal Institute of Technology Lausanne (EPFL). The technology debuts seven months after the model was first open-sourced, allowing informed observers the opportunity to interact with it and assess its capabilities. Apple says 4M was built by applying masked modeling to a single unified Transformer encoder-decoder “across a wide range of input/output modalities — including text, images, geometric and semantic modalities, as well as neural network feature maps.” Continue reading Apple Launches Public Demo of Its Multimodal 4M AI Model
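The masked modeling approach described above can be illustrated with a toy sketch: tokens from several modalities share one unified sequence, a random subset is hidden, and the model learns to predict the hidden tokens from the visible ones. This is an illustrative sketch only, not Apple's 4M code; the modality tags and token values are invented for the example.

```python
import random

def mask_tokens(tokens, mask_ratio=0.5, seed=0):
    """Split a unified multimodal token sequence into visible inputs
    and masked prediction targets, as in masked modeling."""
    rng = random.Random(seed)
    idx = list(range(len(tokens)))
    rng.shuffle(idx)
    n_masked = int(len(tokens) * mask_ratio)
    masked = set(idx[:n_masked])
    visible = [(i, t) for i, t in enumerate(tokens) if i not in masked]
    targets = [(i, t) for i, t in enumerate(tokens) if i in masked]
    return visible, targets

# Tokens tagged with their modality, sharing one sequence (hypothetical data).
sequence = [("text", "a"), ("text", "dog"), ("image", 17), ("image", 42),
            ("depth", 3), ("semseg", 9)]
visible, targets = mask_tokens(sequence, mask_ratio=0.5)
# The encoder would see `visible`; the decoder learns to predict `targets`.
```

In the full model, the same encoder-decoder handles every modality, which is what lets one network translate between text, images and other representations.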
By Paula Parisi, June 18, 2024
Nvidia is expanding its substantial influence in the AI sphere with Nemotron-4 340B, a family of open models designed to generate synthetic LLM training data for commercial applications across numerous fields. Through what Nvidia is calling a “uniquely permissive” free open model license, Nemotron-4 340B provides a scalable way for developers to build LLMs. Synthetic data is artificially generated data designed to mimic the characteristics and structure of data found in the real world. The offering is being called “groundbreaking” and an important step toward the democratization of artificial intelligence. Continue reading Nvidia’s Open Models to Provide Free Training Data for LLMs
By Paula Parisi, June 7, 2024
Stability AI has added another audio product to its lineup, releasing the open-source text-to-audio generator Stable Audio Open 1.0 for sound design. The new model can generate up to 47 seconds of samples and sound effects, including drum beats, instrument riffs, ambient sounds, foley and production elements. It also supports audio variations and style transfer of uploaded samples. Stability AI — best known for the image generator Stable Diffusion — in September released Stable Audio, a commercial product that can generate sophisticated music tracks of up to three minutes. Continue reading Stability AI Releases Free Sound FX Tool, Stable Audio Open
By Paula Parisi, May 15, 2024
IBM has released a family of its Granite AI models to the open-source community. The series of decoder-only Granite code models are purpose-built to write computer code for enterprise developers, trained on code written in 116 programming languages. The Granite models range in size from 3 to 34 billion parameters, in base and instruction-tuned variants. They offer a range of uses, from modernizing older code with new languages to optimizing programs for on-device memory constraints, such as those encountered when targeting mobile devices. In addition to generation, the models can repair and explain code. Continue reading IBM Introduces Granite LLMs for Enterprise Code Developers
By ETCentric Staff, April 26, 2024
The trend toward small language models that can efficiently run on a single device instead of requiring cloud connectivity has emerged as a focus for Big Tech companies involved in artificial intelligence. Apple has released the OpenELM family of open-source models as its entry in that field. OpenELM uses “a layer-wise scaling strategy” to efficiently allocate parameters within each layer of the transformer model, resulting in what Apple claims is “enhanced accuracy.” The “ELM” stands for “Efficient Language Models,” and one media outlet couches it as “the future of AI on the iPhone.” Continue reading Apple Unveils OpenELM Tech Optimized for Local Applications
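The layer-wise scaling idea can be sketched in a few lines: instead of giving every transformer layer the same width, a multiplier is interpolated across layers so early layers get fewer parameters and later layers get more. This is a minimal illustration of the general technique, not Apple's OpenELM code; the dimensions and multiplier range below are assumed for the example.

```python
def layerwise_scaling(n_layers, min_mult, max_mult):
    """Linearly interpolate a width multiplier across transformer layers,
    allocating fewer parameters to early layers and more to later ones."""
    if n_layers == 1:
        return [min_mult]
    step = (max_mult - min_mult) / (n_layers - 1)
    return [min_mult + i * step for i in range(n_layers)]

# Example: scale a feed-forward expansion ratio from 0.5x to 4.0x
# over 8 layers (illustrative numbers, not OpenELM's actual config).
ffn_mults = layerwise_scaling(8, 0.5, 4.0)
base_dim = 1024
ffn_dims = [int(base_dim * m) for m in ffn_mults]
```

The payoff is a model that spends its parameter budget where it helps most, rather than uniformly, which is how Apple claims to squeeze more accuracy from a small on-device model.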
By ETCentric Staff, April 25, 2024
Microsoft, which has been developing small language models (SLMs) for some time, has announced its most-capable SLM family, Phi-3. SLMs can accomplish some of the same functions as LLMs, but are smaller and trained on less data. That smaller footprint makes them well suited to run in a local environment, which means they’re ideal for smartphones, where in theory they would not even need an Internet connection to run. Microsoft claims the Phi-3 open models can outperform “models of the same size and next size up across a variety of benchmarks that evaluate language, coding and math capabilities.” Continue reading Microsoft Small Language Models Are Ideal for Smartphones
By ETCentric Staff, April 22, 2024
Pursuant to his goal of “building the world’s leading AI,” Meta Platforms CEO Mark Zuckerberg announced Friday that Meta AI is upgrading to Llama 3 concurrent with a rollout of its open-source chatbot across the company’s social platforms, integrating it into the search boxes atop WhatsApp, Instagram, Facebook and Messenger. There is also a website, meta.ai, for those who prefer browser access. Reports of Meta upgrading its social AI capabilities began leaking out early last week, albeit on a more limited test scale than what Zuckerberg announced, which, excepting Threads, is cross-platform. Continue reading Meta AI Assistant Is Launching Across Platforms with Llama 3
By ETCentric Staff, March 29, 2024
Databricks, a San Francisco-based company focused on cloud data and artificial intelligence, has released a generative AI model called DBRX that it says sets new standards for performance and efficiency in the open-source category. The mixture-of-experts (MoE) architecture contains 132 billion parameters and was pre-trained on 12 trillion tokens of text and code data. Databricks says it gives the open community and enterprises that want to build their own LLMs capabilities previously limited to closed model APIs, and claims DBRX outperforms other open models, including Llama 2 70B and Mixtral, on certain benchmarks. Continue reading Databricks DBRX Model Offers High Performance at Low Cost
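The mixture-of-experts design mentioned above can be sketched with a toy routing layer: a router scores each token against every expert, and only the top-k experts actually run for that token, so most of the model's parameters sit idle on any given forward pass. This is an illustrative sketch of MoE routing in general, not DBRX's implementation; all dimensions and weights below are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

router_w = rng.normal(size=(d_model, n_experts))             # router weights
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x):
    """x: (d_model,) single token. Route to the top-k experts and
    return their softmax-gated mixture."""
    logits = x @ router_w                                    # (n_experts,)
    top = np.argsort(logits)[-top_k:]                        # chosen experts
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # renormalize
    return sum(g * (x @ experts[e]) for g, e in zip(gates, top))

token = rng.normal(size=d_model)
out = moe_forward(token)
```

Because only k of the experts fire per token, a model can carry a very large total parameter count while keeping per-token compute closer to that of a much smaller dense model, which is the efficiency claim behind architectures like DBRX.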
By ETCentric Staff, March 19, 2024
Elon Musk’s xAI has released its Grok chatbot and open-sourced part of the underlying Grok-1 model architecture for any developer or entrepreneur to use for purposes including commercial applications. Musk unveiled Grok in November and announced that it would be publicly released this month. The chatbot itself is available to X Premium subscribers, who can ask the cheeky AI questions and get answers with a snarky attitude inspired by “The Hitchhiker’s Guide to the Galaxy” sci-fi novel. The training for Grok’s foundation LLM is said to include X social posts. Continue reading Grok-1 Architecture Open-Sourced for General Release by xAI
By ETCentric Staff, February 16, 2024
Stability AI, purveyor of the popular Stable Diffusion image generator, has introduced a completely new model called Stable Cascade. Now in preview, Stable Cascade uses a different architecture than Stable Diffusion’s SDXL that the UK company’s researchers say is more efficient. Cascade builds on a compression architecture called Würstchen (German for “sausage”) that Stability began sharing in research papers early last year. Würstchen is a three-stage process that includes two-step encoding. It uses fewer parameters, meaning less data to train on, greater speed and reduced costs. Continue reading Stability AI Advances Image Generation with Stable Cascade