Black Forest Labs Announces Suite of Text-to-Image Models

A new generative AI startup called Black Forest Labs has hit the scene, debuting with a suite of text-to-image models branded FLUX.1. Based in Germany, Black Forest was founded by some of the researchers involved in developing Stable Diffusion and has raised $31 million in funding from principal investor Andreessen Horowitz and angels including CAA founder and former talent agent Michael Ovitz. The FLUX.1 suite focuses on “image detail, prompt adherence, style diversity and scene complexity,” the company says of its three initial variants: FLUX.1 [pro], FLUX.1 [dev] and FLUX.1 [schnell].

Canva Aims to Boost Its GenAI Efforts with Leonardo Purchase

Graphic design company Canva announced it is acquiring fellow Australian startup Leonardo AI with plans to have Leonardo’s 120 employees, including executives, join the Canva AI team. Financial terms of the deal were not disclosed. Sydney-based Leonardo has been gaining attention for its advanced generative AI platform that helps users create images and art based on the open-source Stable Diffusion model developed by Stability AI. The Leonardo team claims its offering differs from other AI art platforms because it gives users more control. Users can experiment with text prompts and quick sketches as Leonardo.ai creates photorealistic images in real time.

Mark Zuckerberg Unveils SAM 2 AI Tech for Segmenting Video

Meta Platforms CEO Mark Zuckerberg unveiled SAM 2, the latest version of the company’s Segment Anything Model computer vision platform, which automates for video what the original SAM did for still images: identifying the edges of an object and isolating it in the frame. Zuckerberg demonstrated SAM 2 as part of a SIGGRAPH 2024 keynote session in which he was interviewed by Nvidia CEO Jensen Huang. “Being able to do this in video and have it be zero shot and tell it what you want, it’s pretty cool,” Zuckerberg said. Meta is sharing the code and model weights for SAM 2 under a permissive Apache 2.0 license.

Meta Calls New Llama the First Open-Source Frontier Model

In April, Meta Platforms revealed that it was working on an open-source AI model that performed as well as proprietary models from top AI companies such as OpenAI and Anthropic. Now, Meta CEO Mark Zuckerberg says that model has arrived in the form of Llama 3.1 405B, “the first frontier-level open-source AI model.” The company is also releasing “new and improved” Llama 3.1 70B and 8B models. In addition to general cost and performance benefits, the fact that the Llama 3.1 405B model is open source “will make it the best choice for fine-tuning and distilling smaller models,” according to Meta.

Apple Launches Public Demo of Its Multimodal 4M AI Model

Apple has released a public demo of the 4M AI model it developed in collaboration with the Swiss Federal Institute of Technology Lausanne (EPFL). The demo arrives seven months after the model was first open-sourced, giving informed observers the opportunity to interact with it and assess its capabilities. Apple says 4M was built by applying masked modeling to a single unified Transformer encoder-decoder “across a wide range of input/output modalities — including text, images, geometric and semantic modalities, as well as neural network feature maps.”

Stability AI Releases Free Sound FX Tool, Stable Audio Open

Stability AI has added another audio product to its lineup, releasing the open-source text-to-audio generator Stable Audio Open 1.0 for sound design. The new model can generate up to 47 seconds of samples and sound effects, including drum beats, instrument riffs, ambient sounds, foley and production elements. It can also create variations of audio samples and change their style. Stability AI, best known for the image generator Stable Diffusion, released Stable Audio in September, a commercial product that can generate sophisticated music tracks of up to three minutes.

Musk Said to Envision Supercomputer as xAI Raises $6 Billion

Elon Musk’s xAI has secured $6 billion in Series B funding. While the company says the funds will be “used to take xAI’s first products to market, build advanced infrastructure, and accelerate the research and development,” some outlets are reporting a significant portion is earmarked to build an AI supercomputer to power the next generation of its foundation model Grok. The company publicly released the open-source Grok-1 as a chatbot on the X social platform in November, and recently debuted Grok-1.5 and 1.5V iterations with long-context capability and image understanding.

Meta Advances Multimodal Model Architecture with Chameleon

Meta Platforms has unveiled its first natively multimodal model, Chameleon, which observers say can make it competitive with frontier model firms. Although Chameleon is not yet released, Meta says internal research indicates it outperforms the company’s own Llama 2 in text-only tasks and “matches or exceeds the performance of much larger models” including Google’s Gemini Pro and OpenAI’s GPT-4V in a mixed-modal generation evaluation “where either the prompt or outputs contain mixed sequences of both images and text.” In addition, Meta calls Chameleon’s image generation “non-trivial,” noting that’s “all in a single model.”

Firebase Genkit: Developer Framework for AI-Powered Apps

Google is offering developers a toolkit for incorporating generative AI features into mobile and web applications. Firebase Genkit, an open-source framework, is available now in beta. By blending models, cloud services, agents, data sources and more in a “code-centric approach” that developers are used to, Genkit makes building and debugging for AI easier, according to Google. The first release targets JavaScript and TypeScript developers, opening AI-powered app development to professionals who specialize in server-side applications built on the Node.js JavaScript runtime.

Google Adds Open-Source Gameface for Android Developers

In a move aimed at launching more accessible Android apps, Google has open-sourced code for Project Gameface, a hands-free game control feature released last year that allows users to control a computer cursor with facial and head gestures. Developers will now have more Gameface resources with which to build Android applications for physically challenged users, “to make every Android device more accessible.” Project Gameface evolved as a collaboration with quadriplegic video game streamer Lance Carr, who has muscular dystrophy. The technology uses a smartphone’s front camera to track movement.

IBM Introduces Granite LLMs for Enterprise Code Developers

IBM has released a family of its Granite AI models to the open-source community. The series of decoder-only Granite code models is purpose-built to write computer code for enterprise developers, with training in 116 programming languages. These Granite models range in size from 3 billion to 34 billion parameters in base model and instruction-tuned variants. They offer a range of uses, from modernizing older code with new languages to optimizing programs for on-device memory constraints, such as those found on mobile devices. In addition to generation, the models can repair and explain code.

UK Launches New Open-Source Platform for AI Safety Testing

The UK AI Safety Institute announced the availability of its new Inspect platform, designed for evaluating and testing artificial intelligence tech to help develop safe AI models. The Inspect toolset enables testers, including researchers, government agencies and startups worldwide, to analyze the specific capabilities of such models and establish scores based on various criteria. According to the Institute, the “release comes at a crucial time in AI development, as more powerful models are expected to hit the market over the course of 2024, making the push for safe and responsible AI development more pressing than ever.”

Apple Unveils OpenELM Tech Optimized for Local Applications

The trend toward small language models that can efficiently run on a single device instead of requiring cloud connectivity has emerged as a focus for Big Tech companies involved in artificial intelligence. Apple has released the OpenELM family of open-source models as its entry in that field. OpenELM uses “a layer-wise scaling strategy” to efficiently allocate parameters within each layer of the transformer model, resulting in what Apple claims is “enhanced accuracy.” The “ELM” stands for “Efficient Language Models,” and one media outlet couches it as “the future of AI on the iPhone.”

Meta AI Assistant Is Launching Across Platforms with Llama 3

Pursuant to his goal of “building the world’s leading AI,” Meta Platforms CEO Mark Zuckerberg announced Friday that Meta AI is upgrading to Llama 3 concurrent with a rollout of its open-source chatbot across the company’s social platforms, integrating it into the search boxes atop WhatsApp, Instagram, Facebook and Messenger. There is also a website, meta.ai, for those who prefer browser access. Reports of Meta upgrading its social AI capabilities began leaking out early last week, albeit on a more limited test scale than what Zuckerberg announced, which, excepting Threads, is cross-platform.

Google Offers Public Preview of Gemini Pro for Cloud Clients

Google is moving its most powerful artificial intelligence model, Gemini 1.5 Pro, into public preview for developers and Google Cloud customers. Gemini 1.5 Pro includes what Google claims is a breakthrough in long context understanding, with the ability to process up to 1 million tokens of information, “opening up new possibilities for enterprises to create, discover and build using AI.” Gemini’s multimodal capabilities allow it to process audio, video, text, code and more, which when combined with long context, “enables enterprises to do things that just weren’t possible with AI before,” according to Google.