By Paula Parisi, September 6, 2024
OpenAI co-founder and former chief scientist Ilya Sutskever, who exited the company in May after a power struggle with CEO Sam Altman, has raised $1 billion for his new venture, Safe Superintelligence (SSI). The cash infusion from major Silicon Valley venture capital firms including Andreessen Horowitz, Sequoia Capital, DST Global, SV Angel and NFDG has resulted in a $5 billion valuation for the startup. As its name implies, SSI is focused on developing artificial intelligence that does not pose a threat to humanity, a goal that will be pursued “in a straight shot” with “one product,” Sutskever has stated. Continue reading Safe Superintelligence Raises $1 Billion to Develop Ethical AI
By Paula Parisi, September 5, 2024
China’s largest cloud computing company, Alibaba Cloud, has released a new computer vision model, Qwen2-VL, which the company says improves on its predecessor in visual understanding, including video comprehension and text-to-image processing in English, Japanese, French, Spanish, Chinese and other languages. The company says the model can analyze videos of more than 20 minutes in length and respond appropriately to questions about their content. Third-party benchmark tests compare Qwen2-VL favorably to leading competitors and the company is releasing two open-source versions with a larger private model to come. Continue reading Alibaba’s Latest Vision Model Has Advanced Video Capability
By Paula Parisi, August 26, 2024
Creating a universal definition of “open source AI” has generated a fair amount of debate and confusion, with many outfits using elastic parameters in order to achieve a fit. Now the Open Source Initiative (OSI) — “the authority that defines Open Source” — has issued what it hopes will become the baseline definition. That definition, which includes the ability to “use the system for any purpose and without having to ask for permission,” excludes a lot of AI platforms that currently describe themselves as “open,” many freely available only for non-commercial use. OSI’s remaining three parameters involve the ability to inspect the system and modify and share it. Continue reading OSI Aims for Industry Standard by Defining ‘Open Source AI’
By Paula Parisi, August 26, 2024
Meta Platforms CEO Mark Zuckerberg and Spotify CEO Daniel Ek have joined forces to express displeasure with the European Union’s regulations on artificial intelligence, claiming they are suppressing innovation. That is the opposite of the stated goals of EU lawmakers in passing the regulations. In a joint statement first published in The Economist and then on the Meta and Spotify websites Friday, the duo took aim at alleged EU obstruction to the development of open source AI, suggesting that Europe’s “fragmented regulatory structure, riddled with inconsistent implementation, is hampering innovation and holding back developers.” Continue reading Meta, Spotify Issue Statement Criticizing EU’s AI Regulations
By Paula Parisi, August 19, 2024
Grok-2 and Grok-2 mini, the latest generative chatbots from Elon Musk’s xAI, create images with seemingly few guardrails. Early pictures of notable personalities such as Bill Gates, Donald Trump and Kamala Harris in questionable or compromising settings may not appear photorealistic to a trained eye, but they are still described in many cases as quite realistic. Powered by the FLUX.1 AI model from Black Forest Labs, Grok-2 and Grok-2 mini are available in beta on X social for Premium and Premium+ subscribers and will be coming to xAI’s enterprise API later this month, according to the company. Continue reading xAI’s Grok-2 Generates Realistic Images with Few Guardrails
By Paula Parisi, August 15, 2024
VFusion3D is the latest AI model unveiled by Meta Platforms, which developed it in conjunction with the University of Oxford. The powerful model, which uses single-perspective images or text prompts to generate high-quality 3D objects, is being hailed as a breakthrough in scalable 3D AI that can potentially transform sectors including VR, gaming and digital design. The platform tackles the challenge of scarce 3D training data in a world teeming with 2D images and text descriptions. The VFusion3D approach leverages what the developers call “a novel method for building scalable 3D generative models utilizing pre-trained video diffusion models.” Continue reading Meta, Oxford Advance 3D Object Generation with VFusion3D
By Paula Parisi, August 7, 2024
Google has unveiled three additions to its Gemma 2 family of compact yet powerful open-source AI models, emphasizing safety and transparency. The company’s Gemma 2 2B is a 2.6 billion parameter update to the lightweight 2B parameter Gemma 2, with built-in improvements in safety and performance. Built on Gemma 2, ShieldGemma is a suite of safety content classifier models that “filter the input and outputs of AI models and keep the user safe.” Interpretability tool Gemma Scope offers what Google calls “unparalleled insight into our models’ inner workings.” Continue reading Latest Gemma 2 Models Emphasize Security and Performance
By Paula Parisi, August 6, 2024
A new generative AI startup called Black Forest Labs has hit the scene, debuting with a suite of text-to-image models branded FLUX.1. Based in Germany, Black Forest was founded by some of the researchers involved in developing Stable Diffusion and has raised $31 million in funding from principal investor Andreessen Horowitz and angels including CAA founder and former talent agent Michael Ovitz. The FLUX.1 suite focuses on “image detail, prompt adherence, style diversity and scene complexity,” the company says of its three initial variants: FLUX.1 [pro], FLUX.1 [dev] and FLUX.1 [schnell]. Continue reading Black Forest Labs Announces Suite of Text-to-Image Models
By Rob Scott, August 1, 2024
Graphic design company Canva announced it is acquiring fellow Australian startup Leonardo AI with plans to have Leonardo’s 120 employees, including executives, join the Canva AI team. Financial terms of the deal were not disclosed. Sydney-based Leonardo has been gaining attention for its advanced generative AI platform that helps users create images and art based on the open-source Stable Diffusion model developed by Stability AI. The Leonardo team claims its offering differs from other AI art platforms in that it provides users with more control. Users can experiment with text prompts and quick sketches as Leonardo.ai creates photorealistic images in real time. Continue reading Canva Aims to Boost Its GenAI Efforts with Leonardo Purchase
By Paula Parisi, July 31, 2024
Meta Platforms CEO Mark Zuckerberg unveiled SAM 2, the latest version of the company’s Segment Anything Model computer vision platform, which automates for video what the original SAM did for still images: identifying the edges of an object and isolating it in the frame. Zuckerberg demonstrated SAM 2 as part of a SIGGRAPH 2024 keynote session in which he was interviewed by Nvidia CEO Jensen Huang. “Being able to do this in video and have it be zero shot and tell it what you want, it’s pretty cool,” Zuckerberg said. Meta is sharing the code and model weights for SAM 2 with a permissive Apache 2.0 license. Continue reading Mark Zuckerberg Unveils SAM 2 AI Tech for Segmenting Video
By Paula Parisi, July 25, 2024
In April, Meta Platforms revealed that it was working on an open-source AI model that performed as well as proprietary models from top AI companies such as OpenAI and Anthropic. Now, Meta CEO Mark Zuckerberg says that model has arrived in the form of Llama 3.1 405B, “the first frontier-level open-source AI model.” The company is also releasing “new and improved” Llama 3.1 70B and 8B models. In addition to general cost and performance benefits, the fact that the Llama 3.1 405B model is open source “will make it the best choice for fine-tuning and distilling smaller models,” according to Meta. Continue reading Meta Calls New Llama the First Open-Source Frontier Model
By Paula Parisi, July 3, 2024
Apple has released a public demo of the 4M AI model it developed in collaboration with the Swiss Federal Institute of Technology Lausanne (EPFL). The technology debuts seven months after the model was first open-sourced, allowing informed observers the opportunity to interact with it and assess its capabilities. Apple says 4M was built by applying masked modeling to a single unified Transformer encoder-decoder “across a wide range of input/output modalities — including text, images, geometric and semantic modalities, as well as neural network feature maps.” Continue reading Apple Launches Public Demo of Its Multimodal 4M AI Model
By Paula Parisi, June 7, 2024
Stability AI has added another audio product to its lineup, releasing the open-source text-to-audio generator Stable Audio Open 1.0 for sound design. The new model can generate up to 47 seconds of samples and sound effects, including drum beats, instrument riffs, ambient sounds, foley and production elements. It also allows for adapting variations and changing the style of audio samples. Stability AI — best known for the image generator Stable Diffusion — in September released Stable Audio, a commercial product that can generate sophisticated music tracks of up to three minutes. Continue reading Stability AI Releases Free Sound FX Tool, Stable Audio Open
By Paula Parisi, May 29, 2024
Elon Musk’s xAI has secured $6 billion in Series B funding. While the company says the funds will be “used to take xAI’s first products to market, build advanced infrastructure, and accelerate the research and development,” some outlets are reporting a significant portion is earmarked to build an AI supercomputer to power the next generation of its foundation model Grok. The company publicly released the open-source Grok-1 as a chatbot on X social in November, and recently debuted Grok-1.5 and 1.5V iterations with long-context capability and image understanding. Continue reading Musk Said to Envision Supercomputer as xAI Raises $6 Billion
By Paula Parisi, May 28, 2024
Meta Platforms has unveiled its first natively multimodal model, Chameleon, which observers say can make it competitive with frontier model firms. Although Chameleon is not yet released, Meta says internal research indicates it outperforms the company’s own Llama 2 in text-only tasks and “matches or exceeds the performance of much larger models” including Google’s Gemini Pro and OpenAI’s GPT-4V in a mixed-modal generation evaluation “where either the prompt or outputs contain mixed sequences of both images and text.” In addition, Meta calls Chameleon’s image generation “non-trivial,” noting that’s “all in a single model.” Continue reading Meta Advances Multimodal Model Architecture with Chameleon