By
Paula ParisiJanuary 30, 2025
Less than a week after sending tremors through Silicon Valley and across the media landscape with an affordable large language model called DeepSeek-R1, the Chinese AI startup behind that technology has debuted another new product — the multimodal Janus-Pro-7B with an aptitude for image generation. Further mining the vein of efficiency that made R1 impressive to many, Janus-Pro-7B utilizes “a single, unified transformer architecture for processing.” Emphasizing “simplicity, high flexibility and effectiveness,” DeepSeek says Janus Pro is positioned to be a frontrunner among next-generation unified multimodal models. Continue reading DeepSeek Follows Its R1 LLM Debut with Multimodal Janus-Pro
By
Paula ParisiDecember 13, 2024
With AI powering a range of new world-building apps, 2025 could be the year the metaverse finally makes an impact. Midjourney joins the world-building club with Patchwork, a collaborate canvas for creating “infinite” fictional worlds. Now in research preview, the tool is being developed as a standalone app, though preview access requires a Midjourney Discord account linked to a Google account. Users are able to connect characters and worlds, and “share” their developing world — evolving as a “board” — with up to 100 collaborative partners on Midjourney (though the company recommends fewer participants for a more focused experience). Continue reading Midjourney Touts Collaborative World-Building App Patchwork
By
Paula ParisiDecember 12, 2024
World Labs, the AI startup co-founded by Stanford AI pioneer Fei-Fei Li, has debuted a “spatial intelligence” system that can generate 3D worlds from a single image. Although the output is not photorealistic, the tech could be a breakthrough for animation companies and video game developers. Deploying what it calls Large World Models (LWMs), World Labs is focused on transforming 2D images into turnkey 3D environments with which users can interact. Observers say that reciprocity is what sets World Labs’ technology apart from offerings by other AI companies that transform 2D to 3D. Continue reading World Labs AI Lets Users Create 3D Worlds from Single Photo
By
Paula ParisiOctober 28, 2024
Midjourney is turning heads with its new image editor, which lets users upload images and then make adjustments. The company’s models — most recently Midjourney 6.1 — accept uploaded images as a reference to use for generative results. Now the Midjourney image editor allows precise adjustments to aspects of the frame. An “image retexturing mode” is also being introduced, as is v2 of its “AI moderator.” The new features are only available to users with yearly memberships, monthly memberships for the past 12 months, or those who have generated at least 10,000 Midjourney images. Continue reading Midjourney Makes Powerful AI Image Editor Available in Alpha
By
Paula ParisiSeptember 4, 2024
An AI-powered coding app called Cursor is building a fanbase, with everyone from hobbyists to engineers subscribing to the service. The platform reportedly has 30,000 paying customers, among them employees at OpenAI, Midjourney and Perplexity. Referred to as “the ChatGPT of coding,” Cursor uses popular models including GPT-4o and Claude 3.5 Sonnet to automate building apps and other coding tasks. Cursor was launched by two-year-old startup Anysphere, which has raised more than $60 million in Series A funding led by Andreessen Horowitz and Thrive Capital. Continue reading New AI Coding App Cursor Gains Following and $60M in Funds
By
Paula ParisiAugust 22, 2024
Google DeepMind has made its latest AI image generator, Imagen 3, free for use in the U.S. via the company’s ImageFX platform. Imagen 3 will be available in multiple versions, “each optimized for different types of tasks, from generating quick sketches to high-resolution images.” Google announced Imagen 3 at Google I/O in March, and in June made it available to enterprise users through Vertex. Using simplified natural language text input rather than “complex prompt engineering,” Google says Imagen 3 generates high-quality images in a range styles, from photorealistic, painterly and textured to whimsically cartoony. Continue reading Google DeepMind Releases Imagen 3 for Free to U.S. Users
By
Paula ParisiAugust 6, 2024
A new generative AI startup called Black Forest Labs has hit the scene, debuting with a suite of text-to-image models branded FLUX.1. Based in Germany, Black Forest was founded by some of the researchers involved in developing Stable Diffusion and has raised $31 million in funding from principal investor Andreessen Horowitz and angels including CAA founder and former talent agent Michael Ovitz. The FLUX.1 suite focuses on “image detail, prompt adherence, style diversity and scene complexity,” the company says of its three initial variants: FLUX.1 [pro], FLUX.1 [dev] and FLUX.1 [schnell]. Continue reading Black Forest Labs Announces Suite of Text-to-Image Models
By
Paula ParisiAugust 5, 2024
AI media firm Runway has launched Gen-3 Alpha, building on the text-to-video model by using images to prompt realistic videos generated in seconds. Navigate to Runway’s web-based interface and click on “try Gen 3-Alpha” and you’ll land on a screen with an image uploader, as well as a text box for those who either prefer that approach or want to use natural language to tweak results. Runway lets users generate up to 10 seconds of contiguous video using a credit system. “Image to Video is major update that greatly improves the artistic control,” Runway said in an announcement. Continue reading Runway’s Gen-3 Alpha Creates Realistic Video from Still Image
By
Rob ScottAugust 1, 2024
Graphic design company Canva announced it is acquiring fellow Australian startup Leonardo AI with plans to have Leonardo’s 120 employees, including executives, join the Canva AI team. Financial terms of the deal were not disclosed. Sydney-based Leonardo has been gaining attention for its advanced generative AI platform that helps users create images and art based on the open-source Stable Diffusion model developed by Stability AI. The Leonardo team claims its offering is different than other AI art platforms since it provides users with more control. Users can experiment with text prompts and quick sketches as Leonardo.ai creates photorealistic images in real time. Continue reading Canva Aims to Boost Its GenAI Efforts with Leonardo Purchase
By
Paula ParisiJuly 29, 2024
Stability AI has unveiled an experimental new model, Stable Video 4D, which generates photorealistic 3D video. Building on what it created with Stable Video Diffusion, released in November, this latest model can take moving image data of an object and iterate it from multiple angles — generating up to eight different perspectives. Stable Video 4D can generate five frames across eight views in about 40 seconds using a single inference, according to the company, which says the model has “future applications in game development, video editing, and virtual reality.” Users begin by uploading a single video and specifying desired 3D camera poses. Continue reading Stable Video 4D Adds Time Dimension to Generative Imagery
By
Paula ParisiJuly 22, 2024
Microsoft has officially moved its AI-powered Designer app out of preview, making the Canva competitor available to iOS and Android users. The app uses text prompts to generate images and designs for items such as logos, greeting cards, stickers and invitations. Powered by OpenAI’s DALL-E 3 image model, Designer is available as an app in Windows and as a free mobile app. New capabilities include the ability to edit existing designs and the addition of “prompt templates” to help users who are starting the design process with a blank canvas. “Just describe what you want to see, and Designer can create it for you,” explains Microsoft. Continue reading Microsoft Designer Adds AI Editing, Launches Mobile Release
By
Paula ParisiJuly 11, 2024
DreamFlare has emerged from stealth to launch what is being billed as the first streaming platform for GenAI video. In addition to the consumer-facing subscription platform, the business model includes a sort of AI studio where creators can tap the expertise of professional storytellers to produce AI video using third-party tools like Runway, Sora and Midjourney. The company will feature two types of content: Flips, which are animated narratives with audio that viewers can also examine frame-by-frame, as with comic books, and Spins, described as “short movies” featuring branched narratives that provide interactive plot choices. Continue reading DreamFlare Launches AI Video Studio and Streaming Service
By
ETCentric StaffApril 12, 2024
During Google Cloud Next 2024 in Las Vegas, Google announced an updated version of its text-to-image generator Imagen 2 on Vertex AI that has the ability to generate video clips of up to four seconds. Google calls this feature “text-to-live images,” and it essentially delivers animated GIFs at 24 fps and 360×640 pixel resolution, though Google says there will be “continuous enhancements.” Imagen 2 can also generate text, emblems and logos in different languages, and has the ability to overlay those elements on existing images like business cards, apparel and products. Continue reading Google Imagen 2 Now Generates 4-Second Clips on Vertex AI
By
ETCentric StaffMarch 15, 2024
Artificial intelligence imaging service Midjourney has been embraced by storytellers who have also been clamoring for a feature that enables characters to regenerate consistently across new requests. Now Midjourney is delivering that functionality with the addition of the new “–cref” tag (short for Character Reference), available for those who are using Midjourney v6 on the Discord server. Users can achieve the effect by adding the tag to the end of text prompts, followed by a URL that contains the master image subsequent generations should match. Midjourney will then attempt to repeat the particulars of a character’s face, body and clothing characteristics. Continue reading Midjourney Creates a Feature to Advance Image Consistency
By
ETCentric StaffMarch 11, 2024
Artificial intelligence stakeholders are calling for safe harbor legal and technical protections that will allow them access to conduct “good-faith” evaluations of various AI products and services without fear of reprisal. More than 300 researchers, academics, creatives, journalists and legal professionals had as of last week signed an open letter calling on companies including Meta Platforms, OpenAI and Google to allow access for safety testing and red teaming of systems they say are shrouded in opaque rules and secrecy despite the fact that millions of consumers are already using them. Continue reading Researchers Call for Safe Harbor for the Evaluation of AI Tools