By Paul Bennun, December 4, 2023
Stability AI, developer of Stable Diffusion (one of the leading visual content generators, alongside Midjourney and DALL-E), has introduced SDXL Turbo, a new AI model that demonstrates more of the latent possibilities of the common diffusion generation approach: images that update in real time as the user’s prompt updates. This feature was always a possibility even with previous diffusion models, given that text and images are comprehended differently across linear time, but the increased efficiency of generation algorithms and the steady accretion of GPUs and TPUs in a developer’s data center make the experience more magical. Continue reading Stability AI Intros Real-Time Text-to-Image Generation Model
By Paula Parisi, December 1, 2023
Amazon is debuting its Titan Image Generator in preview for AWS Bedrock customers. The new Titan generative AI model can create new images from a text prompt or existing image, and automatically adds watermarking to protect intellectual property. The move into generative imaging puts Amazon in competition with a growing field that includes large firms like Adobe and Google. Unlike those companies and others, the e-retail giant is at present focusing exclusively on enterprise customers. Amazon Bedrock is a managed service giving developers access to a range of foundation models from companies including Meta Platforms, Anthropic, and Amazon itself. Continue reading Amazon Previews Titan Image Generator for Bedrock Clients
By Paula Parisi, November 20, 2023
Having made the leap from image generation to video generation over the course of a few months in 2022, Meta Platforms has introduced Emu, its first visual foundational model, along with Emu Video and Emu Edit, positioned as milestones in the trek to AI moviemaking. Emu uses just two diffusion models to generate four-second-long 512×512 videos at 16 frames per second, Meta said, comparing that to 2022’s Make-A-Video, which required a “cascade” of five models. Internal research found Emu Video generations were “strongly preferred” over those of the Make-A-Video model based on quality (96 percent) and prompt fidelity (85 percent). Continue reading Meta Touts Its Emu Foundational Model for Video and Editing
By Paula Parisi, November 15, 2023
Threads, the Twitter competitor launched in July by Meta Platforms to record-breaking numbers, has added features that make it easier for users to separate their Threads feeds from Instagram and Facebook. Users can now delete their Threads accounts separately from Instagram; the earlier inability to do so had confounded users. Because those signing up for Threads were required to do so from either an existing or a new Instagram account, the two were entwined. Instagram/Threads CEO Adam Mosseri also announced that propagation of Threads posts to Instagram and Facebook can now be turned off, to keep discussions separate. Continue reading Threads Lets Users Delete Accounts Separate from Instagram
By Paula Parisi, November 1, 2023
Creative image platform Shutterstock has added AI-powered editing features that provide “the potential for infinite options to refine and perfect images” in the company’s library of more than 700 million stock selections. A go-to source for brand marketers and digital media companies, Shutterstock is offering six signature AI capabilities as well as secondary features such as a virtual AI design assistant and advanced filters under the umbrella Creative AI. What’s more, Shutterstock says it will compensate its licensed artists when their images are edited with AI. Continue reading Shutterstock Offers AI Image Editor for Massive Stock Library
By Paula Parisi, October 31, 2023
The United Nations has formed an advisory board on artificial intelligence composed of 39 members from government, academia and industry who will “undertake analysis and advance recommendations for the international governance of AI.” The move comes as U.S. legislators and tech industry players are also prioritizing model governance. “Globally coordinated AI governance is the only way to harness AI for humanity while addressing its risks and uncertainties,” the UN announced in unveiling the initiative, co-chaired by Carme Artigas, Spain’s secretary of state for digitalization and AI, and James Manyika, SVP of research, technology and society at Google. Continue reading Google, Microsoft, Sony Tapped for UN AI Governance Board
By Paula Parisi, October 30, 2023
Google is rolling out three new tools to verify images and search results. “About this image,” Fact Check Explorer and Search Generative Experience (SGE) all add context to Google Search results. “About this image” is rolling out globally to English-language users as part of the Google Search UI. Available in beta since summer, Fact Check Explorer will let journalists and professional fact checkers delve into an image or topic more deeply via API. Search Generative Experience lets GenAI investigate and share results about websites by populating source descriptions for some targets that will appear in “more about this page.” Continue reading Google Taps AI for Tools to Help Authenticate Search Results
By Paula Parisi, October 27, 2023
The University of Science and Technology of China (USTC) and Tencent YouTu Lab have released a research paper on a new framework called Woodpecker, designed to correct hallucinations in multimodal large language models (MLLMs). “Hallucination is a big shadow hanging over the rapidly evolving MLLMs,” writes the group, describing the phenomenon as when MLLMs “output descriptions that are inconsistent with the input image.” Solutions to date focus mainly on “instruction-tuning,” a form of retraining that is data- and computation-intensive. Woodpecker takes a training-free approach that purports to correct hallucinations from the basis of the generated text. Continue reading Woodpecker: Chinese Researchers Combat AI Hallucinations
By Paula Parisi, October 26, 2023
A new tool called Nightshade offers creators a way to fend off artificial intelligence models attempting to train on visual artwork without permission. Created by a University of Chicago team led by Professor Ben Zhao, Nightshade makes it possible to include an instruction set that can cause AI models to “break” during unauthorized scraping. It does this by inserting “invisible pixels.” As a result, popular AI models including DALL-E, Midjourney and Stable Diffusion will subsequently render erratic results, turning dogs into cats and cars into cows, and so forth. Continue reading Nightshade Data Poisoning Tool Targets AI to Protect Artist IP
By Paula Parisi, October 25, 2023
OpenAI is developing an AI tool that can identify images created by artificial intelligence, specifically those made in whole or in part by its DALL-E 3 image generator. Calling it a “provenance classifier,” company CTO Mira Murati began publicly discussing the detection app last week but said not to expect it in general release anytime soon. This, despite Murati’s claim it is “almost 99 percent reliable.” That is still not good enough for OpenAI, which knows there is much at stake when the public perception of artists’ work can be impacted by a filter applied by AI, which is notoriously capricious. Continue reading OpenAI Developing ‘Provenance Classifier’ for GenAI Images
By Paula Parisi, October 23, 2023
New York-based facial recognition software company Clearview AI has had a $9.1 million fine and order to delete UK citizen data reversed by Britain’s General Regulatory Tribunal. The case against Clearview was brought by the UK Information Commissioner’s Office, which scored a victory round in May 2022, claiming Clearview violated privacy laws under the General Data Protection Regulation because it did not inform or gain consent of UK citizens before collecting their data. Clearview appealed, and the tribunal found that the selfie-scraping AI firm was not subject to the ICO’s jurisdiction due to a loophole for firms servicing foreign law enforcement. Continue reading Facial Recognition Firm Clearview AI Wins Appeal of UK Fine
By Paula Parisi, October 11, 2023
OpenAI began previewing vision capabilities for GPT-4 in March, and the company is now starting to roll out the image input and output to users of its popular ChatGPT. The multimodal expansion also includes audio functionality, with OpenAI proclaiming late last month that “ChatGPT can now see, hear and speak.” The upgrade vaults GPT-4 into the multimodal category with what OpenAI is apparently calling GPT-4V (for “Vision,” though equally applicable to “Voice”). “We’re rolling out voice and images in ChatGPT to Plus and Enterprise users,” OpenAI announced. Continue reading ChatGPT Goes Multimodal: OpenAI Adds Vision, Voice Ability
By Paula Parisi, October 6, 2023
Web-based design app Canva has raised the curtain on its AI-powered Magic Studio as part of the company’s 10-year anniversary outreach. Canva is positioning Magic Studio as a collection of diverse AI tools that provides a “comprehensive AI-design platform” for business and home users who want to automate labor-intensive tasks like creating and editing images and outputting to different formats using generative artificial intelligence. Created for “the 99 percent of the world without complex design skills,” Canva’s Magic Studio offers many of the features now being built into smartphones and software suites, but easier and “all in one place.” Continue reading Magic Studio from Canva Offers AI Design for All Skill Levels
By Paula Parisi, October 4, 2023
Adobe has officially added Photoshop on the web as one of its Photoshop plans. The web version is geared to Photoshop newbies and comes complete with Adobe Firefly generative AI features including Generative Fill and Generative Expand. Adobe called it “a major milestone” since introducing Photoshop on the web in beta two years ago, starting with “an early preview of image editing capabilities.” Features now available for commercial use on the web include the ability to easily add or remove elements from any image, change a background, expand the frame, and create visuals using text-based prompts. Continue reading Adobe Launches Web Version of Photoshop with AI Features
By Paula Parisi, October 2, 2023
Nvidia’s Picasso continues to gain market share among visual companies looking for an AI foundry to train models for generative use. Getty Images has partnered with Nvidia to create custom foundation models for still images and video. Generative AI by Getty Images lets customers create visuals using Getty’s library of licensed photos. The tool is trained on Getty’s own creative library and has the company’s guarantee of “full indemnification for commercial use.” Getty joins Shutterstock and Adobe among enterprise clients using Picasso. Runway and Cuebric are using it, too — and Picasso is still in development. Continue reading Getty GenAI Tool for Images and Video Is Powered by Nvidia