By ETCentric Staff, March 15, 2024
Artificial intelligence imaging service Midjourney has been embraced by storytellers, who have long clamored for a feature that keeps characters consistent across new requests. Now Midjourney is delivering that functionality with the addition of the new “--cref” tag (short for Character Reference), available to those using Midjourney v6 on the Discord server. Users achieve the effect by appending the tag to the end of a text prompt, followed by the URL of the master image that subsequent generations should match. Midjourney will then attempt to reproduce the particulars of the character’s face, body and clothing. Continue reading Midjourney Creates a Feature to Advance Image Consistency
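The syntax described above can be sketched as a Discord prompt. The URL is a placeholder for an image the user has already generated or uploaded, and the optional “--cw” (character weight) parameter, ranging 0–100, controls how strictly the reference is followed:

```
/imagine prompt: the same heroine exploring a night market, cinematic lighting --v 6 --cref https://example.com/heroine.png --cw 90
```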
By ETCentric Staff, March 14, 2024
Having fended off challenges in the short-form video sphere since its late 2016 launch, TikTok now appears to be playing offense, laying the groundwork for a photo-sharing app that has drawn comparisons to Instagram and Pinterest. Avid TikTok users are probably familiar with a feature that lets them post still images as moving images that can be examined frame-by-frame. TikTok now seems to want to improve on that approach by building a separate TikTok Photos app for Android and iOS, to which users of the primary platform can export and showcase their still images. Continue reading TikTok Updates Its Code to Sync to Separate ‘TikTok Photos’
By ETCentric Staff, March 11, 2024
Alibaba is touting a new artificial intelligence system that can animate portraits, making people sing and talk in realistic fashion. Researchers at the Alibaba Group’s Institute for Intelligent Computing developed the generative video framework, calling it EMO, short for Emote Portrait Alive. Input a single reference image along with “vocal audio,” as in talking or singing, and “our method can generate vocal avatar videos with expressive facial expressions and various head poses,” the researchers say, adding that EMO can generate videos of any duration, “depending on the length of video input.” Continue reading Alibaba’s EMO Can Generate Performance Video from Images
By ETCentric Staff, March 8, 2024
London-based AI video startup Haiper has emerged from stealth mode with $13.8 million in seed funding and a platform that generates up to two seconds of HD video from text prompts or images. Founded by alumni from Google DeepMind, TikTok and various academic research labs, Haiper is built around a bespoke foundation model that aims to serve the needs of the creative community while the company pursues a path to artificial general intelligence (AGI). Haiper is offering a free trial of what is currently a web-based user interface similar to offerings from Runway and Pika. Continue reading AI Video Startup Haiper Announces Funding and Plans for AGI
By ETCentric Staff, February 16, 2024
Apple has taken a novel approach to animation with Keyframer, using large language models to add motion to static images through natural language prompts. “The application of LLMs to animation is underexplored,” Apple researchers say in a paper that describes Keyframer as an “animation prototyping tool.” Based on input from animators and engineers, Keyframer lets users refine their work through “a combination of prompting and direct editing,” the paper explains. The LLM can generate CSS animation code. Users can also use natural language to request design variations. Continue reading Apple’s Keyframer AI Tool Uses LLMs to Prototype Animation
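As an illustration of the kind of output the paper describes, an LLM asked to “make the sun rise” over a static scene might emit CSS animation code along these lines. The “#sun” selector and timing values here are hypothetical, not actual Keyframer output:

```css
/* Hypothetical example of LLM-generated CSS animation,
   not actual Keyframer output */
#sun {
  animation: rise 3s ease-in-out forwards;
}
@keyframes rise {
  from { transform: translateY(120px); opacity: 0.3; }
  to   { transform: translateY(0); opacity: 1; }
}
```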
By ETCentric Staff, February 16, 2024
Stability AI, purveyor of the popular Stable Diffusion image generator, has introduced a completely new model called Stable Cascade. Now in preview, Stable Cascade uses a different architecture from Stable Diffusion’s SDXL, one the UK company’s researchers say is more efficient. Cascade builds on a compression architecture called Würstchen (German for “sausage”) that Stability began sharing in research papers early last year. Würstchen is a three-stage process that includes two-step encoding. It uses fewer parameters, which means less data to train on, greater speed and reduced costs. Continue reading Stability AI Advances Image Generation with Stable Cascade
By ETCentric Staff, February 9, 2024
Apple has released MGIE, an open-source AI model that edits images using natural language instructions. MGIE, short for MLLM-Guided Image Editing, can also modify and optimize images. Developed in conjunction with the University of California, Santa Barbara, MGIE is Apple’s first AI model. The multimodal MGIE, which understands both text and image input, crops, resizes, flips and adds filters based on text instructions, using what Apple says is an easier instruction set than other AI editing programs, one that is simpler and faster than learning a traditional application such as Apple’s own Final Cut Pro. Continue reading Apple Launches Open-Source Language-Based Image Editor
By Paula Parisi, February 7, 2024
Yelp is introducing more than 20 new updates to improve the experience for community members and business owners. Included are AI-powered summaries that make it easier to find businesses, an updated Yelp Elite badge for reviewers who are passionate about specific subjects, and a new visual home feed and search experience geared toward discovery. For those seeking services, the new “Request a Quote” and “Projects” features are available. Artificial intelligence will also power market and competitive insights for business owners, while AI-powered smart budgets provide recommendations to optimize ad spend, “helping local businesses grow.” Continue reading Yelp Adds 20 Features Plus AI to Help Users and Businesses
By Paula Parisi, January 31, 2024
Nightshade, a tool designed to combat AI copyright infringement, generated 250,000 downloads shortly after its January release, exceeding the expectations of its creators in the computer science department at the University of Chicago. Nightshade lets artists prevent AI models from usefully scraping and training on their work without consent. The Bureau of Labor Statistics counts more than 2.67 million working artists in the U.S., but social media feedback indicates the downloads have come from around the world. One of the coders says cloud mirror links had to be added to avoid overwhelming the University of Chicago’s web servers. Continue reading AI Poison Pill App Nightshade Has 250K Downloads in 5 Days
By Phil Lelyveld, January 10, 2024
Dr. Fei-Fei Li, Stanford professor and co-director of Stanford HAI (Human-Centered AI), and Andrew Ng, venture capitalist and managing general partner at Palo Alto-based AI Fund, discussed the current state and expected near-term developments in artificial intelligence. As a general-purpose technology, AI development will both deepen, as private-sector LLMs are developed for industry-specific needs, and broaden, as open-source public-sector LLMs emerge to address broad societal problems. Expect exciting advances in image models — what Li calls “pixel space.” When implementing AI, think about teams rather than individuals, and about tasks rather than jobs. Continue reading CES: Session Details the Impact and Future of AI Technology
By Paula Parisi, December 21, 2023
As the pressure ratchets up for AI companies to go beyond the wow factor and make money, Stability AI has formalized three subscription tiers as it seeks to expand commercial use of its open-source, multimodal core models. The Stability AI Membership offerings include a free tier for personal and research (i.e., non-commercial) use, a professional tier that costs $20 a month, and a custom-priced enterprise tier for large outfits. The company says that with the three tiers it is “striking a balance between fostering competitiveness and maintaining openness in AI technologies.” Continue reading Stability AI Is Offering Paid Membership for Commercial Users
By Paula Parisi, December 18, 2023
Snapchat+ is rolling out new artificial intelligence features that let subscribers use text prompts to create generative AI images to share with friends. In addition, the Dreams feature, which creates generative AI selfies, is now able to add your friends to those photos. Snapchat+ subscribers get one pack of 8 Dreams per month as part of their $3.99 monthly fee. An onscreen button labeled “AI” lets subscribers access the AI image generator to choose from a menu of prompts (including “sunny day at the beach” and “planet made of cheese”) or they can enter their own descriptions. Continue reading GenAI Lets Snapchat+ Subscribers Create and Share Images
By Paul Bennun, December 4, 2023
Stability AI, developer of Stable Diffusion (one of the leading visual content generators, alongside Midjourney and DALL-E), has introduced SDXL Turbo, a new AI model that demonstrates more of the latent possibilities of the common diffusion generation approach: images that update in real time as the user’s prompt updates. This was always possible in principle with previous diffusion models, since a generation can simply be re-run each time the text changes, but more efficient generation algorithms and the steady accumulation of GPUs and TPUs in developers’ data centers now make the experience feel magical. Continue reading Stability AI Intros Real-Time Text-to-Image Generation Model
By Paula Parisi, December 1, 2023
Amazon is debuting its Titan Image Generator in preview for Amazon Bedrock customers. The new Titan generative AI model can create new images from a text prompt or existing image, and automatically adds watermarking to protect intellectual property. The move into generative imaging puts Amazon in competition with a growing field that includes large firms like Adobe and Google. Unlike those companies and others, the e-retail giant is at present focusing exclusively on enterprise customers. Amazon Bedrock is a managed service giving developers access to a range of foundation models from companies including Meta Platforms, Anthropic and Amazon itself. Continue reading Amazon Previews Titan Image Generator for Bedrock Clients
By Paula Parisi, November 20, 2023
Having made the leap from image generation to video generation over the course of a few months in 2022, Meta Platforms has introduced Emu, its first visual foundational model, along with Emu Video and Emu Edit, positioned as milestones on the trek to AI moviemaking. Emu Video uses just two diffusion models to generate four-second, 512×512 videos at 16 frames per second, Meta said, compared with 2022’s Make-A-Video, which requires a “cascade” of five models. Internal research found Emu Video generations were “strongly preferred” over the Make-A-Video model on quality (96 percent) and prompt fidelity (85 percent). Continue reading Meta Touts Its Emu Foundational Model for Video and Editing