By Paula Parisi, October 14, 2022
Microsoft announced it is integrating OpenAI’s DALL-E 2 into its new Microsoft Designer app, as well as its Microsoft Edge browser and the Image Creator tool in its Bing search engine. Microsoft provides cloud computing services to OpenAI and has partnered with OpenAI in AI commercialization efforts including the Azure OpenAI Service, now in preview, and GitHub Copilot. The Designer web app can be used to create designs for posters, presentations, invitations and other graphics that can be printed and used for display or shared on social or business media. Continue reading Microsoft Integrates DALL-E 2 into Designer and Creator Apps
By Paula Parisi, October 10, 2022
AI image generators like OpenAI’s DALL-E 2 and Google’s Imagen have drawn a lot of attention recently. Now AI text-to-video generators are edging into the spotlight, with Google debuting Imagen Video on the heels of Meta AI’s Make-A-Video rollout last month. Imagen Video has been used to generate clips of 128 frames (about 5.3 seconds) at 24 fps and 1280×768 pixels. Imagen Video was trained “on a combination of an internal dataset consisting of 14 million video-text pairs and 60 million image-text pairs,” resulting in some unusual functionality, according to Google Research. Continue reading Google and Meta Are Developing AI Text-to-Video Generators
By Paula Parisi, September 26, 2022
OpenAI has released a new open-source AI speech recognition model called Whisper that can transcribe and translate audio at levels it says approach human accuracy and robustness. Use cases include transcription of speeches, interviews, podcasts and conversations. “Moreover, it enables transcription in multiple languages, as well as translation from those languages into English,” says OpenAI, which is open-sourcing models and inference code on GitHub “to serve as a foundation for building useful applications and for further research on robust speech processing.” Continue reading OpenAI Rolls Out Open-Source Speech Recognition System
By Paula Parisi, September 21, 2022
OpenAI has begun allowing users of its DALL-E 2 image-generating system to work with facial image uploads. The program previously allowed only computer-generated faces in an effort to prevent deepfakes and misuse, but OpenAI says improvements to its safety system succeeded in “minimizing the potential of harm” from things like explicit, political or violent content. OpenAI will continue to prohibit use of unauthorized photos and will seek to protect right of publicity, though it remains to be seen how effective that will be. In the past, customers have complained the company was overzealous in its policing. Continue reading OpenAI Expands DALL-E 2 Functionality with Facial Uploads
By Paula Parisi, August 26, 2022
Virtual character developer platform Inworld AI has raised $50 million in a Series A funding round led by Section 32 and Intel Capital. The Mountain View-based startup — one of six companies chosen to participate in the 2022 Disney Accelerator — will create virtual characters for games, the metaverse and other entertainment and marketing applications. Because it is focused on providing an interior life, or “mind,” Inworld AI is platform agnostic, with APIs that work across Unity, Unreal Engine, Omniverse and others. Another convenient feature: it lets developers build characters by describing them in natural language. Continue reading Inworld Raises $50M to Create AI-Powered Virtual Characters
By Paula Parisi, August 18, 2022
Stability AI is in the first stage of release of Stable Diffusion, a text-to-image generator similar in functionality to OpenAI’s DALL-E 2, with one important distinction: this open-source newcomer lacks the filters that prevent the earlier system from creating images of public figures or content deemed excessively toxic. Last week the Stable Diffusion code was made available to just over a thousand researchers and the Los Altos-based startup anticipates a public release in the coming weeks. The unfettered unleashing of a powerful imaging system has stirred controversy in the AI community, raising ethical questions. Continue reading Stability AI Releases Stable Diffusion Text-to-Image Generator
By Paula Parisi, August 12, 2022
OpenAI’s powerful text-to-image generator DALL-E 2 is still in beta, but businesses are already testing it for commercial use. Apparel firm Stitch Fix has been using it to visualize fabric and color personalization, while Heinz tapped the AI system for a marketing campaign. Cosmopolitan used it to design a magazine cover. Others have leveraged the image engine to generate logos and thumbnails. These early adopters are identifying technical issues that OpenAI says it is addressing as it readies DALL-E 2 for enterprise. Foremost among the complaints is the lack of a dedicated API for public use. Continue reading Businesses Experiment with DALL-E 2, Report Mixed Results
By Paula Parisi, August 2, 2022
Nvidia has issued a software update for NeMo Megatron, its framework for training giant language models, increasing efficiency and speed. Barely a year since Nvidia unveiled Megatron, this latest improvement further leverages the transformer architecture that has become synonymous with deep learning since Google introduced the concept in 2017. New features result in what Nvidia says is a 5x reduction in memory requirements and up to a 30 percent gain in speed for models as large as 1 trillion parameters, making NeMo Megatron better at handling transformer tasks across the entire stack. Continue reading Nvidia Turbo Charges NeMo Megatron Large Training Model
By Paula Parisi, July 26, 2022
OpenAI is expanding its beta outreach for DALL-E 2 by inviting an additional one million waitlisted people to join the AI imaging platform over the coming weeks. DALL-E users will receive 50 credits during their first month of use and 15 credits every subsequent month, with each credit redeemable for an original DALL-E-prompted generation (returning four images) or an edit or variation prompt (which returns three images). Additional credits may be purchased in 115-credit increments for $15. Starting this month, users get rights to commercialize their DALL-E images. However, the move highlights the legal implications of AI and possible copyright infringement. Continue reading Legal Questions Loom as OpenAI Widens Access to DALL-E
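To put the credit figures above in per-image terms, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in the excerpt. The constant and function names are illustrative assumptions, not part of any OpenAI tool or API.

```python
# Figures quoted in the excerpt above; all names here are illustrative only.
FREE_CREDITS_FIRST_MONTH = 50
FREE_CREDITS_LATER_MONTHS = 15
IMAGES_PER_GENERATION = 4    # an original prompt returns four images
IMAGES_PER_EDIT = 3          # an edit or variation prompt returns three images
PAID_CREDITS = 115           # size of one purchasable credit block
PAID_PRICE_USD = 15.00       # price of that block


def cost_per_credit() -> float:
    """Price of a single purchased credit in US dollars."""
    return PAID_PRICE_USD / PAID_CREDITS


def cost_per_image(images_per_credit: int) -> float:
    """Price of one image when a credit returns `images_per_credit` images."""
    return cost_per_credit() / images_per_credit


def free_images(months: int, images_per_credit: int = IMAGES_PER_GENERATION) -> int:
    """Images obtainable from free credits alone over `months` months."""
    if months <= 0:
        return 0
    credits = FREE_CREDITS_FIRST_MONTH + FREE_CREDITS_LATER_MONTHS * (months - 1)
    return credits * images_per_credit


if __name__ == "__main__":
    print(f"Per credit: ${cost_per_credit():.3f}")
    print(f"Per generated image: ${cost_per_image(IMAGES_PER_GENERATION):.4f}")
    print(f"Per edited image: ${cost_per_image(IMAGES_PER_EDIT):.4f}")
    print(f"Free images in the first three months: {free_images(3)}")
```

Under these assumptions a purchased credit works out to roughly 13 cents, or a little over 3 cents per image for an original generation.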
By Paula Parisi, June 28, 2022
New AI-powered coding tools such as Amazon’s CodeWhisperer and Copilot from GitHub and OpenAI may be giving some developers the jitters. Following splashy debuts for both programs last week, GitHub CEO Thomas Dohmke offered public assurances that Copilot is not designed to replace coders, but to speed the process, alleviating a software developer shortage. Similar to Copilot, CodeWhisperer can autocomplete Java, JavaScript and Python functions based on a comment or some keystrokes. Amazon says it trained the system using billions of lines of open source code, publicly available documentation and its own codebase. Continue reading AI Coding Tools Speed Process to Offset Developer Shortage
By Paula Parisi, June 16, 2022
Adobe is releasing an open source developer toolkit that aims to prevent the spread of visual misinformation by including additional metadata that Adobe calls Content Credentials. The system is also designed to help content creators indelibly tag authorship to their work. Announced in 2019, the Content Authenticity Initiative (CAI) project has released a whitepaper introducing the system, which is integrated into Adobe software. The CAI has teamed with hardware manufacturers and newsrooms to help ubiquitize its vision. The Associated Press, The New York Times and The Wall Street Journal have signed aboard. Continue reading Adobe Debuts ‘Content Credentials’ to Battle Misinformation
By Paula Parisi, May 27, 2022
Microsoft is previewing its express design in Power Apps, which can instantly generate low-code apps directly from design files and images. In a few clicks, anyone can now create web and mobile apps from inputs including paper forms, PDFs, sketches on the whiteboard or even assets designed in professional programs like Figma. As part of the Microsoft Power Platform, Power Apps uses advanced AI to accelerate design. “We’re particularly excited about our integration with Figma, the collaborative design platform, where so much software is designed today,” said Microsoft vice president of Power Apps Ryan Cunningham. Continue reading AI-Driven Microsoft Power Apps Offers Development Shortcuts
By Paula Parisi, May 25, 2022
Google has released a research paper on a new text-to-image generator called Imagen, which combines the power of large transformer language models for text with the capabilities of diffusion models in high-fidelity image generation. “Our key discovery is that generic large language models (e.g. T5), pretrained on text-only corpora, are surprisingly effective at encoding text for image synthesis,” the company said. Simultaneously, Google is introducing DrawBench, a benchmark for text-to-image models it says was used to compare Imagen with other recent technologies including VQGAN+CLIP, latent diffusion models, and OpenAI’s DALL-E 2. Continue reading Google’s Imagen AI Model Makes Advances in Text-to-Image
By Paula Parisi, April 8, 2022
OpenAI has developed a new technology that creates and edits images based on written descriptions of the desired result. DALL-E 2, whose name pays homage to the surrealist painter Salvador Dalí and the Pixar film “Wall-E,” is still in development but is already producing impressive results with simple instructions like “kittens playing chess” and “astronaut riding a horse.” OpenAI says the tech “isn’t being directly released to the public” and the hope is “to later make it available for use in third-party apps.” Already some are expressing worry that such a tool has the potential to exponentially increase the use of deepfakes. Continue reading DALL-E 2 by OpenAI Creates Images Based on Descriptions
By Paula Parisi, March 24, 2022
Nvidia CEO Jensen Huang announced a host of new AI tech geared toward data centers at the GTC 2022 conference this week. Available in Q3, the H100 Tensor Core GPUs are built on the company’s new Hopper GPU architecture. Huang described the H100 as the next “engine of the world’s AI infrastructures.” Hopper debuts in Nvidia DGX H100 systems designed for enterprise. With data centers, “companies are manufacturing intelligence and operating giant AI factories,” Huang said, speaking from a real-time virtual environment in the firm’s Omniverse 3D simulation platform. Continue reading Nvidia Introduces New Architecture to Power AI Data Centers