By Paula Parisi, December 8, 2023
Google is closing the year by heralding 2024 as the “Gemini era,” with the introduction of its “most capable and general AI model yet,” Gemini 1.0. This new foundation model is optimized for three different use-case sizes: Ultra, Pro and Nano. As a result, Google is releasing a new, Gemini-powered version of its Bard chatbot, available to English speakers in the U.S. and 170 global regions. Google touts Gemini as built from the ground up for multimodality, reasoning across text, images, video, audio and code. However, Bard will not as yet incorporate Gemini’s ability to analyze sound and images. Continue reading Google Announces the Launch of Gemini, Its Largest AI Model
By Paula Parisi, December 6, 2023
Netflix has completed a worldwide technology upgrade that improves video quality for Premium subscribers viewing 4K HDR titles. The move is being hailed as welcome news in the wake of a price hike to $22.99 from $19.99 for U.S. Premium customers. Netflix used the “dynamic optimization” video encoding method to implement an HDR variant of the company’s VMAF (Video Multimethod Assessment Fusion) quality metric. The new HDR-VMAF is the result of a collaboration between Netflix and Dolby Laboratories that employs “subjective tests with 4K HDR content using high-end OLED panels,” according to Netflix. Continue reading Netflix Uses Deep Learning to Optimize Streaming in 4K HDR
By Paula Parisi, December 6, 2023
The Google- and Nvidia-backed AI video startup Runway is partnering with Getty Images to develop the Runway Getty Images Model (RGM), which it is positioning as a new type of generative AI model capable of “providing a new way to bring ideas and stories to life through video” for enterprise customers using copyright-compliant means. Targeting Hollywood studios, advertising, media and broadcast clients, RGM will “provide a baseline model upon which companies can build their own custom models for the generation of video content,” Runway explains. Continue reading Runway Teams with Getty on AI Video for Hollywood and Ads
By Paul Bennun, December 4, 2023
Stability AI, developer of Stable Diffusion (one of the leading visual content generators, alongside Midjourney and DALL-E), has introduced SDXL Turbo, a new AI model that demonstrates more of the latent possibilities of the common diffusion generation approach: images that update in real time as the user’s prompt updates. This feature was always a possibility, even with previous diffusion models, given that text and images are comprehended differently across linear time, but the increased efficiency of generation algorithms and the steady accretion of GPUs and TPUs in developers’ data centers make the experience more magical. Continue reading Stability AI Intros Real-Time Text-to-Image Generation Model
By Paula Parisi, December 4, 2023
TikTok is further entrenching itself in music streaming with its new “Artist Account” toolbox that allows followers to play tunes, buy merchandise and access news and information. The new feature is designed to allow artists to promote their work and increase discoverability using such tools as an artist tag and new release highlights on discovery pages. TikTok describes the Artist Account as a “toolbox of features” to support promotion and “forge a closer relationship between artist and fan on TikTok.” The Artist Account is available to every music creator who has uploaded at least four songs, which makes more than 70,000 accounts eligible at launch. Continue reading TikTok Adds ‘Artist Accounts’ to Help Boost Music Streaming
By Paul Bennun, November 28, 2023
Bill Gates has published his thinking about the future of computing, and fascinatingly, it’s the same as his prediction from decades ago: agents. No mere bots — and certainly not anthropomorphized paperclips — agents (to Gates) will abstract almost all HCI to a natural language conversation with systems that have our permission to take meaningful actions. Gates makes a highly specific prediction: within five years, the very idea of an app itself will seem as outdated as a rotary phone dial does next to an iPhone. A conversational UI will sit on top of a language model that has access to as much of our private data as we wish to give it. Continue reading Bill Gates Imagines Agents as the Human-Computer Interface
By Paula Parisi, November 28, 2023
Google’s Bard AI chatbot is getting smarter regarding video queries. Specifically, a new YouTube extension is now able to answer questions about the content of individual videos without requiring playback. “We’re expanding the YouTube extension to understand some video content so you can have a richer conversation with Bard about it,” Google wrote on Bard’s changelog. In September, Google released a YouTube extension that made it easier to find specific videos. This update allows Bard to operate more interactively, sharing detailed information as it relates to YouTube’s visual content. Continue reading Google’s Bard AI Is Getting Smarter About YouTube Content
By Paula Parisi, November 27, 2023
Stability AI has opened a research preview of its first foundation model for generative video, Stable Video Diffusion, offering text-to-video and image-to-video. Based on the company’s Stable Diffusion text-to-image model, the new open-source model generates video by animating existing still frames, including “multi-view synthesis.” While the company plans to enhance and extend the model’s capabilities, it currently comes in two versions: SVD, which transforms stills into 576×1024 videos of 14 frames, and SVD-XT, which generates up to 24 frames, each at between three and 30 frames per second. Continue reading Stability Introduces GenAI Video Model: Stable Video Diffusion
By Paula Parisi, November 22, 2023
Adobe has unveiled Project Sound Lift, an AI-powered technology that separates speech recordings into discrete tracks of voices, non-speech sounds and other background noise in video. The company describes Project Sound Lift as “a one-click solution” that leverages AI to help users easily manipulate audio recordings “across a range of scenarios” to “enhance, transform, and control speech and sound independently.” Adobe’s existing Enhance Speech technology, available in the company’s Premiere Pro editing program, has been integrated within Project Sound Lift to aid creators in producing studio-quality audio content. Continue reading Adobe Reveals Its New AI Tool for Editing Problematic Audio
By Paula Parisi, November 20, 2023
Having made the leap from image generation to video generation over the course of a few months in 2022, Meta Platforms has introduced Emu, its first visual foundational model, along with Emu Video and Emu Edit, positioned as milestones in the trek to AI moviemaking. Emu uses just two diffusion models to generate 512×512 four-second-long videos at 16 frames per second, Meta said, comparing that to 2022’s Make-A-Video, which required a “cascade” of five models. Internal research found Emu video generations were “strongly preferred” over the Make-A-Video model based on quality (96 percent) and prompt fidelity (85 percent). Continue reading Meta Touts Its Emu Foundational Model for Video and Editing
By Paula Parisi, November 17, 2023
More U.S. adults say they regularly get news from TikTok, according to a Pew Research study that says this bucks the general trend of news consumption declining or remaining flat at other social media sites over the past few years. Since 2020, regular TikTok news consumption among American adults has more than quadrupled to 14 percent, from 3 percent, Pew finds. Among younger adults, news consumption is even higher, with 32 percent of those ages 18 to 29 claiming to regularly get news on TikTok. This compares with 15 percent of those 30 to 49. Continue reading TikTok on the Rise as News Source, Facebook and X Decline
By Paula Parisi, November 15, 2023
Sweden-based digital music service Spotify has redesigned its television user interface to make it look and feel more like the platform’s desktop and mobile experiences, a move its users have been waiting for. The frequently played content now appears at the top of the interface, while playlists and favorites are also easily accessed. A playback queue that opens from the side of the screen lets you see what’s playing and program what’s coming up, and you can easily switch accounts by toggling profile options. A dark mode option has been added for the television interface as well. Continue reading Spotify Redesigns TV Interface to Better Match Its Mobile App
By Paula Parisi, November 15, 2023
Threads, the Twitter competitor launched in July by Meta Platforms to record-breaking numbers, has added features that make it easier for users to separate their Threads feeds from Instagram and Facebook. Users can now delete their Threads accounts separately from Instagram, something that previously confounded users. Because those signing up for Threads were required to do so from either an existing or a new Instagram account, the two were entwined. Instagram/Threads CEO Adam Mosseri also announced that propagation of Threads posts to Instagram and Facebook can now be turned off, to keep discussions separate. Continue reading Threads Lets Users Delete Accounts Separate from Instagram
By Paula Parisi, November 15, 2023
Meta Platforms-owned instant messaging and VoIP service WhatsApp has updated its Voice Chat feature for mobile so it can now host group calls of up to 128 participants. Voice chats allow WhatsApp users to instantly talk live with members of a group chat while still being able to message within the group. The new feature, which is being compared to a Discord server, is being rolled out globally. The idea is for Voice Chat to be less disruptive than group calling, which rings all group members. Voice chats can be quietly started with an in-chat bubble that users tap to join. The updated version will have end-to-end encryption by default. Continue reading Meta’s WhatsApp Launches Voice Chat for Up to 128 People
By Paula Parisi, November 9, 2023
YouTube is testing artificial intelligence features for the long-form video experience with YouTube Premium customers. Paid subscribers on mobile platforms who opt in can participate in two experimental tests of AI-assisted functions: help with the comments section, and a chatbot tool that lets viewers get answers to questions about videos and recommendations for related content. To help viewers “dive deeper into content” without interrupting playback, a conversational “Ask” icon will appear beneath videos in progress, inviting questions or letting viewers select suggested prompts. Responses include answers and related content recommendations. Continue reading YouTube Premium Testing GenAI Tools for Long-Form Video