By Paula Parisi, March 27, 2025
OpenAI has activated the multimodal image generation capabilities of GPT-4o, making it available to ChatGPT users on the Plus, Pro, Team and Free tiers. It replaces DALL-E 3 as the default image generator for the popular chatbot. GPT-4o’s accuracy with text, understanding of symbols and precision with prompts, combined with multimodal capabilities that allow the model to take cues from visual material, have transformed its image capabilities from largely unpredictable to “consistent and context-aware,” resulting in “a practical tool with precision and power,” claims OpenAI. Continue reading OpenAI Delivers Native GPT-4o Image Generator to ChatGPT
By Paula Parisi, March 25, 2025
Google has added a Canvas feature to its Gemini AI chatbot that provides users with a real-time collaborative space where writing and coding projects can be refined and other ideas iterated and shared. “Canvas is designed for seamless collaboration with Gemini,” according to Gemini Product Director Dave Citron, who notes that Canvas makes it “an even more effective collaborator” in helping bring ideas to life. The move marks a trend whereby AI companies are trying to turn chatbot platforms into turnkey productivity suites. Google is launching a limited release of Gemini Live Video in addition to bringing its Audio Overview feature of NotebookLM to Gemini. Continue reading Canvas and Live Video Add Productivity Features to Gemini AI
By Paula Parisi, March 25, 2025
Anthropic’s Claude can now search the Internet in real time, allowing it to provide timely and relevant responses that are also more accurate than what the chatbot previously offered, according to the company. Claude incorporates direct citations for its Web-retrieved material, so users can fact-check its sources. “Instead of finding search results yourself, Claude processes and delivers relevant sources in a conversational format.” While this is not exactly groundbreaking — ChatGPT, Grok 3, Copilot, Perplexity and Gemini all have real-time Web retrieval and most include citations — Claude takes a slightly different approach. Continue reading Real-Time Web Access Informs Claude 3.7 Sonnet Responses
By Paula Parisi, February 4, 2025
ChatGPT has a new “deep research” agent that OpenAI says uses reasoning to synthesize large amounts of online information and complete multi-step research tasks. “It accomplishes in tens of minutes what would take a human many hours,” OpenAI suggests, claiming it will “synthesize hundreds of online sources to create a comprehensive report at the level of a research analyst.” Powered by a version of the upcoming OpenAI o3 model optimized for web browsing and data analysis, the company says the deep research agent will typically take 5 to 30 minutes to complete its work. The agent is described as an ideal research tool for areas such as finance, science and engineering. Continue reading ChatGPT ‘Deep Research’ Agent Can Create Detailed Reports
By George Gerba, January 10, 2025
During CES this week, Sony demonstrated a proof-of-concept experience based on the popular HBO post-apocalyptic drama “The Last of Us.” We were dropped into a six-person pod of newly enlisted defenders and assigned to a hardened defender who needed new recruits to combat a serious surge of zombie assaults that she was convinced could be overcome with our assistance. We were armed with LED-enabled shotgun-like devices and tracked flashlights to help discover the concealed attackers, and our combat leader guided us through the terrors of the attack with sharp, direct commands. Continue reading CES: Sony Introduces Interactive Experience – ‘The Last of Us’
By Douglas Chan, January 8, 2025
Nvidia founder and CEO Jensen Huang kicked off CES 2025 with a keynote that was filled with new product announcements and visionary demonstrations of how the company plans to advance the field of AI. The first product that Huang unveiled was the GeForce RTX 50 series of consumer graphics processing units (GPUs). The series is also called RTX Blackwell because it is based on Nvidia’s latest Blackwell microarchitecture design for next generation data center and gaming applications. To showcase RTX Blackwell’s prowess, Huang played an impressively photorealistic video sequence of rich imagery under contrasting light ranges — all rendered in real time. Continue reading CES: Nvidia Unveils New GeForce RTX 50, AI Video Rendering
By Paula Parisi, December 4, 2024
Artificial voice startup Hume AI has had a busy Q4, introducing Voice Control, a no-code artificial speech interface that gives users control over 10 voice dimensions ranging from “assertiveness” to “buoyancy” and “nasality.” The company also debuted an interface that “creates emotionally intelligent voice interactions” with Anthropic’s foundation model Claude that has prompted one observer to ponder the possibility that keyboards will become a thing of the past when it comes to controlling computers. Both advances expand on Hume’s work with its own foundation model, Empathic Voice Interface 2 (EVI 2), which adds emotional timbre to AI voices. Continue reading Hume AI Introduces Voice Control and Claude Interoperability
By Paula Parisi, November 5, 2024
D-ID has launched two new types of AI-powered avatars: Premium+ and Express. The company’s video-to-video avatar tools aim to provide personal look-alikes that can sub for their creators in uses ranging from instructional videos to business presentations, offloading on-camera duties in areas including sales, marketing and customer support. “Premium+ Avatars can generate hyper-realistic digital humans that are indistinguishable from real people and will serve as the foundation for fully interactive digital agents revolutionizing how brands communicate,” while Express Avatars can rapidly generate serviceable avatars “from just one minute of source footage.” Continue reading D-ID’s New Business-Use Avatars Can Converse in Real Time
By Paula Parisi, June 25, 2024
OpenAI has acquired Rockset, a database firm that provides real-time analytics, indexing and search capabilities. Rockset will help OpenAI enable its customers to better leverage their own data as they build and utilize intelligent applications. Rockset technology will be integrated into the retrieval infrastructure across OpenAI products, with members of Rockset’s San Mateo, California-based team joining the staff of OpenAI, which is headquartered in San Francisco. This is the second major purchase for OpenAI, following last year’s acquisition of New York-based AI design studio Global Illumination. Financial terms of the deal were not disclosed. Continue reading OpenAI to Expand Data Indexing, Analysis with Rockset Tech
By Paula Parisi, June 24, 2024
Snap Inc. teased a new on-device AI model capable of real-time filter creation in-app using Snapchat. At last week’s Augmented World Expo in Long Beach, California, Snap co-founder and CTO Bobby Murphy explained that the model, which runs on smartphones, can re-render frames on the fly guided by text prompts. Snap’s unnamed prototype model “can instantly bring your imagination to life in AR,” Snap says, explaining “this early prototype makes it possible to type in an idea for a transformation and generate vivid AR experiences in real time.” Continue reading Snapchat Previews Instant AR Filters, GenAI Developer Tools
By Paula Parisi, December 5, 2023
The research division of Meta AI has developed Seamless Communication, a suite of artificial intelligence models that generate what the company says is natural and authentic communication across languages, facilitating what amounts to real-time universal speech translation. The models were released with accompanying research papers and data. The flagship model, Seamless, merges capabilities from a trio of models — SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2 — into a single system that can translate between almost 100 spoken and written languages, preserving idioms, emotion and the speaker’s vocal style, Meta says. Continue reading Meta AI Seamless Translator Converts Nearly 100 Languages
By Paul Bennun, December 4, 2023
Stability AI, developer of Stable Diffusion (one of the leading visual content generators, alongside Midjourney and DALL-E), has introduced SDXL Turbo, a new AI model that demonstrates more of the latent possibilities of the common diffusion generation approach: images that update in real time as the user’s prompt updates. This feature was always a possibility even with previous diffusion models, given that text and images are comprehended differently across linear time, but the increased efficiency of generation algorithms and the steady accretion of GPUs and TPUs in a developer’s data center make the experience more magical. Continue reading Stability AI Intros Real-Time Text-to-Image Generation Model
By Rob Scott, December 19, 2022
Facing backlash against his executive leadership, Twitter’s new owner and CEO, billionaire Elon Musk, conducted an informal 12-hour poll over the weekend asking users of the popular social media platform whether he should keep his new position. “Should I step down as head of Twitter?” the controversial executive asked. “I will abide by the results of this poll.” After more than 17.5 million responses, the results indicate that a majority of users believe Musk should step down from his post (57.5 percent voted in the affirmative). As of press time, it remains unclear what action Musk may take in light of the poll results. Continue reading Twitter Users Vote in Favor of Musk Stepping Down as CEO
By Debra Kaufman, September 16, 2020
Facebook launched Watch Together, a feature for Messenger and videoconferencing platform Messenger Rooms, to allow users to watch videos in real time with family and friends on Apple and Android mobile devices. Users choose videos to view through Facebook’s video hub, Facebook Watch. The push to promote yet more video comes at a time when, due largely to COVID-19, more people than ever are at home watching content. Facebook Messenger allows up to eight people on a video call, and Messenger Rooms tops out at 50 people. Continue reading Facebook Rolls Out New Messenger Feature, Watch Together
By Debra Kaufman, January 21, 2020
AI can enable many important tasks from manufacturing to medicine, but only if the applications are speedy and secure. Communication via the cloud adds latency and risks privacy, which is why Google worked on a solution — dubbed Coral — that avoids centralized data centers. Coral product manager Vikram Tank described Coral as a “platform of [Google] hardware and software components … that help you build devices with local AI — providing hardware acceleration for neural networks … right on the edge device.” Continue reading Google Bypasses Cloud to Offer AI to Enterprise Customers