By ETCentric Staff, February 21, 2024
Researchers at Amazon have trained what they are calling the largest text-to-speech model ever created, which they claim exhibits “emergent” qualities: the ability to speak complex sentences naturally without having been explicitly trained to do so. Called BASE TTS, for Big Adaptive Streamable TTS with Emergent abilities, the new model could pave the way for more human-like interactions with AI, reports suggest. Trained on 100,000 hours of public domain speech data, BASE TTS offers “state-of-the-art naturalness” in English, as well as some German, Dutch and Spanish. Text-to-speech models are used to develop voice assistants for smart devices and apps, and to power accessibility tools. Continue reading Amazon Claims ‘Emergent Abilities’ for Text-to-Speech Model
By ETCentric Staff, February 16, 2024
Apple has taken a novel approach to animation with Keyframer, using large language models to add motion to static images through natural language prompts. “The application of LLMs to animation is underexplored,” Apple researchers say in a paper that describes Keyframer as an “animation prototyping tool.” Based on input from animators and engineers, Keyframer lets users refine their work through “a combination of prompting and direct editing,” the paper explains. The LLM can generate CSS animation code, and users can request design variations in natural language as well. Continue reading Apple’s Keyframer AI Tool Uses LLMs to Prototype Animation
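Apple has not released Keyframer’s code, but the general pattern it describes, prompting an LLM to turn a motion description into CSS animation for an existing graphic, can be sketched. In this minimal example the OpenAI Python SDK stands in as a generic LLM backend; the model choice, prompt wording and the #sun SVG element are all assumptions for illustration, not Apple’s implementation.

```python
# A sketch of the Keyframer-style pattern (not Apple's code): ask an LLM
# to translate a natural-language motion prompt into CSS animation code
# for a given SVG element.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

svg = '<circle id="sun" cx="50" cy="150" r="20" fill="gold"/>'
prompt = (
    "Given this SVG element:\n" + svg + "\n"
    "Write CSS (a @keyframes rule plus a rule for #sun) that makes the "
    "sun rise slowly and brighten. Return only the CSS."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # the generated CSS animation code
```

The returned CSS could then be applied to the graphic and refined through follow-up prompts or direct edits, which is the iterative loop the paper describes.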
By ETCentric Staff, February 9, 2024
Apple has released MGIE, an open-source AI model that edits images using natural language instructions. MGIE, short for MLLM-Guided Image Editing, can also modify and optimize images. Developed in conjunction with the University of California, Santa Barbara, MGIE is among Apple’s first openly released AI models. The multimodal MGIE, which understands both text and image input, crops, resizes, flips and adds filters based on text instructions, using what Apple says is an easier instruction set than other AI editing programs. It is also simpler and faster to learn than a traditional application like Apple’s own Final Cut Pro. Continue reading Apple Launches Open-Source Language-Based Image Editor
By Paula Parisi, January 12, 2024
Santa Monica-based AI startup Rabbit Inc. is offering a virtual assistant in the form of a pocket device that the company says can improve upon mobile phones by learning to use your apps and running them for you. The device was heavily publicized at CES 2024 in Las Vegas this week, and the initial run of the company’s r1 units had sold out at $199 each as of Tuesday. Preorders are still being taken for the retro-looking device, which features a 2.88-inch touchscreen; shipments are scheduled to begin in late March. The company says its proprietary Rabbit OS is the first operating system built on a Large Action Model (LAM) foundation. LAMs are LLMs trained on datasets of actions and their consequences. Continue reading CES: Rabbit Launches AI-Powered Pocket Controller for Apps
By Paula Parisi, December 12, 2023
Google’s personalized AI assistant NotebookLM is an experimental product that has been in early access since July. Now the company is integrating its new Gemini Pro LLM with NotebookLM and making the service available to U.S. residents 18 and older. NotebookLM is engineered “to help you do your best thinking,” Google says, with documents uploaded to the service making it “an instant expert in the information you need,” allowing it to answer questions about your data. Unlike generic chatbots, NotebookLM draws its responses from the documents you feed it, keeping them hyper-focused, like a lite version of a custom-trained model. Continue reading Google’s NotebookLM is a Personalized Lite Language Model
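Google hasn’t published how NotebookLM works under the hood, but the behavior described resembles retrieval-grounded generation: find the user document most relevant to a question, then have the model answer only from that source. Here is a toy, dependency-free sketch of that pattern; the function names, sample documents and word-overlap scoring are assumptions for illustration, not Google’s implementation.

```python
# Toy sketch of retrieval-grounded answering in the NotebookLM spirit.
# Documents are scored by word overlap with the question, and only the
# best match is handed to the LLM, keeping answers grounded in the
# user's own material.
import re

def tokenize(text: str) -> set[str]:
    """Lowercase and split on non-alphanumerics so 'launch?' matches 'launch'."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, documents: dict[str, str]) -> tuple[str, str]:
    """Return the (title, text) pair sharing the most words with the question."""
    q_words = tokenize(question)
    return max(documents.items(), key=lambda item: len(q_words & tokenize(item[1])))

docs = {
    "meeting-notes.txt": "The Q3 launch moved to October pending the privacy review.",
    "budget.txt": "Marketing budget for Q3 capped at $40,000.",
}

title, source = retrieve("When is the Q3 launch?", docs)
grounded_prompt = f"Answer using only this source ({title}): {source}"
print(grounded_prompt)  # this prompt would then go to the LLM (e.g., Gemini Pro)
```

A production system would use embeddings rather than word overlap, but the grounding principle, answering from the user’s documents instead of the model’s general training, is the same.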
By Paula Parisi, November 10, 2023
Startup Flip AI has built a custom LLM to run its observability platform. Observability is the practice of monitoring corporate IT systems to ferret out issues, or to identify potential problems before they occur. It’s a 24/7 process aimed at problems that can slow down sites or apps, sometimes causing crashes. Flip AI (not to be confused with the PDF reader app of the same name) has trained its LLM specifically to monitor new and emerging challenges. Concurrently, the company announced $6.5 million in seed funding led by Factory, with participation from Morgan Stanley Next Level Fund and GTM Capital. Continue reading Startup Flip AI Creates Custom LLM to Address Observability
By Paula Parisi, November 8, 2023
Now anyone can make their own GPT chatbot, for fun or productivity, with no coding skills necessary, and soon will be able to list it on a marketplace called the GPT Store. This was among the announcements to come out of OpenAI’s first developer conference, DevDay in San Francisco, where the company unveiled a new, lower-priced model called GPT-4 Turbo with a 128K context window, along with a new Assistants API, GPT-4 Turbo with Vision and the DALL-E 3 API. Now in preview, GPT-4 Turbo “is more capable and has knowledge of world events up to April 2023,” according to OpenAI. Continue reading OpenAI Intros GPT-4 Turbo, Creator Chatbots at Dev Confab
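For developers, the new models are reachable through OpenAI’s Python SDK. A minimal sketch, assuming the model identifiers OpenAI documented at launch and an API key in the environment; the prompts themselves are illustrative.

```python
# Minimal sketch of calling the DevDay announcements via the OpenAI
# Python SDK (v1.x). Model names are those OpenAI documented at launch.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# GPT-4 Turbo preview: 128K-token context, knowledge through April 2023.
chat = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "Summarize the DevDay announcements."}],
)
print(chat.choices[0].message.content)

# DALL-E 3, also newly exposed through the images API.
image = client.images.generate(model="dall-e-3", prompt="a robot giving a keynote")
print(image.data[0].url)
```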
By Paula Parisi, October 27, 2023
The University of Science and Technology of China (USTC) and Tencent YouTu Lab have released a research paper on a new framework called Woodpecker, designed to correct hallucinations in multimodal large language models (MLLMs). “Hallucination is a big shadow hanging over the rapidly evolving MLLMs,” writes the group, describing the phenomenon as occurring when MLLMs “output descriptions that are inconsistent with the input image.” Solutions to date have focused mainly on “instruction-tuning,” a form of retraining that is data- and computation-intensive. Woodpecker instead takes a training-free approach, correcting hallucinations after the fact by working from the text the model has already generated. Continue reading Woodpecker: Chinese Researchers Combat AI Hallucinations
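The paper’s pipeline is more elaborate, but the training-free idea can be sketched schematically: extract the factual claims from the generated description, check each one against the image, and rewrite whatever fails the check. Everything below is a hypothetical stub for illustration, not the authors’ code.

```python
# Schematic of a training-free, post-hoc correction loop in Woodpecker's
# spirit: no retraining, just validating claims in generated text against
# the image. Each helper is a stand-in stub for a real component.

def extract_claims(caption: str) -> list[str]:
    # Stub: treat each sentence as one claim. The real framework extracts
    # key concepts and formulates questions about them.
    return [s.strip() for s in caption.split(".") if s.strip()]

def supported_by_image(claim: str, image) -> bool:
    # Stub: always flags the dog count. A real system would query a
    # VQA model or object detector against the actual image.
    return "two dogs" not in claim

def rewrite(caption: str, bad_claims: list[str]) -> str:
    # Stub: delete unsupported claims. A real system would have an LLM
    # rewrite them using the validated visual evidence.
    for claim in bad_claims:
        caption = caption.replace(claim + ".", "")
    return caption.strip()

caption = "A park scene. There are two dogs on the grass."
bad = [c for c in extract_claims(caption) if not supported_by_image(c, image=None)]
print(rewrite(caption, bad))  # -> "A park scene."
```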
By Paula Parisi, October 11, 2023
OpenAI began previewing vision capabilities for GPT-4 in March, and the company is now starting to roll out image input and output to users of its popular ChatGPT. The multimodal expansion also includes audio functionality, with OpenAI proclaiming late last month that “ChatGPT can now see, hear and speak.” The upgrade vaults GPT-4 into the multimodal category with what OpenAI is apparently calling GPT-4V (for “Vision,” though equally applicable to “Voice”). “We’re rolling out voice and images in ChatGPT to Plus and Enterprise users,” OpenAI announced. Continue reading ChatGPT Goes Multimodal: OpenAI Adds Vision, Voice Ability
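For developers, image input has also surfaced through OpenAI’s API. A minimal sketch, assuming the vision-preview model identifier OpenAI documented for the API and its Python SDK (v1.x); the image URL is a placeholder.

```python
# Minimal sketch of GPT-4V-style image input via the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is shown in this photo?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder
        ],
    }],
    max_tokens=300,  # the vision preview defaults to a low output cap
)
print(response.choices[0].message.content)
```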
By Paula Parisi, September 26, 2023
Amazon has entered into a strategic investment in San Francisco-based Anthropic, an AI startup founded by former members of OpenAI. Anthropic will use AWS Trainium and Inferentia chips to train and deploy its future foundation models, with AWS as its primary cloud provider. In turn, Amazon says it will invest up to $4 billion in Anthropic as it strives to compete with other technology firms in the race to develop generative AI, seeding growth in what is shaping up to be an entirely new economic and social landscape. Continue reading Amazon Plans to Invest Up to $4 Billion in AI Startup Anthropic
By Paula Parisi, September 11, 2023
Financial software giant Intuit is adding a customer-facing AI assistant to work with individuals and small businesses. Intuit Assist is being embedded across Intuit’s products via a common user interface, starting with TurboTax and expanding to QuickBooks, Credit Karma and Mailchimp, allowing customers to get personalized recommendations drawn from contextual datasets. The generative AI assistant was built using Intuit’s Generative AI Operating System (GenOS), a proprietary corporate platform launched in June. Intuit is working with OpenAI to accelerate generative AI app development on GenOS. Continue reading Intuit’s GenOS Spawns Its First Customer AI Product: ‘Assist’