By ETCentric Staff, April 2, 2024
OpenAI has debuted a new text-to-voice generation platform called Voice Engine, available in limited access. Voice Engine can generate a synthetic voice from a 15-second clip of someone’s voice. The synthetic voice can then read a provided text, even translating to other languages. For now, only a handful of companies are using the tech under a strict usage policy as OpenAI grapples with the potential for misuse. “These small scale deployments are helping to inform our approach, safeguards, and thinking about how Voice Engine could be used for good across various industries,” OpenAI explained. Continue reading OpenAI Voice Cloning Tool Needs Only a 15-Second Sample
By ETCentric Staff, March 29, 2024
Databricks, a San Francisco-based company focused on cloud data and artificial intelligence, has released a generative AI model called DBRX that it says sets new standards for performance and efficiency in the open source category. The mixture-of-experts (MoE) architecture contains 132 billion parameters and was pre-trained on 12T tokens of text and code data. Databricks says DBRX provides the open community and enterprises who want to build their own LLMs with capabilities previously limited to closed model APIs. Databricks claims DBRX outperforms other open models, including Llama 2-70B and Mixtral, on certain benchmarks. Continue reading Databricks DBRX Model Offers High Performance at Low Cost
By ETCentric Staff, March 6, 2024
Anthropic has released Claude 3, claiming new industry benchmarks that see the family of three new large language models approaching “near-human” cognitive capability in some instances. Accessible via Anthropic’s website, the three new models (Claude 3 Haiku, Claude 3 Sonnet and Claude 3 Opus) represent successively increased complexity and parameter count. Sonnet is powering the current Claude.ai chatbot and is free, for now, requiring only an email sign-in. Opus comes with the $20 monthly subscription for Claude Pro. Both are generally available from the Anthropic website and via API in 159 countries, with Haiku coming soon. Continue reading Anthropic’s Claude 3 AI Is Said to Have ‘Near-Human’ Abilities
By ETCentric Staff, March 5, 2024
Paris-based startup Mistral AI has made an immediate splash in the world of artificial intelligence, securing partnerships with IBM, Microsoft and others nine months after its launch. The company is offering natural language processing models, including its flagship Mistral Large, which becomes only the second LLM (after OpenAI) to land a commercial berth on Microsoft’s Azure cloud, where Meta Platforms’ Llama 2 is available in preview. Boasting “top-tier reasoning capacities” and sophisticated conversational capabilities, Mistral Large specializes in “reasoning, analysis and generation (RAG), is multilingual and supports up to 32,000 tokens.” Continue reading France’s Mistral AI Makes Its Global Debut on Microsoft Azure
By ETCentric Staff, February 16, 2024
Apple has taken a novel approach to animation with Keyframer, using large language models to add motion to static images through natural language prompts. “The application of LLMs to animation is underexplored,” Apple researchers say in a paper that describes Keyframer as an “animation prototyping tool.” Based on input from animators and engineers, Keyframer lets users refine their work through “a combination of prompting and direct editing,” the paper explains. The LLM can generate CSS animation code. Users can also use natural language to request design variations. Continue reading Apple’s Keyframer AI Tool Uses LLMs to Prototype Animation
By Rob Scott, January 3, 2024
Apple recently announced advances in artificial intelligence research that could introduce more immersive visual experiences and enable sophisticated AI systems to run on the company’s popular mobile devices. Two new research papers highlight techniques for creating 3D avatars from video content and efficiently deploying large language models on devices challenged by limited memory. The real-time ability to create avatars and 3D scenes from an iPhone camera could bring a range of new possibilities for CE devices in areas such as synthetic media, telepresence, social interaction, virtual try-on and more. Continue reading Apple Unveils New Advances in Artificial Intelligence Research
By Paula Parisi, December 20, 2023
Microsoft has expanded its Models as a Service (MaaS) catalog for Azure AI Studio, building beyond the 40 models announced at the Microsoft Ignite event last month with the addition of the Llama 2 code generation model from Meta Platforms in public preview. In addition, GPT-4 Turbo with Vision has been added to accelerate generative AI and multimodal application development. Like Software as a Service (SaaS) and Infrastructure as a Service (IaaS), MaaS lets customers use AI models on demand over the web with easy setup and technical support. Continue reading Microsoft Brings Meta’s Llama 2 to Azure Models as a Service
By Paula Parisi, December 12, 2023
The EU has reached a provisional agreement on the Artificial Intelligence Act, making it the first Western democracy to establish comprehensive AI regulations. The sweeping new law predominantly focuses on so-called “high-risk AI,” establishing parameters — largely in the form of reporting and third-party monitoring — “based on its potential risks and level of impact.” Parliament and the 27-country European Council must still hold final votes before the AI Act is finalized and goes into effect, but the agreement, reached Friday in Brussels after three days of negotiations, means the main points are set. Continue reading EU Makes Provisional Agreement on Artificial Intelligence Act
By Paula Parisi, December 8, 2023
Google is closing the year by heralding 2024 as the “Gemini era,” with the introduction of its “most capable and general AI model yet,” Gemini 1.0. This new foundation model is optimized for three different use-case sizes: Ultra, Pro and Nano. As a result, Google is releasing a new, Gemini-powered version of its Bard chatbot, available to English speakers in the U.S. and 170 global regions. Google touts Gemini as built from the ground up for multimodality, reasoning across text, images, video, audio and code. However, Bard will not as yet incorporate Gemini’s ability to analyze sound and images. Continue reading Google Announces the Launch of Gemini, Its Largest AI Model
By Paula Parisi, November 22, 2023
GitHub Copilot Chat for enterprise becomes generally available in December, and the GitHub site is integrating the artificial intelligence assistant across its entire platform, promising that AI will infuse every step of the developer lifecycle. “Just as GitHub was founded on Git, today we are re-founded on Copilot,” the Microsoft-owned company announced this month. Powered by OpenAI’s GPT-4, the new configuration will offer inline Copilot Chat for code questions, contextual guidance and “slash commands” for /fix and /test. The AI tool is designed to assist coders with their everyday workflows with a series of “one-click” assists and other shortcuts. Continue reading GitHub Copilot Brings AI-Powered Coding Tool to Enterprise
By Paula Parisi, November 7, 2023
Elon Musk’s startup xAI has unveiled its first product, a large language model with chatbot capabilities named Grok, currently available via an early access waitlist with plans to go wide to Premium+ subscribers to the X social platform (formerly Twitter) following beta tests. The company says Grok has “access to search tools and real-time information” and is extremely up-to-date, but “as with all the LLMs trained on next-token prediction, our model can still generate false or contradictory information.” The chatbot is distinguished by sarcasm and wit, “so please don’t use it if you hate humor,” xAI warns. Continue reading Elon Musk’s xAI Rolling Out ‘Grok’ LLM in Early Access Beta
By Paula Parisi, November 2, 2023
LinkedIn expects to pass the 1 billion user mark this month and, timed to that milestone, is launching a new suite of AI productivity tools, including job coaching, personalized digests and help writing original content for the platform. The new machine learning assists, centered on those three areas, will initially be available only to Premium subscribers. The move follows months in which LinkedIn has been upgrading its AI capabilities in areas like automated recruiter messaging, job descriptions and profile writing suggestions. The improvements draw on OpenAI technology, in which LinkedIn parent Microsoft has an ownership stake. Continue reading Nearing 1 Billion Users, LinkedIn Debuts Job Coach Chatbot
By Paula Parisi, November 2, 2023
Social question and answer platform Quora has inserted itself on the leading edge of companies helping creators monetize AI chatbots. Quora’s AI chatbot platform Poe will pay those who create prompt bots on Poe as well as developers of server bots that integrate with the Poe API. “Since this is the beginning of a new market, there are lots of opportunities to provide a valuable service for the world and make money at the same time,” said Quora CEO Adam D’Angelo, envisioning a thriving bot economy across categories from tutoring and therapy to storytelling and roleplay. Continue reading Quora Plans to Foster Chatbot Creator Economy with Poe AI
By Paula Parisi, October 25, 2023
Nvidia Research has debuted Eureka, an AI agent that autonomously teaches robots complex motor skills. Powered by OpenAI’s GPT-4, Eureka has successfully trained a robotic hand to handle a pen with the dexterity of a human, which Nvidia says is a first. Eureka has also enabled robots to open drawers, manipulate scissors, and toss and catch balls, along with dozens of other tasks. “Eureka is a first step toward developing new algorithms that integrate generative and reinforcement learning methods to solve hard tasks,” Nvidia Senior Director of AI Research Anima Anandkumar said. Continue reading Nvidia Leverages OpenAI’s GPT-4 to Train Dexterous Robots
By Paula Parisi, October 11, 2023
OpenAI began previewing vision capabilities for GPT-4 in March, and the company is now starting to roll out the image input and output to users of its popular ChatGPT. The multimodal expansion also includes audio functionality, with OpenAI proclaiming late last month that “ChatGPT can now see, hear and speak.” The upgrade vaults GPT-4 into the multimodal category with what OpenAI is apparently calling GPT-4V (for “Vision,” though equally applicable to “Voice”). “We’re rolling out voice and images in ChatGPT to Plus and Enterprise users,” OpenAI announced. Continue reading ChatGPT Goes Multimodal: OpenAI Adds Vision, Voice Ability