Nvidia's AI Blueprint Develops Agents to Analyze Visual Data

By Paula Parisi
November 6, 2024

Nvidia’s growing AI arsenal now includes AI Blueprint, a video search and summarization tool that helps developers build visual AI agents to analyze video and image content. The agents can answer user questions, generate summaries and trigger alerts for specific scenarios. The new feature is part of Metropolis, Nvidia’s developer toolkit for building computer vision applications using generative AI. Enterprises and public organizations worldwide increasingly rely on visual information. Cameras, IoT sensors and autonomous vehicles are capturing visual data at high rates, and visual agents can help monitor and make sense of that flow.

“Announced ahead of the Smart City Expo World Congress, the Nvidia AI Blueprint gives visual computing developers a full suite of optimized software for building and deploying generative AI-powered agents that can ingest and understand massive volumes of live video streams or data archives,” VentureBeat reports.

“Users can customize these visual AI agents with natural language prompts instead of rigid software code, lowering the barrier to deploying virtual assistants across industries and smart city applications,” adds VB.

Visual AI agents are built on vision language models (VLMs), a type of generative AI model that combines computer vision with language to understand the physical world and perform reasoning tasks based on visual data.

AI Blueprint can be configured with the company’s NIM microservices for VLMs like Nvidia VILA, LLMs like Meta’s Llama 3.1 405B and AI models “for GPU-accelerated question answering and context-aware retrieval-augmented generation,” Nvidia explains in a blog post. “Developers can easily swap in other VLMs, LLMs and graph databases and fine-tune them using the Nvidia NeMo platform for their unique environments and use cases.”

Whereas traditional video analytics apps have tended to rely on “fixed-function models with limited scope, primarily detecting predefined objects,” AI Blueprint “introduces a new era of video analytics” by leveraging vision models that provide “broader perception and richer contextual understanding,” Blockchain.News writes, expanding on an Nvidia explainer.

Nvidia identifies use cases such as monitoring warehouses for safety infractions, surveying public infrastructure with aerial footage to identify maintenance needs, and generating reports on traffic collisions or other crisis events to aid emergency response.

Accenture, Dell Technologies and Lenovo are among the early adopters using AI Blueprint “to bring visual search and summarization to businesses and cities worldwide, jump-starting the next wave of AI applications that can be deployed to boost productivity and safety,” according to Nvidia.
