Categories
Misc

A New ERA of AI Factories: NVIDIA Unveils Enterprise Reference Architectures

As the world transitions from general-purpose to accelerated computing, finding a path to building data center infrastructure at scale is becoming more important than ever. Enterprises must navigate uncharted waters when designing and deploying infrastructure to support these new AI workloads. Constant developments in model capabilities and software frameworks, along with the novelty of these…

Categories
Misc

Enhanced Security and Streamlined Deployment of AI Agents with NVIDIA AI Enterprise

AI agents are emerging as the newest way for organizations to increase efficiency, improve productivity, and accelerate innovation. These agents are more advanced than prior AI applications, with the ability to autonomously reason through tasks, call out to other tools, and incorporate both enterprise data and employee knowledge to produce valuable business outcomes. They’re being embedded into…

Categories
Misc

Universal Assisted Generation: Faster Decoding with Any Assistant Model

Categories
Misc

An Introduction to Model Merging for LLMs

One challenge organizations face when customizing large language models (LLMs) is the need to run multiple experiments, only one of which produces a useful model. While the cost of experimentation is typically low, and the results well worth the effort, this experimentation process does involve “wasted” resources, such as compute assets spent without their product being utilized…

Categories
Misc

Upcoming Webinar: Enhance Generative AI Model Accuracy Through High-Quality Data Processing

Learn how to build scalable data processing pipelines to create high-quality datasets.

Categories
Misc

NVIDIA GH200 Superchip Accelerates Inference by 2x in Multiturn Interactions with Llama Models

Deploying large language models (LLMs) in production environments often requires making hard trade-offs between enhancing user interactivity and increasing system throughput. While enhancing user interactivity requires minimizing time to first token (TTFT), increasing throughput requires increasing tokens per second. Improving one aspect often results in the decline of the other…

Categories
Misc

Supercharging Fraud Detection in Financial Services with Graph Neural Networks

Fraud in financial services is a massive problem. According to NASDAQ, in 2023, banks faced $442 billion in projected losses from payments, checks, and credit card fraud. It’s not just about the money, though. Fraud can tarnish a company’s reputation and frustrate customers when legitimate purchases are blocked. This is called a false positive. Unfortunately, these errors happen more often than…

Categories
Misc

Creating RAG-Based Question-and-Answer LLM Workflows at NVIDIA

The rapid development of solutions using retrieval-augmented generation (RAG) for question-and-answer LLM workflows has led to new types of system architectures. Our work at NVIDIA using AI for internal operations has produced several important findings about aligning system capabilities with user expectations. We found that regardless of the intended scope or use case…

Categories
Misc

Expert Support case study: Bolstering a RAG app with LLM-as-a-Judge