Categories
Misc

Advanced NVIDIA CUDA Kernel Optimization Techniques: Handwritten PTX

As accelerated computing continues to drive application performance in all areas of AI and scientific computing, there’s a renewed interest in GPU optimization…

As accelerated computing continues to drive application performance in all areas of AI and scientific computing, there’s a renewed interest in GPU optimization techniques to ensure applications obtain the best possible performance. As an application developer, there are many ways to program GPUs, up and down the software stack. In this post, we introduce some of the different levels of the stack…

Source

Categories
Misc

NVIDIA Omniverse: What Developers Need to Know About Migration Away From Launcher

As part of continued efforts to ensure NVIDIA Omniverse is a developer-first platform, NVIDIA will be deprecating the Omniverse Launcher on Oct. 1. Doing so…

As part of continued efforts to ensure NVIDIA Omniverse is a developer-first platform, NVIDIA will be deprecating the Omniverse Launcher on Oct. 1. Doing so will enable a more open, integrated, and efficient development experience. Removing the Launcher will streamline how developers access essential tools and resources on the platforms they already use and trust.

Source

Categories
Misc

NVIDIA RTX AI Accelerates FLUX.1 Kontext — Now Available for Download

Black Forest Labs, one of the world’s leading AI research labs, just changed the game for image generation. The lab’s FLUX.1 image models have earned global attention for delivering high-quality visuals with exceptional prompt adherence. Now, with its new FLUX.1 Kontext model, the lab is fundamentally changing how users can guide and refine the image
Read Article

Categories
Misc

Optimizing FLUX.1 Kontext for Image Editing with Low-Precision Quantization

FLUX.1 Kontext, the recently released model from Black Forest Labs, is a fascinating addition to the repertoire of community image generation models. The open…

FLUX.1 Kontext, the recently released model from Black Forest Labs, is a fascinating addition to the repertoire of community image generation models. The open weights FLUX.1 Kontext [dev] variant, the focus of this post, is a model meticulously optimized for image-to-image transformation tasks. This pioneering tool stands out for its incremental image editing capabilities…

Source

Categories
Misc

Per-Tensor and Per-Block Scaling Strategies for Effective FP8 Training

Decorative image.In this blog post, we’ll break down the main FP8 scaling strategies—per-tensor scaling, delayed and current scaling, and per-block scaling (including the…Decorative image.

In this blog post, we’ll break down the main FP8 scaling strategies—per-tensor scaling, delayed and current scaling, and per-block scaling (including the Blackwell-backed MXFP8 format)—and explain why each is essential for maintaining numerical stability and accuracy during low-precision training. Understanding these approaches will help with choosing the right recipe for your own FP8 workflows.

Source

Categories
Misc

How to Build Custom AI Agents with NVIDIA NeMo Agent Toolkit Open Source Library

AI agents are revolutionizing the digital workforce by transforming business operations, automating complex tasks, and unlocking new efficiencies. With the…

AI agents are revolutionizing the digital workforce by transforming business operations, automating complex tasks, and unlocking new efficiencies. With the ability to collaborate, these agents can now work together to tackle complex problems and drive even greater impact. The NVIDIA NeMo Agent toolkit is an open source library that simplifies the integration of agents…

Source

Categories
Misc

How AI Factories Can Help Relieve Grid Stress

In many parts of the world, including major technology hubs in the U.S., there’s a yearslong wait for AI factories to come online, pending the buildout of new energy infrastructure to power them. Emerald AI, a startup based in Washington, D.C., is developing an AI solution that could enable the next generation of data centers
Read Article

Categories
Misc

Training and Finetuning Sparse Embedding Models with Sentence Transformers v5

Categories
Misc

NVIDIA NeMo Retriever Scores First Place for Visual Retrieval

NeMo Retriever tops several visual document retrieval leaderboards, setting new standards for RAG apps.

NeMo Retriever tops several visual document retrieval leaderboards, setting new standards for RAG apps.

Source

Categories
Misc

Best-in-Class Multimodal RAG: How the Llama 3.2 NeMo Retriever Embedding Model Boosts Pipeline Accuracy

Data goes far beyond text—it is inherently multimodal, encompassing images, video, audio, and more, often in complex and unstructured formats. While the…

Data goes far beyond text—it is inherently multimodal, encompassing images, video, audio, and more, often in complex and unstructured formats. While the common method is to convert PDFs, scanned images, slides, and other documents into text, it is challenging to capture all information in text format, as shown in Figure 1. The loss of visual information in text motivated the development of…

Source