
Think SMART: How to Optimize AI Factory Inference Performance

From AI assistants doing deep research to autonomous vehicles making split-second navigation decisions, AI adoption is exploding across industries. Behind every one of those interactions is inference — the stage after training where an AI model processes inputs and produces outputs in real time. Today’s most advanced AI reasoning models — capable of multistep logic…

Scaling AI Inference Performance and Flexibility with NVIDIA NVLink and NVLink Fusion

The exponential growth in AI model complexity has driven parameter counts from millions to trillions, demanding unprecedented computational resources that only clusters of GPUs can accommodate. The adoption of mixture-of-experts (MoE) architectures and AI reasoning with test-time scaling increases compute demands even more. To efficiently deploy inference, AI systems have evolved toward large…
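To make the MoE point concrete, below is a minimal, illustrative top-k routing layer in PyTorch; it is a sketch, not NVIDIA's implementation, and the module sizes are made up. Because each token is dispatched only to its top-k experts, a multi-GPU deployment turns this per-expert gather/scatter into all-to-all traffic over the interconnect, which is exactly the load NVLink-scale fabrics are built to carry.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Illustrative top-k mixture-of-experts layer (single device).

    In a multi-GPU deployment each expert can live on a different GPU, so the
    per-expert gather/scatter below becomes all-to-all traffic over the
    interconnect.
    """
    def __init__(self, d_model=512, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)   # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
            )
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: [num_tokens, d_model]
        gate = self.router(x).softmax(dim=-1)          # [num_tokens, n_experts]
        weights, expert_idx = torch.topk(gate, self.k, dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            rows, slots = (expert_idx == e).nonzero(as_tuple=True)
            if rows.numel() == 0:
                continue
            # Only the tokens routed to expert e are processed by it.
            out[rows] += weights[rows, slots].unsqueeze(-1) * expert(x[rows])
        return out

tokens = torch.randn(16, 512)
print(TopKMoE()(tokens).shape)                         # torch.Size([16, 512])
```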

GeForce NOW Brings RTX 5080 Power to the Ultimate Membership

Get a glimpse into the future of gaming. The NVIDIA Blackwell RTX architecture is coming to GeForce NOW in September, marking the service’s biggest upgrade yet. Turn any device into a powerhouse gaming rig with GeForce RTX 5080-class performance, next-generation AI features and a major leap forward in stunning cinematic visuals — all without raising…

Reinforcement Learning with NVIDIA NeMo-RL: Megatron-Core Support for Optimized Training Throughput

The initial release of NVIDIA NeMo-RL included training support through PyTorch DTensor (also known as FSDP2). This backend enables native integration with the Hugging Face ecosystem, quick experimentation, and scaling with PyTorch-native parallelisms (FSDP2, tensor parallel, sequence parallel, and context parallel). However, when model sizes approach hundreds of billions of parameters…
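As a rough sketch of what the DTensor/FSDP2 path looks like, the snippet below shards a Hugging Face causal LM with PyTorch's fully_shard API. It assumes PyTorch 2.6 or newer (where fully_shard is public under torch.distributed.fsdp), a torchrun launch with one process per GPU, and a placeholder checkpoint name; it is not NeMo-RL code.

```python
import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.fsdp import fully_shard   # public in PyTorch >= 2.6
from transformers import AutoModelForCausalLM

def main():
    dist.init_process_group("nccl")              # launched via torchrun, one process per GPU
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    # Placeholder checkpoint; any Hugging Face causal LM with a Llama-style
    # `model.layers` module list follows the same pattern.
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-3.2-1B", torch_dtype=torch.bfloat16
    )

    # 1-D data-parallel mesh over all ranks; fully_shard turns parameters
    # into DTensors sharded across this mesh.
    mesh = init_device_mesh("cuda", (dist.get_world_size(),))
    for layer in model.model.layers:             # shard each transformer block
        fully_shard(layer, mesh=mesh)
    fully_shard(model, mesh=mesh)                # then the remaining root parameters

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
    # ... RL or supervised fine-tuning loop goes here ...

if __name__ == "__main__":
    main()
```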

Into the Omniverse: How OpenUSD and Digital Twins Are Powering Industrial and Physical AI

Editor’s note: This blog is a part of Into the Omniverse, a series focused on how developers, 3D practitioners and enterprises can transform their workflows using the latest advances in OpenUSD and NVIDIA Omniverse. Investments in industrial AI and physical AI are driving increased demand for digital twins across industries. These physically accurate virtual replicas…
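For readers new to OpenUSD, the short sketch below authors a tiny USD layer with the open source pxr Python API (pip install usd-core). The prim names, attribute, and file path are illustrative; a production digital twin composes many such layers from CAD, simulation, and sensor data.

```python
from pxr import Gf, Sdf, Usd, UsdGeom

stage = Usd.Stage.CreateNew("factory_twin.usda")        # new USD layer on disk
world = UsdGeom.Xform.Define(stage, "/World")           # root transform prim
stage.SetDefaultPrim(world.GetPrim())

# One piece of equipment, stubbed in as a cube with a transform.
machine = UsdGeom.Cube.Define(stage, "/World/Machine_01")
machine.GetSizeAttr().Set(2.0)
UsdGeom.XformCommonAPI(machine.GetPrim()).SetTranslate(Gf.Vec3d(5.0, 0.0, 1.0))

# A custom attribute that downstream tools (or a physical-AI pipeline)
# could bind to live telemetry; the name and value are made up.
machine.GetPrim().CreateAttribute(
    "asset:serialNumber", Sdf.ValueTypeNames.String
).Set("M-0042")

stage.GetRootLayer().Save()
```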

Deploying Your Omniverse Kit Apps at Scale

Running 3D applications that take advantage of advanced rendering and simulation technologies often requires users to navigate complex installs and have access to advanced infrastructure. NVIDIA Omniverse Kit App Streaming helps developers reduce this friction, enabling them to deploy and stream applications built with NVIDIA Omniverse SDKs and libraries directly to a browser. Whether you’…

New Nemotron Nano 2 Open Reasoning Model Tops Leaderboard and Delivers 6x Higher Throughput

There’s a new leaderboard-topping NVIDIA Nemotron Nano 2 model. It’s an open model with leading accuracy and up to 6x higher throughput compared to the next best open model in the 8B size category.
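A quick way to try the model is through Hugging Face Transformers, sketched below. The checkpoint ID is an assumption based on the model's announced name, and the hybrid architecture typically needs trust_remote_code=True; check the model card for the exact ID and the recommended serving stack (for example vLLM or TensorRT-LLM) before relying on it.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/NVIDIA-Nemotron-Nano-9B-v2"   # assumed checkpoint name; verify on the model card
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Summarize why KV-cache size limits batch size."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=256)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```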


Generate Images with Claude and Hugging Face

Scaling AI Factories with Co-Packaged Optics for Better Power Efficiency

As artificial intelligence redefines the computing landscape, the network has become the critical backbone shaping the data center of the future. Large language model training performance is determined not only by compute resources but by the agility, capacity, and intelligence of the underlying network. The industry is witnessing the evolution from traditional, CPU-centric infrastructures toward…

New Lightweight AI Model for Project G-Assist Brings Support for 6GB NVIDIA GeForce RTX and RTX PRO GPUs

At Gamescom, NVIDIA is releasing its first major update to Project G‑Assist — an experimental on-device AI assistant that allows users to tune their NVIDIA RTX systems with voice and text commands. The update brings a new AI model that uses 40% less VRAM, improves tool-calling intelligence and extends G-Assist support to all RTX GPUs