An Introduction to Model Merging for LLMs
One challenge organizations face when customizing large language models (LLMs) is the need to run multiple experiments, which produces only one useful model. While the cost of experimentation is typically low and the results well worth the effort, the process does involve “wasted” resources, such as compute spent on models that are never deployed…
Learn how to build scalable data processing pipelines to create high-quality datasets.
Deploying large language models (LLMs) in production environments often requires making hard trade-offs between user interactivity and system throughput. Enhancing interactivity means minimizing time to first token (TTFT), while increasing throughput means maximizing tokens per second. Improving one aspect often degrades the other…
Fraud in financial services is a massive problem. According to NASDAQ, in 2023, banks faced $442 billion in projected losses from payments, checks, and credit card fraud. It’s not just about the money, though. Fraud can tarnish a company’s reputation and frustrate customers when legitimate purchases are blocked. This is called a false positive. Unfortunately, these errors happen more often than…
The rapid development of solutions using retrieval augmented generation (RAG) for question-and-answer LLM workflows has led to new types of system architectures. Our work at NVIDIA using AI for internal operations has yielded several key insights into aligning system capabilities with user expectations. We found that regardless of the intended scope or use case…
NVIDIA today announced that xAI’s Colossus supercomputer cluster in Memphis, Tennessee, comprising 100,000 NVIDIA Hopper Tensor Core GPUs, achieved this massive scale by using the NVIDIA Spectrum-X™ Ethernet networking platform for its Remote Direct Memory Access (RDMA) network. Spectrum-X is designed to deliver superior performance to multi-tenant, hyperscale AI factories using standards-based Ethernet.
Financial losses from worldwide credit card transaction fraud are expected to reach $43 billion by 2026. A new NVIDIA AI workflow for fraud detection running on Amazon Web Services (AWS) can help combat this burgeoning epidemic — using accelerated data processing and advanced algorithms to improve AI’s ability to detect and prevent credit card transaction fraud.
An overwhelming 91% of financial services industry (FSI) companies are either assessing artificial intelligence or already using it as a tool that’s driving innovation, improving operational efficiency and enhancing customer experiences. Generative AI — powered by NVIDIA NIM microservices and accelerated computing — can help organizations improve portfolio optimization, fraud detection, customer…