Categories
Misc

VMware and NVIDIA Unlock Generative AI for Enterprises

VMware Inc. and NVIDIA today announced the expansion of their strategic partnership to ready the hundreds of thousands of enterprises that run on VMware’s cloud infrastructure for the era of generative AI.

Categories
Misc

NVIDIA AI-Ready Servers From World’s Leading System Manufacturers to Supercharge Generative AI for Enterprises

Servers Featuring NVIDIA L40S GPUs and NVIDIA BlueField Coming Soon From Dell Technologies, Hewlett Packard Enterprise and Lenovo to Support VMware Private AI Foundation with NVIDIALAS VEGAS, …

Categories
Misc

NVIDIA Debuts AI-Enhanced Real-Time Ray Tracing for Games and Apps With New DLSS 3.5

The latest advancements in AI for gaming are in the spotlight today at Gamescom, the world’s largest gaming conference, as NVIDIA introduced a host of technologies, starting with DLSS 3.5, the next step forward of its breakthrough AI neural rendering technology. DLSS 3.5, NVIDIA’s latest innovation in AI-powered graphics is an image quality upgrade incorporated Read article >

Categories
Misc

Coming This Fall: NVIDIA DLSS 3.5 for Chaos Vantage, D5 Render, Omniverse and Popular Game Titles

On the eve of Gamescom, NVIDIA announced NVIDIA DLSS 3.5 featuring Ray Reconstruction — a new neural rendering AI model that creates more beautiful and realistic ray-traced visuals than traditional rendering methods — for real-time 3D creative apps and games.

Categories
Misc

NVIDIA Chief Scientist Bill Dally to Keynote at Hot Chips

Bill Dally — one of the world’s foremost computer scientists and head of NVIDIA’s research efforts — will describe the forces driving accelerated computing and AI in his keynote address at Hot Chips, an annual gathering of leading processor and system architects. Dally will detail advances in GPU silicon, systems and software that are delivering Read article >

Categories
Misc

NVIDIA Chief Scientist Bill Dally to Keynote at Hot Chips

Bill Dally — one of the world’s foremost computer scientists and head of NVIDIA’s research efforts — will describe the forces driving accelerated computing and AI in his keynote address at Hot Chips, an annual gathering of leading processor and system architects. Dally will detail advances in GPU silicon, systems and software that are delivering Read article >

Categories
Misc

Event: Speech AI Day

Speech AI Day promo asset, showing an illustration of several people from different countries standing around a globe saying hello in their language.On Sept. 20, join experts from leading companies at NVIDIA-hosted Speech AI Day.Speech AI Day promo asset, showing an illustration of several people from different countries standing around a globe saying hello in their language.

On Sept. 20, join experts from leading companies at NVIDIA-hosted Speech AI Day.

Categories
Offsites

Google at Interspeech 2023

This week, the 24th Annual Conference of the International Speech Communication Association (INTERSPEECH 2023) is being held in Dublin, Ireland, representing one of the world’s most extensive conferences on research and technology of spoken language understanding and processing. Experts in speech-related research fields gather to take part in oral presentations and poster sessions and to build collaborations across the globe.

We are excited to be a Platinum Sponsor of INTERSPEECH 2023, where we will be showcasing more than 20 research publications and supporting a number of workshops and special sessions. We welcome in-person attendees to drop by the Google Research booth to meet our researchers and participate in Q&As and demonstrations of some of our latest speech technologies, which help to improve accessibility and provide convenience in communication for billions of users. In addition, online attendees are encouraged to visit our virtual booth in Topia where you can get up-to-date information on research and opportunities at Google. Visit the @GoogleAI Twitter account to find out about Google booth activities (e.g., demos and Q&A sessions). You can also learn more about the Google research being presented at INTERSPEECH 2023 below (Google affiliations in bold).

Board and Organizing Committee

ISCA Board, Technical Committee Chair: Bhuvana Ramabhadran

Area Chairs include:
    Analysis of Speech and Audio Signals: Richard Rose

    Speech Synthesis and Spoken Language Generation: Rob Clark

    Special Areas: Tara Sainath

Satellite events

VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC-23)

Organizers include: Arsha Nagrani

ISCA Speech Synthesis Workshop (SSW12)

Speakers include: Rob Clark

Keynote talk – ISCA Medalist

Bridging Speech Science and Technology — Now and Into the Future

Speaker: Shrikanth Narayanan

Survey Talk

Speech Compression in the AI Era

Speaker: Jan Skoglund

Special session papers

Cascaded Encoders for Fine-Tuning ASR Models on Overlapped Speech
Richard Rose, Oscar Chang, Olivier Siohan

TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition
Hakan Erdogan, Scott Wisdom, Xuankai Chang*, Zalán Borsos, Marco Tagliasacchi, Neil Zeghidour, John R. Hershey

Papers

DeePMOS: Deep Posterior Mean-Opinion-Score of Speech
Xinyu Liang, Fredrik Cumlin, Christian Schüldt, Saikat Chatterjee

O-1: Self-Training with Oracle and 1-Best Hypothesis
Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Kartik Audhkhasi

Re-investigating the Efficient Transfer Learning of Speech Foundation Model Using Feature Fusion Methods
Zhouyuan Huo, Khe Chai Sim, Dongseong Hwang, Tsendsuren Munkhdalai, Tara N. Sainath, Pedro Moreno

MOS vs. AB: Evaluating Text-to-Speech Systems Reliably Using Clustered Standard Errors
Joshua Camp, Tom Kenter, Lev Finkelstein, Rob Clark

LanSER: Language-Model Supported Speech Emotion Recognition
Taesik Gong, Josh Belanich, Krishna Somandepalli, Arsha Nagrani, Brian Eoff, Brendan Jou

Modular Domain Adaptation for Conformer-Based Streaming ASR
Qiujia Li, Bo Li, Dongseong Hwang, Tara N. Sainath, Pedro M. Mengibar

On Training a Neural Residual Acoustic Echo Suppressor for Improved ASR
Sankaran Panchapagesan, Turaj Zakizadeh Shabestary, Arun Narayanan

MD3: The Multi-dialect Dataset of Dialogues
Jacob Eisenstein, Vinodkumar Prabhakaran, Clara Rivera, Dorottya Demszky, Devyani Sharma

Dual-Mode NAM: Effective Top-K Context Injection for End-to-End ASR
Zelin Wu, Tsendsuren Munkhdalai, Pat Rondon, Golan Pundak, Khe Chai Sim, Christopher Li

Using Text Injection to Improve Recognition of Personal Identifiers in Speech
Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran

How to Estimate Model Transferability of Pre-trained Speech Models?
Zih-Ching Chen, Chao-Han Huck Yang*, Bo Li, Yu Zhang, Nanxin Chen, Shuo-yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath

Improving Joint Speech-Text Representations Without Alignment
Cal Peyser, Zhong Meng, Ke Hu, Rohit Prabhavalkar, Andrew Rosenberg, Tara N. Sainath, Michael Picheny, Kyunghyun Cho

Text Injection for Capitalization and Turn-Taking Prediction in Speech Models
Shaan Bijwadia, Shuo-yiin Chang, Weiran Wang, Zhong Meng, Hao Zhang, Tara N. Sainath

Streaming Parrotron for On-Device Speech-to-Speech Conversion
Oleg Rybakov, Fadi Biadsy, Xia Zhang, Liyang Jiang, Phoenix Meadowlark, Shivani Agrawal

Semantic Segmentation with Bidirectional Language Models Improves Long-Form ASR
W. Ronny Huang, Hao Zhang, Shankar Kumar, Shuo-yiin Chang, Tara N. Sainath

Universal Automatic Phonetic Transcription into the International Phonetic Alphabet
Chihiro Taguchi, Yusuke Sakai, Parisa Haghani, David Chiang

Mixture-of-Expert Conformer for Streaming Multilingual ASR
Ke Hu, Bo Li, Tara N. Sainath, Yu Zhang, Francoise Beaufays

Real Time Spectrogram Inversion on Mobile Phone
Oleg Rybakov, Marco Tagliasacchi, Yunpeng Li, Liyang Jiang, Xia Zhang, Fadi Biadsy

2-Bit Conformer Quantization for Automatic Speech Recognition
Oleg Rybakov, Phoenix Meadowlark, Shaojin Ding, David Qiu, Jian Li, David Rim, Yanzhang He

LibriTTS-R: A Restored Multi-speaker Text-to-Speech Corpus
Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna

PronScribe: Highly Accurate Multimodal Phonemic Transcription from Speech and Text
Yang Yu, Matthew Perez*, Ankur Bapna, Fadi Haik, Siamak Tazari, Yu Zhang

Label Aware Speech Representation Learning for Language Identification
Shikhar Vashishth, Shikhar Bharadwaj, Sriram Ganapathy, Ankur Bapna, Min Ma, Wei Han, Vera Axelrod, Partha Talukdar


* Work done while at Google

Categories
Misc

The Proof Is in the Cloud: GeForce NOW Announces Ultimate KovaaK’s Challenge Results

The verdict is in: A GeForce NOW Ultimate membership raises the bar on gaming. Members have been tackling the Ultimate KovvaK’s challenge head-on and seeing for themselves how the power of Ultimate improves their gaming with 240 frames per second streaming. The popular training title that helps gamers improve their aim fully launches in the Read article >

Categories
Offsites

Wanna help judge SoME3?