Categories
Misc

NVIDIA Research: An Unbiased Ray-Marching Transmittance Estimator

NVIDIA researchers will present their paper “An Unbiased Ray-Marching Transmittance Estimator” at SIGGRAPH 2021, August 9-13, showing a new way to compute visibility in scenes with complex volumetric effects.

NVIDIA researchers will present their paper “An Unbiased Ray-Marching Transmittance Estimator” at SIGGRAPH 2021, August 9-13, showing a new way to compute visibility in scenes with complex volumetric effects.

We present new methods for reducing a common source of noise in scenes with volumetric effects.

When Monte Carlo sampling is used to compute lighting effects such as soft shadows and global illumination, shadow rays are used to query the visibility between lights and surfaces.  In many scenes, visibility is a simple binary answer that is efficiently queried using the RTX RT Cores.  However, for volumetric effects such as clouds, smoke and explosions, visibility is a fractional quantity ranging from 0.0 to 1.0, and computing this quantity efficiently and accurately between any two points in a scene is an essential part of photorealistic real-time rendering.  Visibility that accounts for volumetric effects is also called transmittance.

We present a new algorithm, which we call unbiased ray marching, for evaluating this transmittance in general scenes. Our key result is a new statistically-unbiased Monte Carlo method that randomly queries the volume densities along a ray and computes a visibility estimate from these values.  For high-performance rendering, the goal is to compute an accurate low-noise estimate while minimizing the number of times the volume data needs to be  accessed.  

The research paper provides a new efficiency analysis of the transmittance estimation problem. This yields a number of key insights about when and how various transmittance estimators achieve the optimal balance of low noise and low cost.  Building on these insights, and leveraging previous work from graphics, physics and statistics, we derive a new transmittance estimator that is universally optimal relative to 50 years of prior art, and often ten times more efficient than earlier methods.  Technically, the end result is based on a power-series expansion of the exponential function that appears in the transmittance equation, and 90% of the time our new estimator simply evaluates the first term in this expansion. This first term corresponds to the traditional biased ray-marching algorithm. The subsequent terms of the expansion, which our estimator evaluates 10% of the time, correct for the bias.  This key insight allows us to benefit from the efficiency of ray marching without being plagued by its bias – occasional light-leaking artifacts seen as, for example, overly bright clouds.

The new method helps to reduce the noise of shadows, such as the one cast on the floor by the smoke plume in the above figure where we see a dramatic reduction in noise at the same render time.

Putting a new twist on an old method, unbiased ray marching virtually eliminates a common source of noise in path-traced renderings of scenes with rich volumetric effects.

Learn more: Check out the project website.

Categories
Misc

ICYMI: NVIDIA Jetson for Robot Operating System

In this month’s technical digest we’re highlighting this powerful capability and offer a collection of resources to help ROS users familiarize with the power of Jetson, ISAAC SDK, ISAAC Sim and a success story from NVIDIA Inception Member, Aerobotics.

While you are likely already familiar with Jetson as the NVIDIA AI platform for edge AI, you might be surprised to learn that the power of Jetson isn’t limited to solutions built entirely on Jetson platforms. Jetpack on Jetson and Isaac SDK, our robotics platform, adds AI capabilities to existing autonomous machines and bridges dependencies for diverse hardware and software solutions.

In this month’s technical digest, we’re highlighting this powerful capability and offering a collection of resources to help Robotic Operating System (ROS) users get familiar with the power of Jetson, Isaac SDK, Isaac Sim, and a success story on a startup in the NVIDIA Inception program.

Jetson and ROS tutorials

Accelerating AI Modules for ROS and ROS 2 on NVIDIA Jetson Platform
A showcase of support for ROS and ROS 2 on NVIDIA Jetson developer kits.
Read now >

Building Robotics Applications Using ROS and NVIDIA Isaac SDK
Learn how to use a ROS application stack with NVIDIA Isaac and ISAAC Sim.
Read now >  

Implementing Robotics Applications with ROS 2 and AI on the NVIDIA Jetson Platform
Learn how to perform classification, object detection and pose estimation with ROS 2 on Jetson.
Read now >

Top 5 Jetson resources

Meet Jetson, the platform for AI at the edge
NVIDIA Jetson is used by professional developers to create breakthrough AI products across all industries, and by students and enthusiasts for hands-on AI learning and making amazing projects.
Meet Jetson >

Getting started with Jetson Developer Kits
Jetson developer kits are used by professionals to develop and test software for products based on Jetson modules, and by students and enthusiasts for projects and learning. Each developer kit includes a non-production specification Jetson module attached to a reference carrier board with standard hardware interfaces for flexible development and rapid prototyping.
Learn more >

Free 8-hour class on getting started with AI on Jetson Nano
In this course, you’ll use Jupyter iPython notebooks on your own Jetson Nano to build a deep learning classification project with computer vision models. Upon completion, you receive a certificate.
Start learning hands-on today >

Jetson and Embedded Systems developer forums
The Jetson and Embedded Systems community is active, vibrant, and continually monitored by the Jetson Product and Engineering teams.
Meet the community >

Jetson community projects
Explore and learn from Jetson projects created by us and our community.
Explore community projects >

Recommended hardware

Jetson Nano 2GB
Most affordable
At 457 GFLOPs for $59, this developer kit is the ultimate starter AI computer.
Learn more >
Jetson Xavier NX
Great for multiple use cases
Run modern neural networks and advanced AI applications within the footprint of a credit card.
Learn more >
Partner solutions
Production-ready now
Our ecosystem partners can support subsystem designs all the way up to fully realized autonomous machines with their associated requirements.
Learn more >

Inception spotlight

Bilberry: Using Jetson for Computer Vision for Crop Surveying
In 2016, the former dormmates at École Nationale Supérieure d’Arts et Métiers in Paris, founded Bilberry. The startup developed a solution to recognize weeds powered by the NVIDIA Jetson edge AI platform for precision application of herbicides at corn and wheat farms, offering as much as a 92% reduction in herbicide usage.
Learn more >

Do you have a startup? Join NVIDIA Inception’s global network of over 7,500 AI and data science startups.

Categories
Misc

NVIDIA Announces Instant AI Infrastructure for Enterprises

Built with Enterprise IT Partners and Deployed First at Equinix, NVIDIA AI LaunchPad Includes End-to-End NVIDIA Hardware and Software Stack to Accelerate AI from Hybrid Cloud to EdgeSANTA …

Categories
Misc

NVIDIA Fleet Command Scales Edge AI Services for Enterprises

SaaS Platform Helps Leading Device Makers, Factories, Warehouses and Retailers Deploy AI Products and ServicesSANTA CLARA, Calif., June 22, 2021 (GLOBE NEWSWIRE) — NVIDIA today announced …

Categories
Misc

Tesla Unveils Top AV Training Supercomputer Powered by NVIDIA A100 GPUs

Tackling one of the largest computing challenges of this lifetime requires larger than life computing. At CVPR this week, Andrej Karpathy, senior director of AI at Tesla, unveiled the in-house supercomputer the automaker is using to train deep neural networks for Autopilot and self-driving capabilities. The cluster uses 720 nodes of 8x NVIDIA A100 Tensor Read article >

The post Tesla Unveils Top AV Training Supercomputer Powered by NVIDIA A100 GPUs appeared first on The Official NVIDIA Blog.

Categories
Misc

Can someone help me fix this error?

Traceback (most recent call last):

File “c:UsersLandenOneDriveDesktopSign Language DetectorRealTimeObjectDetectionTensorflowscriptsgenerate_tfrecord.py”, line 62, in <module> label_map_dict = label_map_util.get_label_map_dict(label_map) File

“C:UsersLandenAppDataLocalProgramsPythonPython39libsite-packagesobject_detectionutilslabel_map_util.py”, line 164, in get_label_map_dict label_map = load_labelmap(label_map_path) File

“C:UsersLandenAppDataLocalProgramsPythonPython39libsite-packagesobject_detectionutilslabel_map_util.py”, line 133, in load_labelmap label_map_string = fid.read() File “C:UsersLandenAppDataLocalProgramsPythonPython39libsite-packagestensorflowpythonlibiofile_io.py”, line 117, in read self._preread_check() File

“C:UsersLandenAppDataLocalProgramsPythonPython39libsite-packagestensorflowpythonlibiofile_io.py”, line 79, in _preread_check self._read_buf = _pywrap_file_io.BufferedInputStream( TypeError: __init__(): incompatible constructor arguments.

The following argument types are supported:

  1. tensorflow.python.lib.io._pywrap_file_io.BufferedInputStream(filename: str, buffer_size: int, token: tensorflow.python.lib.io._pywrap_file_io.TransactionToken = None)

    Invoked with: item { name: “Hello” id: 1 } item { name: “ILoveYou” id: 2 } item { name: “No” id: 3 } item { name: “Thanks” id: 4 } item { name: “Yes” id: 5 } , 524288

submitted by /u/Spud783
[visit reddit] [comments]

Categories
Misc

Federated Learning with Homomorphic Encryption

In NVIDIA Clara Train 4.0, we added homomorphic encryption (HE) tools for federated learning (FL). HE enables you to compute data while the data is still encrypted. In Clara Train 3.1, all clients used certified SSL channels to communicate their local model updates with the server. The SSL certificates are needed to establish trusted communication … Continued

In NVIDIA Clara Train 4.0, we added homomorphic encryption (HE) tools for federated learning (FL). HE enables you to compute data while the data is still encrypted.

In Clara Train 3.1, all clients used certified SSL channels to communicate their local model updates with the server. The SSL certificates are needed to establish trusted communication channels and are provided through a third party that runs the provisioning tool and securely distributes them to the hospitals. This secures the communication to the server, but the server can still see the raw model (unencrypted) updates to do aggregation.

With Clara Train 4.0, the communication channels are still established using SSL certificates and the provisioning tool. However, each client optionally also receives additional keys to homomorphically encrypt their model updates before sending them to the server. The server doesn’t own a key and only sees the encrypted model updates.

With HE, the server can aggregate these encrypted weights and then send the updated model back to the client. The clients can decrypt the model weights because they have the keys and can then continue with the next round of training (Figure 1).

Diagram shows HE in action. Two clients send encrypted values to the FL server to be added together. The result is then sent back to the client and decrypted.
Figure 1. Why homomorphic encryption?

HE ensures that each client’s changes to the global model stays hidden by preventing the server from reverse-engineering the submitted weights and discovering any training data. This added security comes at a computational cost on the server. However, it can play an important role in healthcare in making sure that patient data stays secure at each hospital while still benefiting from using federated learning with other institutions.

HE implementation in Clara Train 4.0

We implemented secure aggregation during FL with HE using the TenSEAL library by OpenMined, a convenient Python wrapper around Microsoft SEAL. Both libraries are available as open-source and provide an implementation of Homomorphic encryption for arithmetic of approximate numbers, also known as the CKKS scheme, which was proposed as a solution for encrypted machine learning.

Our default uses the following HE setting, specified in Clara Train’s provisioning tool for FL:

# homomorphic encryption
he:
  lib: tenseal
  config:
    poly_modulus_degree: 8192
    coeff_mod_bit_sizes: [60, 40, 40]
    scale_bits: 40
    scheme: CKKS 

These settings are recommended and should work for most tasks but could be further optimized depending on your specific model architecture and machine learning problem. For more information about different settings, see this tutorial on the CKKS scheme and benchmarking.

HE benchmarking in FL

To compare the impact of HE to the overall training time and performance, we ran the following experiments. We chose SegResNet (a U-Net like architecture used to win the BraTS 2018 challenge) trained on the CT spleen segmentation task from the Medical Segmentation Decathlon.

Each federated learning run was trained for 100 rounds with each client training for 10 local epochs on their local data on an NVIDIA V100 (server and clients are running on localhost). In each run, half of the clients each used half of the training data (16/16) and half of the validation data (5/4), respectively. We recorded the total training time and best average validation dice score of the global model. We show the relative increase added by HE in Table 1.

There is a moderate increase in total training time of about 20% when encrypting the full model. This increase in training time is due to the added encryption and decryption steps and aggregation in homomorphically encrypted space. Our implementation enables you to reduce that extra time by only encrypting a subset of the model parameters, for example, all convolutional layers (“conv”). You could also encrypt just three of the key layers, such as the input, middle, and output layers.

The added training time is also due to increased message sizes needed to send the encrypted model gradient updates, requiring longer upload times. For SegResNet, we observe an increase from 19 MB to 283 MB using the HE setting mentioned earlier (~15x increase).

Setting Message size (MB) Nr. clients Training time Best global Dice Best epoch Re. Increase in training time
Raw 19 2 4:57:26 0.951 931
Raw 19 4 5:11:20 0.956 931
Raw 19 8 5:18:00 0.943 901
HE full 283 2 5:57:05 0.949 931 20.1%
HE full 283 4 6:00:05 0.946 811 15.7%
HE full 283 8 6:21:56 0.963 971 20.1%
HE conv layers 272 2 5:54:39 0.952 891 19.2%
HE conv layers 272 4 6:06:13 0.954 951 17.6%
HE conv layers 272 8 6:28:16 0.948 891 22.1%
HE three layers 43 2 5:12:10 0.957 811 5.0%
HE three layers 43 4 5:15:01 0.939 841 1.2%
HE three layers 43 8 5:19:02 0.949 971 0.3%
Table 1. Relative increase in training time by running FL with secure aggregation using HE.

Next, we compare the performance of FL using up to 30 clients with the server running on AWS. For reference, we used an m5a.2xlarge with eight vCPUs, 32-GB memory, and up to 2,880 Gbps network bandwidth. We show the average encryption, decryption, and upload time, comparing raw compared to encrypted model gradients being uploaded in Figure 2 and Table 2. You can see the longer upload times due to the larger message sizes needed by HE.

A graph showing the average encryption, decryption, and upload time comparing raw compared to encrypted model gradients being uploaded.
Figure 2. Running federated learning with 30 clients and the server on AWS
Time in seconds Mean Std. Dev.
Encryption time 5.01 1.18
Decryption time 0.95 0.04
Enc. upload time 38 71.170
Raw upload time 21.57 74.23
Table 2. Federated learning using homomorphic encrypted compared to raw model updates.

Try it out

If you’re interested in learning more about how to set up FL with homomorphic encryption using Clara Train, we have a great Jupyter notebook on GitHub that walks you through the setup.

HE can reduce model inversion or data leakage risks if there is a malicious or compromised server. However, your final models might still contain or memorize privacy-relevant information. That’s where differential privacy methods can be a useful addition to HE. Clara Train SDK implements the sparse vector technique (SVT) and partial model sharing that can help preserve privacy. For more information, see Privacy-preserving Federated Brain Tumour Segmentation. Keep in mind that there is a tradeoff between model performance and privacy protection.

Categories
Misc

Newb needs best Machine Learning approach for Number Sequence

I have a sequence of 20 numbers, 1 – 20 and all numbers must be put in to sequence once. I have a formula which generates a score for each sequence input. What would be the best Machine Learning approach to find the sequence with the lowest score. It is very noisy and has many minimums.

Thanks for your help.

James

submitted by /u/JackFeet
[visit reddit] [comments]

Categories
Misc

AMD Graphics

I am trying to run a CNN model on a computer with an AMD Raedon 5500. I ran the previous models on computers with Nvidia cards and they worked but for some reason tensorflow cannot see the GPU. It says the list of physical devices is empty. I know that the GPU is in PCI slot 3 and not 1, would that be an issue?

submitted by /u/NameError-undefined
[visit reddit] [comments]

Categories
Offsites

Google at CVPR 2021

This week marks the start of the 2021 Conference on Computer Vision and Pattern Recognition (CVPR 2021), the premier annual computer vision event consisting of the main conference, workshops and tutorials. As a leader in computer vision research and a Champion Level Sponsor, Google will have a strong presence at CVPR 2021, with over 70 publications accepted, along with the organization of and participation in multiple workshops and tutorials.

If you are participating in CVPR this year, please visit our virtual booth to learn about Google research into the next generation of intelligent systems that utilize the latest machine learning techniques applied to various areas of machine perception.

You can also learn more about our research being presented at CVPR 2021 in the list below (Google affiliations in bold).

Organizing Committee Members

General Chair: Rahul Sukthankar
Finance Chair: Ramin Zabih
Workshop Chair: Caroline Pantofaru
Area Chairs: Chen Sun, Golnaz Ghiasi, Jonathan Barron, Kostas Rematas, Negar Rostamzadeh, Noah Snavely, Sanmi Koyejo, Tsung-Yi Lin

Publications

Cross-Modal Contrastive Learning for Text-to-Image Generation (see the blog post)
Han Zhang, Jing Yu Koh, Jason Baldridge, Honglak Lee*, Yinfei Yang

Learning Graph Embeddings for Compositional Zero-Shot Learning
Muhammad Ferjad Naeem, Yongqin Xian, Federico Tombari, Zeynep Akata

SPSG: Self-Supervised Photometric Scene Generation From RGB-D Scans
Angela Dai, Yawar Siddiqui, Justus Thies, Julien Valentin, Matthias Nießner

3D-MAN: 3D Multi-Frame Attention Network for Object Detection
Zetong Yang*, Yin Zhou, Zhifeng Chen, Jiquan Ngiam

MIST: Multiple Instance Spatial Transformer
Baptiste Angles, Yuhe Jin, Simon Kornblith, Andrea Tagliasacchi, Kwang Moo Yi

OCONet: Image Extrapolation by Object Completion
Richard Strong Bowen*, Huiwen Chang, Charles Herrmann*, Piotr Teterwak*, Ce Liu, Ramin Zabih

Ranking Neural Checkpoints
Yandong Li, Xuhui Jia, Ruoxin Sang, Yukun Zhu, Bradley Green, Liqiang Wang, Boqing Gong

LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces From Video Using Pose and Lighting Normalization
Avisek Lahiri, Vivek Kwatra, Christian Frueh, John Lewis, Chris Bregler

Differentiable Patch Selection for Image Recognition
Jean-Baptiste Cordonnier*, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences
Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Rohit Pandey, Cem Keskin, Ruofei Du, Deqing Sun, Sofien Bouaziz, Sean Fanello, Ping Tan, Yinda Zhang

VIP-DeepLab: Learning Visual Perception With Depth-Aware Video Panoptic Segmentation (see the blog post)
Siyuan Qiao*, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

DeFMO: Deblurring and Shape Recovery of Fast Moving Objects
Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Jiri Matas, Marc Pollefeys

HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps
Lu Mi, Hang Zhao, Charlie Nash, Xiaohan Jin, Jiyang Gao, Chen Sun, Cordelia Schmid, Nir Shavit, Yuning Chai, Dragomir Anguelov

Wide-Baseline Relative Camera Pose Estimation With Directional Learning
Kefan Chen, Noah Snavely, Ameesh Makadia

MobileDets: Searching for Object Detection Architectures for Mobile Accelerators
Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Yongzhe Wang, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen

SMURF: Self-Teaching Multi-Frame Unsupervised RAFT With Full-Image Warping
Austin Stone, Daniel Maurer, Alper Ayvaci, Anelia Angelova, Rico Jonschkowski

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo, Piyush Sharma, Nan Ding, Radu Soricut

Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces
Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc Van Gool

MeanShift++: Extremely Fast Mode-Seeking With Applications to Segmentation and Object Tracking
Jennifer Jang, Heinrich Jiang

Repopulating Street Scenes
Yifan Wang*, Andrew Liu, Richard Tucker, Jiajun Wu, Brian L. Curless, Steven M. Seitz, Noah Snavely

MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers (see the blog post)
Huiyu Wang*, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

IBRNet: Learning Multi-View Image-Based Rendering
Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul Srinivasan, Howard Zhou, Jonathan T. Barron, Ricardo Martin-Brualla, Noah Snavely, Thomas Funkhouser

From Points to Multi-Object 3D Reconstruction
Francis Engelmann*, Konstantinos Rematas, Bastian Leibe, Vittorio Ferrari

Learning Compositional Representation for 4D Captures With Neural ODE
Boyan Jiang, Yinda Zhang, Xingkui Wei, Xiangyang Xue, Yanwei Fu

Guided Integrated Gradients: An Adaptive Path Method for Removing Noise
Andrei Kapishnikov, Subhashini Venugopalan, Besim Avci, Ben Wedin, Michael Terry, Tolga Bolukbasi

De-Rendering the World’s Revolutionary Artefacts
Shangzhe Wu*, Ameesh Makadia, Jiajun Wu, Noah Snavely, Richard Tucker, Angjoo Kanazawa

Spatiotemporal Contrastive Video Representation Learning
Rui Qian, Tianjian Meng, Boqing Gong, Ming-Hsuan Yang, Huisheng Wang, Serge Belongie, Yin Cui

Decoupled Dynamic Filter Networks
Jingkai Zhou, Varun Jampani, Zhixiong Pi, Qiong Liu, Ming-Hsuan Yang

NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering Using RGB Cameras
Xin Suo, Yuheng Jiang, Pei Lin, Yingliang Zhang, Kaiwen Guo, Minye Wu, Lan Xu

Regularizing Generative Adversarial Networks Under Limited Data
Hung-Yu Tseng*, Lu Jiang, Ce Liu, Ming-Hsuan Yang, Weilong Yang

SceneGraphFusion: Incremental 3D Scene Graph Prediction From RGB-D Sequences
Shun-Cheng Wu, Johanna Wald, Keisuke Tateno, Nassir Navab, Federico Tombari

NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis
Pratul P. Srinivasan, Boyang Deng, Xiuming Zhang, Matthew Tancik, Ben Mildenhall, Jonathan T. Barron

Adversarially Adaptive Normalization for Single Domain Generalization
Xinjie Fan*, Qifei Wang, Junjie Ke, Feng Yang, Boqing Gong, Mingyuan Zhou

Adaptive Prototype Learning and Allocation for Few-Shot Segmentation
Gen Li, Varun Jampani, Laura Sevilla-Lara, Deqing Sun, Jonghyun Kim, Joongkyu Kim

Adversarial Robustness Across Representation Spaces
Pranjal Awasthi, George Yu, Chun-Sung Ferng, Andrew Tomkins, Da-Cheng Juan

Background Splitting: Finding Rare Classes in a Sea of Background
Ravi Teja Mullapudi, Fait Poms, William R. Mark, Deva Ramanan, Kayvon Fatahalian

Searching for Fast Model Families on Datacenter Accelerators
Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc Le, Norman P. Jouppi

Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild With Pose Annotations (see the blog post)
Adel Ahmadyan, Liangkai Zhang, Jianing Wei, Artsiom Ablavatski, Matthias Grundmann

CutPaste: Self-Supervised Learning for Anomaly Detection and Localization
Chun-Liang Li, Kihyuk Sohn, Jinsung Yoon, Tomas Pfister

Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food
Quin Thames, Arjun Karpur, Wade Norris, Fangting Xia, Liviu Panait, Tobias Weyand, Jack Sim

CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning
Chen Wei*, Kihyuk Sohn, Clayton Mellina, Alan Yuille, Fan Yang

DetectoRS: Detecting Objects With Recursive Feature Pyramid and Switchable Atrous Convolution
Siyuan Qiao, Liang-Chieh Chen, Alan Yuille

DeRF: Decomposed Radiance Fields
Daniel Rebain, Wei Jiang, Soroosh Yazdani, Ke Li, Kwang Moo Yi, Andrea Tagliasacchi

Variational Transformer Networks for Layout Generation (see the blog post)
Diego Martin Arroyo, Janis Postels, Federico Tombari

Rich Features for Perceptual Quality Assessment of UGC Videos
Yilin Wang, Junjie Ke, Hossein Talebi, Joong Gon Yim, Neil Birkbeck, Balu Adsumilli, Peyman Milanfar, Feng Yang

Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds
Li Yi, Boqing Gong, Thomas Funkhouser

Neural Descent for Visual 3D Human Pose and Shape
Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation
Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji

Look Before You Speak: Visually Contextualized Utterances
Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid

LASR: Learning Articulated Shape Reconstruction From a Monocular Video
Gengshan Yang*, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu

MoViNets: Mobile Video Networks for Efficient Video Recognition
Dan Kondratyuk, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong

No Shadow Left Behind: Removing Objects and Their Shadows Using Approximate Lighting and Geometry
Edward Zhang, Ricardo Martin-Brualla, Janne Kontkanen, Brian Curless

On Robustness and Transferability of Convolutional Neural Networks
Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander D’Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, Mario Lucic

Robust and Accurate Object Detection via Adversarial Learning
Xiangning Chen, Cihang Xie, Mingxing Tan, Li Zhang, Cho-Jui Hsieh, Boqing Gong

To the Point: Efficient 3D Object Detection in the Range Image With Graph Convolution Kernels
Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Benjamin Caine, Vijay Vasudevan, Xiao Zhang, Dragomir Anguelov

Bottleneck Transformers for Visual Recognition
Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani

Faster Meta Update Strategy for Noise-Robust Deep Learning
Youjiang Xu, Linchao Zhu, Lu Jiang, Yi Yang

Correlated Input-Dependent Label Noise in Large-Scale Image Classification
Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent

Learned Initializations for Optimizing Coordinate-Based Neural Representations
Matthew Tancik, Ben Mildenhall, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, Ren Ng

Simple Copy-Paste Is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi, Yin Cui, Aravind Srinivas*, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, Barret Zoph

Function4D: Real-Time Human Volumetric Capture From Very Sparse Consumer RGBD Sensors
Tao Yu, Zerong Zheng, Kaiwen Guo, Pengpeng Liu, Qionghai Dai, Yebin Liu

RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection
Pei Sun, Weiyue Wang, Yuning Chai, Gamaleldin Elsayed, Alex Bewley, Xiao Zhang, Cristian Sminchisescu, Dragomir Anguelov

NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections
Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, Daniel Duckworth

Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments
Siyan Dong, Qingnan Fan, He Wang, Ji Shi, Li Yi, Thomas Funkhouser, Baoquan Chen, Leonidas Guibas

Taskology: Utilizing Task Relations at Scale
Yao Lu, Sören Pirk, Jan Dlabal, Anthony Brohan, Ankita Pasad*, Zhao Chen, Vincent Casser, Anelia Angelova, Ariel Gordon

Omnimatte: Associating Objects and Their Effects in Video
Erika Lu, Forrester Cole, Tali Dekel, Andrew Zisserman, William T. Freeman, Michael Rubinstein

AutoFlow: Learning a Better Training Set for Optical Flow
Deqing Sun, Daniel Vlasic, Charles Herrmann, Varun Jampani, Michael Krainin, Huiwen Chang, Ramin Zabih, William T. Freeman, and Ce Liu

Unsupervised Multi-Source Domain Adaptation Without Access to Source Data
Sk Miraj Ahmed, Dripta S. Raychaudhuri, Sujoy Paul, Samet Oymak, Amit K. Roy-Chowdhury

Meta Pseudo Labels
Hieu Pham, Zihang Dai, Qizhe Xie, Minh-Thang Luong, Quoc V. Le

Spatially-Varying Outdoor Lighting Estimation From Intrinsics
Yongjie Zhu, Yinda Zhang, Si Li, Boxin Shi

Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization
Long Zhao*, Yuxiao Wang, Jiaping Zhao, Liangzhe Yuan, Jennifer J. Sun, Florian Schroff, Hartwig Adam, Xi Peng, Dimitris Metaxas, Ting Liu

Benchmarking Representation Learning for Natural World Image Collections
Grant Van Horn, Elijah Cole, Sara Beery, Kimberly Wilber, Serge Belongie, Oisin Mac Aodha

Scaling Local Self-Attention for Parameter Efficient Visual Backbones
Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens

KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control
Tomas Jakab*, Richard Tucker, Ameesh Makadia, Jiajun Wu, Noah Snavely, Angjoo Kanazawa

HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching
Vladimir Tankovich, Christian Häne, Yinda Zhang, Adarsh Kowdle, Sean Fanello, Sofien Bouaziz

POSEFusion: Pose-Guided Selective Fusion for Single-View Human Volumetric Capture
Zhe Li, Tao Yu, Zerong Zheng, Kaiwen Guo, Yebin Liu

Workshops (only Google affiliations are noted)

Media Forensics
Organizers: Christoph Bregler

Safe Artificial Intelligence for Automated Driving
Invited Speakers: Been Kim

VizWiz Grand Challenge
Organizers: Meredith Morris

3D Vision and Robotics
Invited Speaker: Andy Zeng

New Trends in Image Restoration and Enhancement Workshop and Challenges on Image and Video Processing
Organizers: Ming-Hsuan Yang Program Committee: George Toderici, Ming-Hsuan Yang

2nd Workshop on Extreme Vision Modeling
Invited Speakers: Quoc Le, Chen Sun

First International Workshop on Affective Understanding in Video
Organizers: Gautam Prasad, Ting Liu

Adversarial Machine Learning in Real-World Computer Vision Systems and Online Challenges
Program Committee: Nicholas Carlini, Nicolas Papernot

Ethical Considerations in Creative Applications of Computer Vision
Invited Speaker: Alex Hanna Organizers: Negar Rostamzadeh, Emily Denton, Linda Petrini

Visual Question Answering Workshop
Invited Speaker: Vittorio Ferrari

Sixth International Skin Imaging Collaboration (ISIC) Workshop on Skin Image Analysis
Invited Speakers: Sandra Avila Organizers: Yuan Liu Steering Committee: Yuan Liu, Dale Webster

The 4th Workshop and Prize Challenge: Bridging the Gap between Computational Photography and Visual Recognition (UG2+) in Conjunction with IEEE CVPR 2021
Invited Speakers: Peyman Milanfar, Chelsea Finn

The 3rd CVPR Workshop on 3D Scene Understanding for Vision, Graphics, and Robotics
Invited Speaker: Andrea Tagliasacchi

Robust Video Scene Understanding: Tracking and Video Segmentation
Organizers: Jordi Pont-Tuset, Sergi Caelles, Jack Valmadre, Alex Bewley

4th Workshop and Challenge on Learned Image Compression
Invited Speaker: Rianne van den Berg Organizers: George Toderici, Lucas Theis, Johannes Ballé, Eirikur Agustsson, Nick Johnston, Fabian Mentzer

The Third Workshop on Precognition: Seeing Through the Future
Invited Speaker: Anelia Angelova
Organizers: Utsav Prabhu Program Committee: Chen Sun, David Ross

Computational Cameras and Displays
Organizers: Tali Dekel Keynote Talks: Paul Debevec Program Committee: Ayan Chakrabarti, Tali Dekel

2nd Embodied AI Workshop
Organizing Committee: Anthony Francis Challenge Organizers: Peter Anderson, Anthony Francis, Alex Ku, Alexander Toshev Scientific Advisory Board: Alexander Toshev

Responsible Computer Vision
Program Committee: Caroline Pantofaru, Utsav Prabhu, Susanna Ricco, Negar Rostamzadeh, Candice Schumann

Dynamic Neural Networks Meets Computer Vision
Invited Speaker: Azalia Mirhoseini

Interactive Workshop on Bridging the Gap between Subjective and Computational Measurements of Machine Creativity
Invited Speaker: David Bau

GAZE 2021: The 3rd International Workshop on Gaze Estimation and Prediction in the Wild
Organizer: Thabo Beeler Program Committee: Thabo Beeler

Sight and Sound
Organizers: William Freeman

Future of Computer Vision Datasets
Invited Speaker: Emily Denton, Caroline Pantofaru

Open World Vision
Invited Speakers: Rahul Sukthankar

The 3rd Workshop on Learning from Unlabeled Videos
Organizers: Anelia Angelova, Honglak Lee Program Committee: AJ Piergiovanni

4th International Workshop on Visual Odometry and Computer Vision Applications Based on Location Clues — With a Focus on Mobile Platform Applications
Organizers: Anelia Angelova

4th Workshop on Efficient Deep Learning for Computer Vision
Invited Speaker: Andrew Howard
Organizers: Pete Warden, Andrew Howard

Second International Workshop on Large Scale Holistic Video Understanding
Invited Speaker: Cordelia Schmid Program Committee: AJ Piergiovanni Organizers: David Ross

Neural Architecture Search 1st Lightweight NAS Challenge and Moving Beyond
Invited Speakers: Sara Sabour

The Second Workshop on Fair, Data-Efficient, and Trusted Computer Vision
Invited Speakers: Gaurav Aggarwal

The 17th Embedded Vision Workshop
General Chair: Anelia Angelova

8th Workshop on Fine-Grained Visual Categorization
Organizers: Christine Kaeser-Chen, Kimberly Wilber

AI for Content Creation
Invited Speaker: Tali Dekel, Jon Barron, Emily Denton Organizers: Deqing Sun

Frontiers of Monocular 3D Perception
Invited Speakers: Anelia Angelova, Cordelia Schmid, Noah Snavely

Beyond Fairness: Towards a Just, Equitable, and Accountable Computer Vision
Organizers: Emily Denton

The 1st Workshop on Future Video Conferencing
Invited Speakers: Chuo-Ling Chang, Sergi Caelles

Tutorials (only Google affiliations are noted)

Tutorial on Fairness Accountability Transparency and Ethics in Computer Vision
Organizer: Emily Denton

Data-Efficient Learning in An Imperfect World
Organizers: Boqing Gong, Ting Chen

Semantic Segmentation of Point Clouds: a Deep Learning Framework for Cultural Heritage
Invited Speaker: Manzil Zaheer

From VQA to VLN: Recent Advances in Vision-and-Language Research
Organizer: Peter Anderson * Indicates work done while at Google