Categories
Offsites

Does Your Medical Image Classifier Know What It Doesn’t Know?

Deep machine learning (ML) systems have achieved considerable success in medical image analysis in recent years. One major contributing factor is access to abundant labeled datasets, which are used to train highly effective supervised deep learning models. However, in the real world, these models may encounter samples exhibiting rare conditions that are individually too infrequent to support per-condition classifiers. Nevertheless, such conditions can be collectively common: they follow a long-tail distribution, and taken together they represent a significant portion of cases. For example, in a recent deep learning dermatological study, hundreds of rare conditions made up around 20% of the cases encountered by the model at test time.

To prevent models from generating erroneous outputs on rare samples at test time, deep learning systems need the ability to recognize when a sample exhibits a condition they cannot identify. Detecting previously unseen conditions can be thought of as an out-of-distribution (OOD) detection task. By successfully identifying OOD samples, preventive measures can be taken, like abstaining from prediction or deferring to a human expert.

Traditional computer vision OOD detection benchmarks focus on detecting dataset distribution shifts. For example, a model may be trained on CIFAR images but be presented with street view house numbers (SVHN) as OOD samples, two datasets with very different semantic content. Other benchmarks seek to detect slight differences in semantic information, e.g., between images of a truck and a pickup truck, or between two different skin conditions. The semantic distribution shifts in such near-OOD detection problems are more subtle than dataset distribution shifts and are therefore harder to detect.

In “Does Your Dermatology Classifier Know What it Doesn’t Know? Detecting the Long-Tail of Unseen Conditions”, published in Medical Image Analysis, we tackle this near-OOD detection task in the application of dermatology image classification. We propose a novel hierarchical outlier detection (HOD) loss, which leverages existing fine-grained labels of rare conditions from the long tail and modifies the loss function to group unseen conditions and improve identification of these near OOD categories. Coupled with various representation learning methods and the diverse ensemble strategy, this approach enables us to achieve better performance for detecting OOD inputs.

The Near-OOD Dermatology Dataset
We curated a near-OOD dermatology dataset that includes 26 inlier conditions, each of which is represented by at least 100 samples, and 199 rare conditions considered to be outliers. Outlier conditions can have as few as one sample per condition. The separation criterion between inlier and outlier conditions can be specified by the user; here the cutoff sample size between inlier and outlier was 100, consistent with our previous study. The outliers are further split into training, validation, and test sets that are intentionally mutually exclusive to mimic real-world scenarios, where rare conditions seen at test time may not have been seen in training.
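
As a rough illustration of this split (the even three-way partition of outlier conditions and the function below are simplifications, not the paper’s exact protocol):

import collections
import random

def split_conditions(samples, inlier_cutoff=100, seed=0):
    # samples: list of (image_id, condition) pairs.
    counts = collections.Counter(cond for _, cond in samples)
    inliers = {c for c, n in counts.items() if n >= inlier_cutoff}
    outliers = sorted(c for c in counts if c not in inliers)
    random.Random(seed).shuffle(outliers)
    # Outlier *conditions* (not just images) are partitioned, so outlier classes
    # seen at test time never appear in training.
    k = len(outliers) // 3
    return inliers, {
        "train": set(outliers[:k]),
        "val": set(outliers[k:2 * k]),
        "test": set(outliers[2 * k:]),
    }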

Long-tail distribution of different dermatological conditions in our dataset: the 26 inlier conditions with at least 100 samples (blue) and the remaining 199 rare outlier conditions (orange). Outlier conditions can have as few as one sample per condition.
                    Train set          Validation set     Test set
                    Inlier   Outlier   Inlier   Outlier   Inlier   Outlier
Number of classes   26       68        26       66        26       65
Number of samples   8854     1111      1251     1082      1192     937
Inlier and outlier conditions in our benchmark dataset and detailed dataset split statistics. The outliers are further split into mutually exclusive train, validation, and test sets.

Hierarchical Outlier Detection Loss
We propose to leverage “known outlier” samples during training to aid detection of “unknown outlier” samples at test time. Our novel hierarchical outlier detection (HOD) loss performs a fine-grained classification over the individual inlier and outlier classes and, in parallel, a coarse-grained binary classification of inliers vs. outliers in a hierarchical setup (see the figure below). Our experiments confirmed that HOD is more effective than performing a coarse-grained classification followed by a fine-grained classification, as the latter creates a bottleneck that hurts the performance of the fine-grained classifier.

We use the sum of the predictive probabilities of the outlier classes as the OOD score. As the primary OOD detection metric we use the area under the receiver operating characteristic (AUROC) curve, which ranges between 0 and 1 and measures how well inliers and outliers are separated. A perfect OOD detector, which separates all inliers from outliers, is assigned an AUROC score of 1. A popular baseline method, called the reject bucket, separates each inlier class individually from the outliers, which are grouped into a single dedicated abstention class. In addition to a fine-grained classification for each individual inlier and outlier class, the HOD loss–based approach separates the inliers collectively from the outliers with a coarse-grained prediction loss, resulting in better generalization. While the two approaches are similar, we demonstrate that our HOD loss–based approach outperforms other baseline methods that leverage outlier data during training, achieving an AUROC score of 79.4% on the benchmark, a significant improvement over the reject bucket's 75.6%.

Our model architecture and the HOD loss. The encoder (green) represents the wide ResNet 101×3 model pre-trained with different representation learning models (ImageNet, BiT, SimCLR, and MICLe; see below). The output of the encoder is sent to the HOD loss where fine-grained and coarse-grained predictions for inliers (blue) and outliers (orange) are obtained. The coarse predictions are obtained by summing over the fine-grained probabilities as indicated in the figure. The OOD score is defined as the sum of the probabilities of outlier classes.
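
To make the hierarchical setup concrete, the following is a minimal TensorFlow sketch of the two loss terms and the OOD score (the class ordering, the coarse-loss weight, and the function names are illustrative assumptions, not the paper’s code):

import tensorflow as tf

def hod_loss(logits, fine_labels, coarse_labels, num_inliers, coarse_weight=0.1):
    # logits: [batch, num_inliers + num_training_outliers], inlier classes first.
    # fine_labels: integer ids over all fine-grained classes.
    # coarse_labels: 0 for inlier, 1 for outlier.
    fine_probs = tf.nn.softmax(logits, axis=-1)
    # Coarse probabilities are sums of the fine-grained probabilities per group.
    p_inlier = tf.reduce_sum(fine_probs[:, :num_inliers], axis=-1)
    p_outlier = tf.reduce_sum(fine_probs[:, num_inliers:], axis=-1)
    coarse_probs = tf.stack([p_inlier, p_outlier], axis=-1)
    fine_loss = tf.keras.losses.sparse_categorical_crossentropy(fine_labels, fine_probs)
    coarse_loss = tf.keras.losses.sparse_categorical_crossentropy(coarse_labels, coarse_probs)
    # coarse_weight is a placeholder hyperparameter, not the value used in the paper.
    return tf.reduce_mean(fine_loss + coarse_weight * coarse_loss)

def ood_score(logits, num_inliers):
    # OOD score = sum of the predictive probabilities of the outlier classes.
    probs = tf.nn.softmax(logits, axis=-1)
    return tf.reduce_sum(probs[:, num_inliers:], axis=-1)

Given OOD scores and binary outlier labels for a held-out set, AUROC can then be computed with a standard routine such as sklearn.metrics.roc_auc_score.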

Representation Learning and the Diverse Ensemble Strategy
We also investigate how different types of representation learning help OOD detection in conjunction with HOD, using ImageNet, BiT-L, SimCLR, and MICLe pre-training. We observe that including the HOD loss improves OOD performance compared to the reject bucket baseline for all four representation learning methods.

Representation learning method    OOD detection metric (AUROC %)
                                  With reject bucket    With HOD loss
ImageNet                          74.7                  77.0
BiT-L                             75.6                  79.4
SimCLR                            75.2                  77.2
MICLe                             76.7                  78.8
OOD detection performance for different representation learning models with reject bucket and with HOD loss.

Another orthogonal approach for improving OOD detection performance and accuracy is the deep ensemble, which aggregates outputs from multiple independently trained models to provide a final prediction. We build upon the deep ensemble, but instead of using a fixed architecture with a fixed pre-training, we combine models with different representation learning methods (ImageNet, BiT-L, SimCLR, and MICLe) and different objective functions (HOD and the reject bucket). We call this a diverse ensemble strategy, and we demonstrate that it outperforms the deep ensemble in both OOD detection performance and inlier accuracy.
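
As a rough sketch of the aggregation step (averaging member probabilities before computing the OOD score; the paper’s exact aggregation rule may differ):

import numpy as np

def diverse_ensemble_ood_score(member_probs, num_inliers):
    # member_probs: list of [batch, num_classes] probability arrays, one per ensemble
    # member (e.g., BiT-L + HOD, SimCLR + reject bucket, ...), with inlier classes first.
    mean_probs = np.mean(np.stack(member_probs, axis=0), axis=0)
    return mean_probs[:, num_inliers:].sum(axis=-1)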

Downstream Clinical Trust Analysis
While we mainly focus on improving OOD detection performance, the ultimate goal for our dermatology model is to have high accuracy in predicting both inlier and outlier conditions. We go beyond traditional performance metrics and introduce a “penalty” matrix that jointly evaluates inlier and outlier predictions to approximate the downstream impact on model trust. For a fixed confidence threshold, we count the following types of mistakes: (i) incorrect inlier predictions (i.e., mistaking inlier condition A for inlier condition B); (ii) incorrect abstention on inliers (i.e., abstaining from making a prediction for an inlier); and (iii) incorrect prediction of an outlier as one of the inlier classes.

To account for the asymmetrical consequences of the different types of mistakes, penalties can be 0, 0.5, or 1. Both incorrect inlier and outlier-as-inlier predictions can potentially erode user trust in the model and were penalized with a score of 1. Incorrect abstention of an inlier as an outlier was penalized with a score of 0.5, indicating that potential model users should seek additional guidance given the model-expressed uncertainty or abstention. For correct decisions no cost is incurred, indicated by a score of 0.

                  Action of the model
                  Predict as inlier                     Abstain
Inlier            0 (correct), or                       0.5 (incorrect abstention
                  1 (incorrect, mistakes that           on an inlier)
                  may erode trust)
Outlier           1 (incorrect, mistakes that           0 (correct)
                  may erode trust)
The penalty matrix is designed to capture the potential impact of different types of model errors.

Because real-world scenarios are more complex and contain a variety of unknown variables, the numbers used here represent simplifications to enable qualitative approximations for the downstream impact on user trust of outlier detection models, which we refer to as “cost”. We use the penalty matrix to estimate a downstream cost on the test set and compare our method against the baseline, thereby making a stronger case for its effectiveness in real-world scenarios. As shown in the plot below, our proposed solution incurs a much lower estimated cost in comparison to baseline over all possible operating points.
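
A minimal sketch of how such a cost estimate can be computed at one operating point (the variable names and the fixed-threshold rule are illustrative, not the paper’s exact evaluation code):

import numpy as np

def estimated_cost(ood_scores, inlier_preds, fine_labels, is_outlier, threshold):
    # Abstain when the OOD score exceeds the threshold; otherwise predict an inlier class.
    abstain = np.asarray(ood_scores) > threshold
    cost = 0.0
    for a, pred, label, outlier in zip(abstain, inlier_preds, fine_labels, is_outlier):
        if outlier:
            cost += 0.0 if a else 1.0              # outlier predicted as an inlier erodes trust
        elif a:
            cost += 0.5                            # incorrect abstention on an inlier
        else:
            cost += 0.0 if pred == label else 1.0  # wrong inlier class erodes trust
    return cost / len(abstain)

Sweeping the threshold corresponds to varying the outlier recall rate used in the comparison below.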

Trust analysis comparing our proposed method to the baseline (reject bucket) for a range of outlier recall rates, indicated by 𝛕. We show that our method reduces downstream estimated cost, potentially reflecting improved downstream impact.

Conclusion
In real-world deployment, medical ML models may encounter conditions that were not seen in training, and it’s important that they accurately identify when they do not know a specific condition. Detecting those OOD inputs is an important step to improving safety. We develop an HOD loss that leverages outlier data during training, and combine it with pre-trained representation learning models and a diverse ensemble to further boost performance, significantly outperforming the baseline approach on our new dermatology benchmark dataset. We believe that our approach, aligned with our AI Principles, can aid successful translation of ML algorithms into real-world scenarios. Although we have primarily focused on OOD detection for dermatology, most of our contributions are fairly generic and can be easily incorporated into OOD detection for other applications.

Acknowledgements
We would like to thank Shekoofeh Azizi, Aaron Loh, Vivek Natarajan, Basil Mustafa, Nick Pawlowski, Jan Freyberg, Yuan Liu, Zach Beaver, Nam Vo, Peggy Bui, Samantha Winter, Patricia MacWilliams, Greg S. Corrado, Umesh Telang, Yun Liu, Taylan Cemgil, Alan Karthikesalingam, Balaji Lakshminarayanan, and Jim Winkens for their contributions. We would also like to thank Tom Small for creating the post animation.

Categories
Misc

Edge Computing Fuels a Sustainable Future for Energy

Learn how edge computing is powering efficient energy operations, protecting worker health and safety, and improving power grid resiliency.

Each day, energy flows throughout our lives – from the fuel that powers cars and planes, to the gas used for stove top cooking, to the electricity that keeps the lights on in homes and businesses. Oil, gas, and electricity are mature commodity markets, but AI is transforming the processes used to produce, transport, and deliver these resources. 

Enter AI deployed at the edge: on oil rigs, within power plants, riding along utility trucks, even embedded in smart buildings. Oil and gas enterprises and utilities are using AI and edge computing to improve operational efficiency, protect worker health and safety, integrate renewable energy, increase grid resiliency, and provide more reliable and affordable sources of energy to consumers.

Figure 1. Noteworthy AI, a member of NVIDIA Inception, put smart cameras on FirstEnergy’s trucks in a pilot that showed how edge computing can monitor millions of pole-mounted assets. Image courtesy of Noteworthy AI.

As companies and countries race to decarbonize and meet net-zero emissions goals, edge AI will play a key role managing distributed energy resources such as electric vehicles, home batteries, solar panels, and wind farms to enhance power grid resiliency and accelerate the energy transition. The following examples highlight the top AI use cases across the energy industry, including:

  • Software-defined smart grids: Future smart meters will use edge computing to optimize power flow, detect grid anomalies, deliver more reliable energy at a lower cost, and unlock opportunities for new energy applications. Utilidata, a leading grid-edge software company, is developing a software-defined smart grid chip with NVIDIA that will power next-generation smart meters to increase grid resiliency, decarbonization, and consumer value. 
  • Autonomous operations: Industrial sites, such as oil rigs and power plants, require extensive monitoring for efficiency and safety because liquid, steam, or oil leakages can be catastrophic, costly, and wasteful. Global energy leaders, such as Siemens Energy, are using AI and machine learning to deliver a path to autonomous power plants. The company trains AI models using thousands of images and video streams from millions of onsite cameras and sensors to detect process anomalies. These models are deployed at the edge in power plants and use real-time inferencing to identify leaks. Rig operators are using computer vision, deep learning, and intelligent video analytics (IVA) to monitor heavy machinery, detect potential hazards, and alert workers in real-time to protect their health and safety, prevent accidents, and assign repair technicians for maintenance. 
  • Pipeline optimization: Oil and gas enterprises rely on finding the best-fit routes to transfer oil to refineries and eventually fuel stations. Edge AI can calculate the optimal flow of oil to ensure reliability of production and protect long-term pipeline health. Using IVA, these companies can inspect pipelines for defects that could lead to dangerous failures and automatically alert pipeline operators. Further downstream, NVIDIA ReOpt uses GPU-accelerated solvers for logistics and route optimization, which can efficiently route fuel to fueling stations.
  • Power grid maintenance: With proactive maintenance, utilities can accurately detect defects and reduce unplanned outages to better serve customers. FirstEnergy worked with Noteworthy AI, an NVIDIA Inception member, on a pilot project to automate utility pole inspections. Fixed camera systems powered by NVIDIA Jetson were secured to the roofs of service trucks and collected standardized, high-resolution images of utility poles, power lines, and pole-mounted assets. The images were analyzed at the edge to determine whether repairs or vegetation management were needed. Edge computing can help monitor the estimated 185 million utility poles in the United States, and reduce the tens of millions of dollars utilities spend each year to manually track and maintain poles.
  • Power grid simulation: Intelligent forecasting using GPU-accelerated grid simulations combined with historical data on energy usage and weather can inform more efficient generation, distribution, and management of energy resources to consumers. AI helps manage the bidirectional flow of power in a grid, delivering reliable energy to residents and enterprises while automating the process for consumers to sell their additional energy back to the grid.

Thanks to edge AI, the future of energy is more sustainable than ever. Explore how NVIDIA is building an ecosystem to accelerate the energy transition.

Categories
Misc

Let Me Upgrade You: GeForce NOW Adds Resolution Upscaling and More This GFN Thursday

GeForce NOW is taking cloud gaming to new heights. This GFN Thursday delivers an upgraded streaming experience as part of an update that is now available to all members. It includes new resolution upscaling options to make members’ gaming experiences sharper, plus the ability to customize streaming settings in session. The GeForce NOW app is …

The post Let Me Upgrade You: GeForce NOW Adds Resolution Upscaling and More This GFN Thursday appeared first on The Official NVIDIA Blog.

Categories
Misc

Nearly 80 Percent of Financial Firms Use AI to Improve Services, Reduce Fraud

From the largest firms trading on Wall Street to banks providing customers with fraud protection to fintechs recommending best-fit products to consumers, AI is driving innovation across the financial services industry. New research from NVIDIA found that 78 percent of financial services professionals state that their company uses accelerated computing to deliver AI-enabled applications through …

The post Nearly 80 Percent of Financial Firms Use AI to Improve Services, Reduce Fraud appeared first on The Official NVIDIA Blog.

Categories
Misc

What does this error mean and how do I get around it?

I have replicated an architecture from a research paper, and running it on a GPU gives an out-of-memory error even on Colab (the model is quite deep and huge). So naturally, I want to train it using a TPU.

The same code doesn’t cause any issue on a GPU. However, it throws this error if I train on a TPU:

InvalidArgumentError: 9 root error(s) found. (0) INVALID_ARGUMENT: {{function_node __inference_train_function_56959}} Reshape's input dynamic dimension is decomposed into multiple output dynamic dimensions, but the constraint is ambiguous and XLA can't infer the output dimension %reshape.8395 = f32[3,3,2,86,86,32]{5,4,3,2,1,0} reshape(f32[<=18,86,86,32]{3,2,1,0} %convolution.8393), metadata={op_type="BatchToSpaceND" op_name="model/conv2d_3/Conv2D/BatchToSpaceND"}.

[[{{node TPUReplicate/_compile/_11021259981217135469/_4}}]]

The colab notebook can be found here:

https://colab.research.google.com/drive/1D-laydrWwLnqAehVREhSkTPoqNXyRGS8?usp=sharing
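
One workaround often suggested for this class of XLA error is to make every shape static on the TPU, for example by dropping the ragged final batch and declaring the batch size explicitly. A rough sketch under those assumptions (the dataset, input shape, and batch size below are placeholders, not taken from the notebook):

import numpy as np
import tensorflow as tf

BATCH_SIZE = 16                  # hypothetical fixed batch size
INPUT_SHAPE = (128, 128, 3)      # hypothetical input shape

# Toy dataset standing in for the real one.
images = np.zeros((100,) + INPUT_SHAPE, dtype=np.float32)
train_ds = tf.data.Dataset.from_tensor_slices(images)

# Dropping the remainder keeps the batch dimension static, which XLA needs.
train_ds = train_ds.batch(BATCH_SIZE, drop_remainder=True)

# Declaring the batch size on the input layer also helps XLA infer static shapes.
inputs = tf.keras.Input(shape=INPUT_SHAPE, batch_size=BATCH_SIZE)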

submitted by /u/SuccMyStrangerThings
[visit reddit] [comments]

Categories
Misc

Face-trained model works but detects all faces, not just the faces it was supposed to detect

Hey all!

I have trained a TensorFlow model with faces of the people I want to detect.

While it detects the people and gives the correct labels to the faces I trained the model to detect, if I point the webcam at a face that I did not train the model with, it still gives a label of one of the people I trained the model with.

I’ve tried many things to stop this, but nothing has worked.

I can share all the code and faces I am trying to detect if needed, but is there any way to stop this?

Any advice is greatly appreciated! I’m still learning TensorFlow, and while I’m a little better than in my previous posts, I’m still learning!

Thanks!
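
One common mitigation is to treat low-confidence predictions as “unknown” by thresholding the softmax probability. A rough sketch (model, face_batch, class_names, and the threshold value are hypothetical placeholders to adapt and tune on held-out data):

import numpy as np

THRESHOLD = 0.8  # hypothetical value; tune on faces not seen in training

def label_face(model, face_batch, class_names):
    # face_batch: a preprocessed face crop with a leading batch dimension.
    probs = model.predict(face_batch)[0]
    best = int(np.argmax(probs))
    if probs[best] < THRESHOLD:
        return "unknown"
    return class_names[best]

Softmax confidence is a fairly weak signal for unseen faces, though; adding an explicit “unknown” class trained on many faces outside the target set, or a dedicated open-set/OOD method, usually works better.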

submitted by /u/Adhesive_Hooks
[visit reddit] [comments]

Categories
Misc

HELP! Persisting CUDA error with tensorflow

Hi everyone. I’m trying to make TensorFlow use the NVIDIA GTX 1060 GPU in my laptop. I created a Python environment and installed TensorFlow, Python, pip, etc. I am using Ubuntu on Windows (so wsl-ubuntu). In CMD, the nvidia-smi command shows my GPU. But with TensorFlow, I get the following error:

2022-01-26 21:45:36.677191: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-01-26 21:45:36.678074: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (DESKTOP-P8QAQC0): /proc/driver/nvidia/version does not exist
Num GPUs Available: 0

I have CUDA 11.5 and 11.6 installed, with cuDNN 8.3.2.44. I manually copied and pasted the files into the CUDA directory and ran the exe (the exe didn’t seem to install any files, though). I am not sure what else to do. Help would be really appreciated!
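
A quick check inside the WSL Python environment (not Windows CMD) can help narrow this down; it prints the CUDA/cuDNN versions the installed TensorFlow build was compiled against, which often differ from manually installed toolkit versions (a diagnostic sketch, not a fix):

import tensorflow as tf

print(tf.__version__)
build = tf.sysconfig.get_build_info()
print("built for CUDA", build.get("cuda_version"), "cuDNN", build.get("cudnn_version"))
print("visible GPUs:", tf.config.list_physical_devices("GPU"))

Running nvidia-smi inside the WSL shell, rather than in CMD, also confirms whether the driver is actually exposed to WSL.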

submitted by /u/AryInd
[visit reddit] [comments]

Categories
Misc

New on NGC: Security Reports, Latest Containers for PyTorch, TensorFlow, HPC and More

This month the NGC catalog added new containers, model resumes, container security scan reports, and more to help identify and deploy AI software faster.

The NVIDIA NGC catalog is a hub for GPU-optimized deep learning, machine learning, and HPC applications. With highly performant software containers, pretrained models, industry-specific SDKs, and Jupyter Notebooks, the content helps simplify and accelerate end-to-end workflows.

New features, software, and updates to help you streamline your workflow and build your solutions faster on NGC include:

Model resumes

The NGC catalog offers state-of-the-art pretrained models that help you build your custom models faster with just a fraction of the training data.

Now, every model comes with a resume that provides information on model architecture, training parameters, training datasets, performance, and limitations to help you make informed decisions before downloading the model. They also include instructions on how to use the model so you can focus on AI development.

View the demo video and explore models for applications like speech and computer vision in various industries including Retail, Healthcare, Smart Cities, and Manufacturing.

Container security scan reports

All the container images in the NGC catalog are scanned for CVEs, malware, crypto keys, open ports, and more.

Now, the containers come with a security scan report, which provides a security rating of the image, a breakdown of CVE severity by package, and links to detailed information on the CVEs.

The scan reports are available for the latest as well as previous versions of the images, and the entire NGC catalog is scanned every 30 days. If you’re using an older version with high- or critical-severity vulnerabilities, the scan report will flag them and suggest remedies.

View the demo video for more details and explore application containers for deep learning, machine learning, and HPC.

TAO Toolkit

The latest version of the TAO Toolkit is now available for download. The TAO Toolkit, a CLI- and Jupyter notebook-based version of TAO, brings together several new capabilities to help you speed up your model creation process.

Key highlights include:

Deep learning software

The most popular deep learning frameworks for training and inference are updated monthly. Pull the latest versions (v22.01) of the deep learning framework containers, including PyTorch and TensorFlow.

M-Star CFD

M-Star CFD is a multiphysics modeling package used to simulate fluid flow, heat transfer, species transport, chemical reactions, particle transport, and rigid-body dynamics. 

M-Star CFD contains M-Star Build (to prepare models and specify simulation parameters), M-Star Solve (to run simulations), and M-Star Post (to render and plot data).

HPC applications

Latest versions of the popular HPC applications are also available in the NGC catalog.

Visit the NGC catalog to see how GPU-optimized software can help simplify workflows and speed up solution times.

Categories
Misc

tensorflow_datasets … OverflowError? 😭

Hello. Although I have searched online, I don’t understand what’s wrong. Is my laptop not strong enough?? Is it because I am using Anaconda?? I was just trying to follow along with this tutorial: “Tensorflow – Convolutional Neural Networks: Evaluating the Model | Learn | freeCodeCamp.org” 🤷‍♀️

  • 1.) It is installed: Requirement already satisfied: colorama in c:\users\glass\anaconda3\lib\site-packages (from tqdm->tensorflow-datasets) (0.4.4)
  • 2.) Reset the kernel & tried to import: import tensorflow_datasets as tfds
  • 3.) The error:
    ~\anaconda3\lib\site-packages\tensorflow_datasets\vision_language\wit\wit.py in <module>
    23 import tensorflow_datasets.public_api as tfds
    24
    ---> 25 csv.field_size_limit(sys.maxsize)
    26
    27 _DESCRIPTION = """
    OverflowError: Python int too large to convert to C long
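
For what it’s worth, this failure comes from csv.field_size_limit(sys.maxsize) inside tensorflow_datasets, which overflows on Windows because a C long there is only 32 bits. If upgrading tensorflow_datasets doesn’t help, one stopgap sometimes used is to clamp the limit before the import (a sketch, not an official fix):

import csv
import sys

_orig_limit = csv.field_size_limit

def _safe_limit(limit=None):
    if limit is None:
        return _orig_limit()
    try:
        return _orig_limit(limit)
    except OverflowError:
        # Fall back to the largest value a 32-bit C long accepts (Windows).
        return _orig_limit(2**31 - 1)

csv.field_size_limit = _safe_limit

import tensorflow_datasets as tfds  # this particular OverflowError no longer triggers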

submitted by /u/spinach_pi
[visit reddit] [comments]

Categories
Offsites

Resolving High-Energy Impacts on Quantum Processors

Quantum processors are made of superconducting quantum bits (qubits) that — being quantum objects — are highly susceptible to even tiny amounts of environmental noise. This noise can cause errors in quantum computation that need to be addressed to continue advancing quantum computers. Our Sycamore processors are installed in specially designed cryostats, where they are sealed away from stray light and electromagnetic fields and are cooled down to very low temperatures to reduce thermal noise.

However, the world is full of high-energy radiation. In fact, there’s a tiny background of high-energy gamma rays and muons that pass through everything around us all the time. While these particles interact so weakly that they don’t cause any harm in our day-to-day lives, qubits are sensitive enough that even weak particle interactions can cause significant interference.

In “Resolving Catastrophic Error Bursts from Cosmic Rays in Large Arrays of Superconducting Qubits”, published in Nature Physics, we identify the effects of these high-energy particles when they impact the quantum processor. To detect and study individual impact events, we use new techniques in rapid, repetitive measurement to operate our processor like a particle detector. This allows us to characterize the resulting burst of errors as they spread through the chip, helping to better understand this important source of correlated errors.

The Dynamics of a High-Energy Impact
The Sycamore quantum processor is constructed with a very thin layer of superconducting aluminum on a silicon substrate, onto which a pattern is etched to define the qubits. At the center of each qubit is the Josephson junction, a superconducting component that defines the distinct energy levels of the qubit, which are used for computation. In a superconducting metal, electrons bind together into a macroscopic quantum state, which allows electrons to flow as a current with zero resistance (a supercurrent). In superconducting qubits, information is encoded in different patterns of oscillating supercurrent going back and forth through the Josephson junction.

If enough energy is added to the system, the superconducting state can be broken up to produce quasiparticles. These quasiparticles are a problem, as they can absorb energy from the oscillating supercurrent and jump across the Josephson junction, which changes the qubit state and produces errors. To prevent any energy from being absorbed by the chip and producing quasiparticles, we use extensive shielding for electric and magnetic fields, and powerful cryogenic refrigerators to keep the chip near absolute zero temperature, thus minimizing the thermal energy.

A source of energy that we can’t effectively shield against is high-energy radiation, which includes charged particles and photons that can pass straight through most materials. One source of these particles is tiny amounts of radioactive elements that can be found everywhere, e.g., in building materials, the metal that makes up our cryostats, and even in the air. Another source is cosmic rays, which are extremely energetic particles produced by supernovae and black holes. When cosmic rays impact the upper atmosphere, they create a shower of high-energy particles that can travel all the way down to the surface and through our chip. Between radioactive impurities and cosmic ray showers, we expect a high-energy particle to pass through a quantum chip every few seconds.

When a high-energy impact event occurs, energy spreads through the chip in the form of phonons. When these arrive at the superconducting qubit layer, they break up the superconducting state and produce quasiparticles, which cause the qubit errors we observe.

When one of these particles impinges on the chip, it passes straight through and deposits a small amount of its energy along its path through the substrate. Even a small amount of energy from these particles is a very large amount of energy for the qubits. Regardless of where the impact occurs, the energy quickly spreads throughout the entire chip through quantum vibrations called phonons. When these phonons hit the aluminum layer that makes up the qubits, they have more than enough energy to break the superconducting state and produce quasiparticles. So many quasiparticles are produced that the probability of the qubits interacting with one becomes very high. We see this as a sudden and significant increase in errors over the whole chip as those quasiparticles absorb energy from the qubits. Eventually, as phonons escape and the chip cools, these quasiparticles recombine back into the superconducting state, and the qubit error rates slowly return to normal.

A high-energy particle impact (at time = 0 ms) on a patch of the quantum processor, showing error rates for each qubit over time. The event starts by rapidly spreading error over the whole chip, before saturating and then slowly returning to equilibrium.

Detecting Particles with a Computer
The Sycamore processor is designed to perform quantum error correction (QEC) to improve the error rates and enable it to execute a variety of quantum algorithms. QEC provides an effective way of identifying and mitigating errors, provided they are sufficiently rare and independent. However, in the case of a high-energy particle going through the chip, all of the qubits will experience high error rates until the event cools off, producing a correlated error burst that QEC won’t be able to correct. In order to successfully perform QEC, we first have to understand what these impact events look like on the processor, which requires operating it like a particle detector.

To do so, we take advantage of recent advances in qubit state preparation and measurement to quickly prepare each qubit in their excited state, similar to flipping a classical bit from 0 to 1. We then wait for a short idle time and measure whether they are still excited. If the qubits are behaving normally, almost all of them will be. Further, the qubits that experience a decay out of their excited state won’t be correlated, meaning the qubits that have errors will be randomly distributed over the chip.

However, during the experiment we occasionally observe large error bursts, where all the qubits on the chip suddenly become more error prone all at once. This correlated error burst is a clear signature of a high-energy impact event. We also see that, while all qubits on the chip are affected by the event, the qubits with the highest error rates are all concentrated in a “hotspot” around the impact site, where slightly more energy is deposited into the qubit layer by the spreading phonons.
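
As a toy illustration of this measurement protocol (a simulator-based sketch, not the actual Sycamore control or analysis code; the grid size, repetition count, and burst threshold are arbitrary):

import cirq
import numpy as np

qubits = cirq.GridQubit.rect(4, 4)            # toy 4x4 patch of the chip

# One detection cycle: excite every qubit, idle briefly, then measure.
cycle = cirq.Circuit(
    [cirq.X(q) for q in qubits],
    [cirq.I(q) for q in qubits],              # stand-in for the short idle time
    cirq.measure(*qubits, key="m"),
)

# On hardware this cycle runs back-to-back many times; here we sample a noiseless
# simulator, so every qubit stays excited and no bursts appear.
results = cirq.Simulator().run(cycle, repetitions=1000).measurements["m"]
error_fraction = 1.0 - results.mean(axis=1)   # per-cycle fraction of decayed qubits

# A high-energy impact shows up as cycles where most qubits error at once.
burst_cycles = np.where(error_fraction > 0.5)[0]
print(len(burst_cycles), "candidate impact events")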

To detect high-energy impacts, we rapidly prepare the qubits in an excited state, wait a little time, and then check if they’ve maintained their state. An impact produces a correlated error burst, where all the qubits show a significantly elevated error rate, as shown around time = 8 seconds above.

Next Steps
Because these error bursts are severe and quickly cover the whole chip, they are a type of correlated error that QEC is unable to correct. Therefore, it’s very important to find a solution to mitigate these events in future processors that are expected to rely on QEC.

Shielding against these particles is very difficult and typically requires careful engineering and design of the cryostat and many meters of shielding, which becomes more impractical as processors grow in size. Another approach is to modify the chip, allowing it to tolerate impacts without causing widespread correlated errors. This is an approach taken in other complex superconducting devices like detectors for astronomical telescopes, where it’s not possible to use shielding. Examples of such mitigation strategies include adding additional metal layers to the chip to absorb phonons and prevent them from getting to the qubit, adding barriers in the chip to prevent phonons spreading over long distances, and adding traps for quasiparticles in the qubits themselves. By employing these techniques, future processors will be much more robust to these high-energy impact events.

As the error rates of quantum processors continue to decrease, and as we make progress in building a prototype of an error-corrected logical qubit, we’re increasingly pushed to study more exotic sources of error. While QEC is a powerful tool for correcting many kinds of errors, understanding and correcting more difficult sources of correlated errors will become increasingly important. We’re looking forward to future processor designs that can handle high energy impacts and enable the first experimental demonstrations of working quantum error correction.

Acknowledgements
This work wouldn’t have been possible without the contributions of the entire Google Quantum AI Team, especially those who worked to design, fabricate, install and calibrate the Sycamore processors used for this experiment. Special thanks to Rami Barends and Lev Ioffe, who led this project.