Categories
Misc

Training AffectNet with Cross-validation CNN (Tensorflow)

submitted by /u/blevlabs
[visit reddit] [comments]

Categories
Misc

tf.compat.v1.layers.batch_normalization vs tf.contrib.layers.batch_norm

Hi all,

I have TF1.x code that uses the tf.contrib.layers.batch_norm layer, and I’m not sure how to replace it. Is tf.compat.v1.layers.batch_normalization the right substitute? Certain arguments, such as data_format and scope, are no longer present, so I’m not sure it is the correct replacement.
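
For what it’s worth, tf.compat.v1.layers.batch_normalization does exist (it takes axis, training, and name rather than data_format and scope), but TensorFlow’s migration guidance generally points to tf.keras.layers.BatchNormalization as the long-term replacement. A minimal sketch, assuming the original call used data_format='NCHW' and a scope name:

    import tensorflow as tf

    # Assumed form of the TF1 call being replaced, based on the arguments mentioned:
    #   net = tf.contrib.layers.batch_norm(net, data_format='NCHW', scope='bn1',
    #                                      is_training=is_training)

    inputs = tf.random.normal([8, 16, 32, 32])    # NCHW: batch, channels, H, W

    # Keras replacement: data_format='NCHW' maps to axis=1 (the channel axis),
    # scope roughly maps to the layer name, and is_training becomes training=.
    bn = tf.keras.layers.BatchNormalization(axis=1, name='bn1')
    outputs = bn(inputs, training=True)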

submitted by /u/dxjustice
[visit reddit] [comments]

Categories
Misc

What is an appropriate project to help learn TensorFlow? (Python)

I’ve been meaning to learn how to write machine learning programs in Python. As keen as I am, I haven’t found an easy project to get me started.

Any suggestions?

submitted by /u/DrHooBs
[visit reddit] [comments]

Categories
Misc

BatchNormalization Layer is causing ValueError: tf.function only supports singleton tf.Variables created on the first call

I’m training a deep and wide model whose convolutional side is built from inception blocks. I needed to add some BatchNormalization layers to stop exploding gradients, and now I get a ValueError pointing to the BatchNormalization layer creating multiple variables. I can’t find anyone else with this problem, so I don’t know what is causing it. I found that if I run in eager mode, the error doesn’t come up during training, but it then prevents me from saving my model. Any ideas on what is causing this?
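
Since no code is posted, the cause below is only an assumption, but the most common trigger for that exact ValueError is a layer (and therefore new tf.Variables) being created inside a function that tf.function retraces, rather than once up front. A minimal sketch of the problematic pattern and the usual fix:

    import tensorflow as tf

    # Assumed cause: building the BatchNormalization layer inside the traced call
    # means new tf.Variables get created on every call, which tf.function rejects.
    class InceptionBlockBad(tf.keras.layers.Layer):
        def call(self, x, training=False):
            bn = tf.keras.layers.BatchNormalization()   # fresh variables each call
            return bn(x, training=training)

    # Fix: create the layer once (in __init__ or build) and only apply it in call.
    class InceptionBlockGood(tf.keras.layers.Layer):
        def __init__(self, **kwargs):
            super().__init__(**kwargs)
            self.bn = tf.keras.layers.BatchNormalization()

        def call(self, x, training=False):
            return self.bn(x, training=training)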

submitted by /u/SaveShark
[visit reddit] [comments]

Categories
Misc

Can someone suggest a good TensorFlow2 tutorial like this one?

Hi,

I’m a student trying to learn TensorFlow 2 by myself. I found this PDF document, nearly 100 pages, which teaches TensorFlow 1 really well. Can someone suggest something like this for TensorFlow 2? Please do not suggest video tutorials.

Thanks for any help in advance.

submitted by /u/Dgreenfox
[visit reddit] [comments]

Categories
Misc

Is a labels file critical for a TensorFlow model to function?

Preface – my 7-year-old daughter wants to set up a camera that tells her when birds land at her bird feeders, so I’m trying to help her (and take the opportunity to expose her to software/code and, apparently, ML)… I am in no way a software engineer or coder, so please excuse any complete ignorance…

I’m trying to implement this model (with Python 3.10). I’ve found several tutorials that seem straightforward enough, BUT when I download and unpack the project (pulling it off of TF Hub was causing issues, so I figured this would be a simpler starting point), it doesn’t have a labels file, which all the info I’ve found seems to require… I did find an Excel doc linked in the description, but it is just two columns (id and name). Do I need to load this as the labels file, and how should I do that (a dict?)? I’m assuming the TF model will output an array of IDs with the probability each is correct, and that I can then convert those IDs to a “name” however I want? Or is the labels file critical to the model functioning?
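
Generally (and this is generic classifier behavior, not something specific to that exact model), a labels file is not needed for the model to run: the model outputs an array of per-class scores, and the id/name sheet is only used afterwards to turn the winning indices into readable names. A minimal sketch under that assumption; the CSV header names and the 0-based alignment between output indices and the sheet’s ids are guesses:

    import csv

    import numpy as np

    # Load the two-column id/name sheet (exported as CSV) into a dict.
    # The header names "id" and "name" are assumptions based on the post.
    labels = {}
    with open("labels.csv", newline="") as f:
        for row in csv.DictReader(f):
            labels[int(row["id"])] = row["name"]

    # Stand-in for the model's output: most image classifiers return one score
    # per class, e.g. scores = model(image_batch)[0].numpy() for a SavedModel.
    scores = np.random.rand(len(labels))

    # Map the top-scoring class indices to names. Whether the sheet's ids line
    # up with the model's 0-based output indices is worth double-checking.
    for class_id in np.argsort(scores)[::-1][:3]:
        print(labels.get(int(class_id), "unknown"), float(scores[class_id]))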

Thanks in advance for any help.

submitted by /u/StrongAbbreviations5
[visit reddit] [comments]

Categories
Misc

Preparing Time Series Data for LSTMs

I have no formal education here, but my understanding is that RNNs take an input window and “unfold” it, basing each prediction in part on the ones before it. Say I have a batch size of 1: there shouldn’t be a relationship between the first batch and the second, correct? (If not, tell me; the rest is irrelevant.)

Does it follow from my understanding that it’s safe to

  • Have overlapping windows in my data? (so, conceptually, batches 0, 1, 2 = data[0:4], data[1:5], data[2:6]; see the sketch after this list)
  • Split into fit/val sets derived from random choices of windows? (rather than just slicing twice)
  • Shuffle data after windowing?
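
A minimal numpy sketch of the three bullets above, assuming each window is a self-contained sample for a stateless LSTM (the toy series, window length, and split fraction are illustrative only):

    import numpy as np

    rng = np.random.default_rng(0)

    data = np.arange(20, dtype=np.float32)   # toy series
    window = 4

    # Overlapping windows: data[0:4], data[1:5], data[2:6], ...
    windows = np.stack([data[i:i + window] for i in range(len(data) - window + 1)])

    # Split into fit/val sets by picking window indices at random
    # (rather than slicing the series twice).
    idx = rng.permutation(len(windows))
    split = int(0.8 * len(windows))
    train, val = windows[idx[:split]], windows[idx[split:]]

    # Shuffle after windowing: fine for a stateless LSTM, where each window is a
    # self-contained sample and order only matters *within* a window.
    rng.shuffle(train)

For what it’s worth, tf.keras.utils.timeseries_dataset_from_array (tf.keras.preprocessing.timeseries_dataset_from_array in older releases) builds the same kind of overlapping, optionally shuffled windows via its sequence_stride and shuffle arguments. One caveat: with overlapping windows, a random fit/val split means the two sets share underlying time steps, which is worth keeping in mind when reading validation metrics.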

submitted by /u/EX3000
[visit reddit] [comments]

Categories
Misc

How did the Keras syntax change over time?

Hello,

I recently stumbled upon a video tutorial with code examples. I tried to figure out what happens in the code and did some research in the Keras developer guides. I don’t have much experience with coding, but I think the syntax used in my code example is completely different from the syntax in the official Keras docs…

My question is whether the way Python code is written with Keras has changed significantly over the last few years. And if the answer is yes, does it still make sense to work with the older syntax?
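
One common source of the mismatch (an assumption here, since the tutorial isn’t linked) is that older material imports the standalone keras package while the current docs use the tf.keras namespace bundled with TensorFlow; the model-building calls themselves have stayed largely the same. A small sketch of the two styles:

    # Older tutorials often use the standalone Keras package:
    #   from keras.models import Sequential
    #   from keras.layers import Dense
    #   model = Sequential([Dense(32, activation="relu", input_dim=10), Dense(1)])

    # Current TensorFlow docs use the tf.keras namespace bundled with TensorFlow:
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.summary()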

submitted by /u/LeiseLeo
[visit reddit] [comments]

Categories
Offsites

TRILLsson: Small, Universal Speech Representations for Paralinguistic Tasks

In recent years, we have seen dramatic improvements on lexical tasks such as automatic speech recognition (ASR). However, machine systems still struggle to understand paralinguistic aspects — such as tone, emotion, whether a speaker is wearing a mask, etc. Understanding these aspects represents one of the remaining difficult problems in machine hearing. In addition, state-of-the-art results often come from ultra-large models trained on private data, making them impractical to run on mobile devices or to release publicly.

In “Universal Paralinguistic Speech Representations Using Self-Supervised Conformers”, to appear in ICASSP 2022, we introduce CAP12, the 12th layer of a 600M parameter model trained on the YT-U training dataset using self-supervision. We demonstrate that the CAP12 model outperforms nearly all previous results in our paralinguistic benchmark, sometimes by large margins, even though previous results are often task-specific. In “TRILLsson: Distilled Universal Paralinguistic Speech Representations”, we introduce the small, performant, publicly-available TRILLsson models and demonstrate how we reduced the size of the high-performing CAP12 model by 6x-100x while maintaining 90-96% of the performance. To create TRILLsson, we apply knowledge distillation on appropriately-sized audio chunks and use different architecture types to train smaller, faster networks that are small enough to run on mobile devices.
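
As a rough illustration of using the released models, loading one from TF Hub might look like the sketch below; the exact hub handle, input format, and output key are assumptions, so check the published model pages rather than relying on these literals.

    import tensorflow as tf
    import tensorflow_hub as hub

    # The hub handle and the "embedding" output key below are assumptions; check
    # the published TRILLsson model pages on TF Hub for the exact names.
    model = hub.load("https://tfhub.dev/google/trillsson2/1")

    # Assumed input: 16 kHz mono audio, float32, shape [batch, samples].
    audio = tf.random.uniform([1, 32000], minval=-1.0, maxval=1.0)
    embedding = model(audio)["embedding"]
    print(embedding.shape)   # one fixed-size vector per clip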

1M-Hour Dataset to Train Ultra-Large Self-Supervised Models
We leverage the YT-U training dataset to train the ultra-large, self-supervised CAP12 model. The YT-U dataset is a highly varied, 900k+ hour dataset that contains audio covering a wide range of topics, background conditions, and speaker acoustic properties.

Video categories by length (outer) and number (inner), demonstrating the variety in the YT-U dataset (figure from BigSSL)

We then modify a Wav2Vec 2.0 self-supervised training paradigm, which can solve tasks using raw data without labels, and combine it with ultra-large Conformer models. Because self-training doesn’t require labels, we can take full advantage of YT-U by scaling up our models to some of the largest model sizes ever trained, including 600M, 1B, and 8B parameters.

NOSS: A Benchmark for Paralinguistic Tasks
We demonstrate that an intermediate representation of one of the previous models contains a state-of-the-art representation for paralinguistic speech. We call the 600M parameter Conformer model without relative attention Conformer Applied to Paralinguistics (CAP). We exhaustively search through all intermediate representations of six ultra-large models and find that layer 12 (CAP12) outperforms previous representations by significant margins.

To measure the quality of the roughly 300 candidate paralinguistic speech representations, we evaluate on an expanded version of the NOn-Semantic Speech (NOSS) benchmark, which is a collection of well-studied paralinguistic speech tasks, such as speech emotion recognition, language identification, and speaker identification. These tasks focus on paralinguistic aspects of speech, which require evaluating speech features on the order of 1 second or longer, rather than lexical features, which require 100 ms or shorter. We then add to the benchmark a mask-wearing task introduced at Interspeech 2020, a fake speech detection task (ASVSpoof 2019), a task to detect the level of dysarthria from Project Euphonia, and an additional speech emotion recognition task (IEMOCAP). By expanding the benchmark and increasing the diversity of the tasks, we empirically demonstrate that CAP12 is even more generally useful than previous representations.

Simple linear models on time-averaged CAP12 representations even outperform complex, task-specific models on five out of eight paralinguistic tasks. This is surprising because comparable models sometimes use additional modalities (e.g., vision and speech, or text and speech) as well. Furthermore, CAP12 is exceptionally good at emotion recognition tasks. CAP12 embeddings also outperform all other embeddings on the remaining tasks, with a single exception: on the dysarthria detection task, one embedding from a supervised network does better.

Model                 Voxceleb∗   Voxforge   Speech Commands   ASVSpoof2019∗∗   Euphonia#   CREMA-D   IEMOCAP
Prev SoTA             –           95.4       97.9              5.11             45.9        74.0      67.6+
TRILL                 12.6        84.5       77.6              74.6             48.1        65.7      54.3
ASR Embedding         5.2         98.9       96.1              11.2             54.5        71.8      65.4
Wav2Vec2 layer 6††    17.9        98.5       95.0              6.7              48.2        77.4      65.8
CAP12                 51.0        99.7       97.0              2.5              51.5        88.2      75.0

Test performance on the NOSS Benchmark and extended tasks. “Prev SoTA” indicates the previous best performing state-of-the-art model, which has arbitrary complexity, while all other rows are linear models on time-averaged input.
∗ Filtered according to YouTube’s privacy guidelines.
∗∗ Uses equal error rate [20].
# The only non-public dataset; we exclude it from aggregate scores.
Audio and visual features were used in previous state-of-the-art models.
+ The previous state-of-the-art model performed cross-validation; for our evaluation, we hold out two specific speakers as a test set.
†† Wav2Vec 2.0 model from HuggingFace; best overall layer was layer 6.
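
As a concrete illustration of the evaluation setup behind the non-“Prev SoTA” rows above (a simple linear model on time-averaged representations), here is a small sketch with stand-in data; the embedding dimensionality, number of classes, and use of scikit-learn are assumptions rather than the paper’s exact configuration.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Stand-ins for per-utterance CAP12 embeddings: [num_frames, dim] per clip,
    # plus one class label per clip (shapes and labels here are illustrative only).
    clips = [rng.normal(size=(rng.integers(50, 150), 1024)) for _ in range(200)]
    labels = rng.integers(0, 4, size=len(clips))

    # Time-average each clip to a single fixed-size vector, then fit a linear probe.
    features = np.stack([clip.mean(axis=0) for clip in clips])
    clf = LogisticRegression(max_iter=1000).fit(features[:150], labels[:150])
    print("held-out accuracy:", clf.score(features[150:], labels[150:]))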

TRILLsson: Small, High Quality, Publicly Available Models
Similar to FRILL, our next step was to make an on-device, publicly available version of CAP12. This involved using knowledge distillation to train smaller, faster, mobile-friendly architectures. We experimented with EfficientNet, Audio Spectrogram Transformer (AST), and ResNet. These model types are very different, and cover both fixed-length and arbitrary-length inputs. EfficientNet comes from a neural architecture search over vision models to find simultaneously performant and efficient model structures. AST models are transformers adapted to audio inputs. ResNet is a standard architecture that has shown good performance across many different models.

We trained models that performed on average 90-96% as well as CAP12, despite being 1%-15% the size and trained using only 6% the data. Interestingly, we found that different architecture types performed better at different sizes. ResNet models performed best at the low end, EfficientNet in the middle, and AST models at the larger end.

Aggregate embedding performance vs. model size for various student model architectures and sizes. We demonstrate that ResNet architectures perform best for small sizes, EfficientNetV2 performs best in the midsize model range, up to the largest model size tested, after which the larger AST models are best.

We perform knowledge distillation with the goal of matching a student, with a fixed-size input, to the output of a teacher, with a variable-size input, for which there are two methods of generating student targets: global matching and local matching. Global matching produces distillation targets by generating CAP12 embeddings for an entire audio clip, and then requires that a student match the target from just a small segment of audio (e.g., 2 seconds). Local matching requires that the student network match the average CAP12 embedding just over the smaller portion of the audio that the student sees. In our work, we focused on local matching.

Two types of generating distillation targets for sequences. Left: Global matching uses the average CAP12 embedding over the whole clip for the target for each local chunk. Right: Local matching uses CAP12 embeddings averaged just over local clips as the distillation target.
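
A rough sketch of the local-matching objective described above: the student sees a fixed-length chunk and regresses the teacher’s (CAP12) embedding averaged over that same chunk. The placeholder student architecture, the 1024-dimensional embedding size, and the MSE loss are assumptions for illustration, not the paper’s exact setup.

    import tensorflow as tf

    # Placeholder student that maps a fixed-length audio chunk to one embedding;
    # the real TRILLsson students are EfficientNet / AST / ResNet variants, and
    # the 1024-dim teacher embedding size here is assumed.
    student = tf.keras.Sequential([
        tf.keras.layers.Reshape((-1, 1)),
        tf.keras.layers.Conv1D(64, 80, strides=4, activation="relu"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(1024),
    ])
    optimizer = tf.keras.optimizers.Adam(1e-4)

    @tf.function
    def local_matching_step(chunk, teacher_frames):
        # Local matching: the target is the teacher's (CAP12) embedding averaged
        # over just this chunk; teacher_frames has shape [batch, frames, dim].
        target = tf.reduce_mean(teacher_frames, axis=1)
        with tf.GradientTape() as tape:
            pred = student(chunk, training=True)             # chunk: [batch, samples]
            loss = tf.reduce_mean(tf.square(pred - target))  # MSE distillation loss
        grads = tape.gradient(loss, student.trainable_variables)
        optimizer.apply_gradients(zip(grads, student.trainable_variables))
        return loss

    # Example call with random stand-in data (2-second chunks at 16 kHz).
    print(local_matching_step(tf.random.normal([4, 32000]),
                              tf.random.normal([4, 10, 1024])))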

Observation of Bimodality and Future Directions
Paralinguistic information shows an unexpected bimodal distribution. For the CAP model that operates on 500 ms input segments, and two of the full-input Conformer models, intermediate representations gradually increase in paralinguistic information, then decrease, then increase again, and finally lose this information towards the output layer. Surprisingly, this pattern is also seen when exploring the intermediate representations of networks trained on retinal images.

500 ms inputs to CAP show a relatively pronounced bimodal distribution of paralinguistic information across layers.
Two of the conformer models with full inputs show a bimodal distribution of paralinguistic information across layers.

We hope that smaller, faster models for paralinguistic speech unlock new applications in speech recognition, text-to-speech generation, and understanding user intent. We also expect that smaller models will be more easily interpretable, which will allow researchers to understand what aspects of speech are important for paralinguistics. Finally, we hope that our open-sourced speech representations are used by the community to improve paralinguistic speech tasks and user understanding in private or small datasets.

Acknowledgements
I’d like to thank my co-authors Aren Jansen, Wei Han, Daniel Park, Yu Zhang, and Subhashini Venugopalan for their hard work and creativity on this project. I’d also like to thank the members of the large collaboration for the BigSSL work, without which these projects would not be possible. The team includes James Qin, Anmol Gulati, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, and Yonghui Wu.

Categories
Misc

Annoying behavior of developers not keeping backward compatibility

Hi, I’m a new student trying to learn TensorFlow. I have already found some really good books and tutorials that help me learn fast. But when I started trying out the examples, I soon realized that all the good tutorials I have cover TensorFlow 1.15, and amazingly that code will not work with TensorFlow 2.0+.

I find this really cool and amazing behavior from the developers, who have zero concern for backward compatibility. I can go to Google and fix the old code line by line, replacing each line with its TensorFlow 2 equivalent.
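
For reference, TensorFlow does ship a documented stopgap for running most TF1-style tutorial code on a TF2 install, the tf.compat.v1 module; a minimal sketch:

    # Documented stopgap for running most unmodified TF1-style tutorial code on a
    # TF2 install, before migrating it line by line:
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()

    # Old-style graph/session code then works largely as the TF1 tutorials show.
    x = tf.placeholder(tf.float32, shape=[None, 3])
    y = tf.reduce_sum(x, axis=1)
    with tf.Session() as sess:
        print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))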

But since I’m a beginner, this is a nightmare for me. Can anyone explain to me in simple terms why these douchebags do not maintain backward compatibility when these airheads update these libraries?

I really want to find the developers who did this and dip their face in boiling oil.

P.S.: when is TensorFlow 3 coming out? I’m now trying to learn TensorFlow 2; I assume that TensorFlow 3 will be completely different from TensorFlow 2 and we’ll have to re-learn everything from scratch for that too.

submitted by /u/Dgreenfox
[visit reddit] [comments]