Hello, I want to build a CNN with TensorFlow. I load the data with image_dataset_from_directory and pass my labels, a list of numbers from 0 to 3, so I expect TensorFlow to tell me that it found N images and 4 classes, but it tells me that it found 321 classes.
The labels list looks like: [0, 3, 1, 1, … , 2, 0, 0]
I don't know whether I should change the list's format or distribution, or add another parameter to image_dataset_from_directory. If someone can help me, please 🙁
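A minimal sketch of the call pattern being described (the directory path, label values, image size, and batch size are placeholders, not the original values):
import tensorflow as tf

# labels must have one entry per image file found in the directory
# (files are matched in alphanumeric order of their file paths).
labels = [0, 3, 1, 1, 2, 0, 0]  # placeholder values

dataset = tf.keras.utils.image_dataset_from_directory(
    "data/images",      # placeholder directory
    labels=labels,      # explicit integer labels instead of the default 'inferred'
    label_mode="int",
    image_size=(224, 224),
    batch_size=32,
)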
So I started following the TensorFlow pip install guide; however, when it comes to actually checking whether it can see the GPU, it always comes back with this:
(tf) C:\Users\Shain>python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2022-06-01 21:27:36.605396: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-06-01 21:27:36.605523: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.base has been moved to tensorflow.python.trackable.base. The old module will be deleted in version 2.11.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.checkpoint_management has been moved to tensorflow.python.checkpoint.checkpoint_management. The old module will be deleted in version 2.9.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.resource has been moved to tensorflow.python.trackable.resource. The old module will be deleted in version 2.11.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.util has been moved to tensorflow.python.checkpoint.checkpoint. The old module will be deleted in version 2.11.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.base_delegate has been moved to tensorflow.python.trackable.base_delegate. The old module will be deleted in version 2.11.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.graph_view has been moved to tensorflow.python.checkpoint.graph_view. The old module will be deleted in version 2.11.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.python_state has been moved to tensorflow.python.trackable.python_state. The old module will be deleted in version 2.11.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.saving.functional_saver has been moved to tensorflow.python.checkpoint.functional_saver. The old module will be deleted in version 2.11.
WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.saving.checkpoint_options has been moved to tensorflow.python.checkpoint.checkpoint_options. The old module will be deleted in version 2.11.
2022-06-01 21:27:38.504483: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-06-01 21:27:38.504803: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2022-06-01 21:27:38.507542: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2022-06-01 21:27:38.508572: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2022-06-01 21:27:38.509170: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2022-06-01 21:27:38.509519: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusolver64_11.dll'; dlerror: cusolver64_11.dll not found
2022-06-01 21:27:38.509902: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2022-06-01 21:27:38.510368: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found
2022-06-01 21:27:38.510665: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1867] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform. Skipping registering GPU devices...
2022-06-01 21:27:38.511499: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
tf.Tensor(1560.4835, shape=(), dtype=float32)
I've tried setting an environment variable in the Anaconda environment via
conda env config vars set EnvironmentBin=%CONDA_PREFIX%\Library\bin
to see if it would help point TensorFlow to the DLLs, but that didn't work either.
I'm not entirely sure what I am meant to do from here.
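For illustration, one hedged workaround sketch is to register the conda environment's Library\bin directory from inside Python before importing TensorFlow; the DLL location here is an assumption and this may or may not resolve the issue:
import os

# Assumed location of the CUDA/cuDNN DLLs inside the conda environment;
# adjust if they are installed elsewhere.
cuda_dll_dir = os.path.join(os.environ["CONDA_PREFIX"], "Library", "bin")

# On Windows, Python 3.8+ only searches DLL directories that are registered
# explicitly.
os.add_dll_directory(cuda_dll_dir)

import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))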
Using Numba and PyOptiX, NVIDIA enables you to configure the ray-tracing pipeline and write kernels in Python that are compatible with the OptiX pipeline.
Ray tracing is a rendering algorithm that can generate photorealistic images by simulating how light transmits and interacts with different materials. Today, it is widely adopted to bring imagery to life in game development, film-making, and physics simulations.
However, the ray-tracing algorithm is computationally intensive and requires hardware acceleration on the GPU to achieve real-time performance.
To leverage this hardware power for ray tracing, various toolchains and languages have been created, such as OpenGL and its shading languages.
Often, the build process of these software toolchains poses significant challenges to Python developers. To alleviate the difficulty and provide a familiar environment for writing ray tracing kernels, NVIDIA developed the Numba extension for PyOptiX. This extension enables graphics researchers and application developers to reduce the time from idea to implementation and shorten the development cycle on each iteration.
In this post, I provide an overview of the NVIDIA ray-tracing engine PyOptiX and explain how the Python JIT compiler, Numba, accelerates Python code. Finally, with a complete ray tracing example, I walk you through the steps of using the Numba extension for PyOptiX and write an accelerated ray tracing kernel in Python.
What are NVIDIA OptiX and PyOptiX?
NVIDIA RTX technology made ray tracing the default rendering algorithm in many modern rendering pipelines. As the demand for unique looks is unlimited, there’s a need for flexibility in customizing the rendering pipeline.
The NVIDIA RTX ray-tracing pipeline is customizable. By configuring how light transmits, reflects, and refracts on various materials, you can achieve distinctive looks on objects, such as shiny, glossy, or semi-transparent. By configuring how light rays are generated, you change the field of view and perspective of the look accordingly.
To address this need, NVIDIA developed NVIDIA OptiX, a ray-tracing engine that enables you to configure a hardware-accelerated ray-tracing pipeline. PyOptiX is the NVIDIA OptiX Python interface. It gives Python developers the same capabilities as NVIDIA OptiX developers who write in C++.
Kernel functions
To customize image facets, you use kernel functions, also referred to as kernel methods or kernels. You can think of kernels as a group of algorithms that transform data inputs to the required form. Native NVIDIA OptiX developers can write kernels with CUDA. With a Numba extension, you can write ray-tracing kernels in Python.
Higher performance with Numba and Numba.cuda
Ray tracing is a compute-intensive algorithm. While it is theoretically possible to run the ray-tracing kernel with the standard CPython interpreter, it would take days to render a regular ray-traced image. Moreover, NVIDIA OptiX requires the kernel to be runnable on a GPU device so that it integrates with the rest of the rendering pipeline.
Using Numba, a just-in-time Python function compiler, you can execute and accelerate your Python ray-tracing kernels with GPU hardware. Numba parses the Python function code and converts it to efficient machine code. On a high level, this process is divided into seven steps:
The function's bytecode is generated with the bytecode compiler.
The bytecode is analyzed. The control flow graph (CFG) and the data flow graph (DFG) are generated.
With bytecode, CFG, and DFG, the Numba intermediate representation (IR) is generated.
Based on the type of function inputs, the type is inferred for each IR variable.
The Numba IR is rewritten and gets Python-specific optimization.
The Numba IR is lowered to the LLVM IR, and more general optimization is performed.
The LLVM IR is consumed by the LLVM backend and optimized GPU machine code is generated.
Figure 1. A high-level view of Numba’s compilation pipeline
Because Numba can convert Python functions into native code, Python users writing a Numba CUDA kernel have the same power as if they were writing the kernel in native CUDA. The following code shows a dot product that is executable on the device. For more information, see Numba Examples.
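A minimal sketch of such a device-executable dot product with Numba CUDA, with the kernel name, array sizes, and launch configuration chosen purely for illustration:
import numpy as np
from numba import cuda

@cuda.jit
def dot_kernel(a, b, result):
    # Each thread multiplies one pair of elements and atomically adds
    # the product into a one-element accumulator on the device.
    i = cuda.grid(1)
    if i < a.size:
        cuda.atomic.add(result, 0, a[i] * b[i])

a = np.random.rand(4096).astype(np.float32)
b = np.random.rand(4096).astype(np.float32)
result = np.zeros(1, dtype=np.float32)

threads_per_block = 256
blocks = (a.size + threads_per_block - 1) // threads_per_block
dot_kernel[blocks, threads_per_block](a, b, result)

print(result[0], np.dot(a, b))  # should agree up to float32 rounding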
Introducing the Numba extension for PyOptiX
To customize specific stages of the ray-tracing pipeline, you must translate the Numba kernel into something that can be understood by the NVIDIA OptiX engine. NVIDIA developed the Numba Extension for PyOptiX to achieve this goal.
The extension includes custom type definition and intrinsic function lowerings. NVIDIA OptiX comes with a set of internal types:
OptixTraversableHandle
OptixVisibilityMask
SbtDataPointer
Functions such as optix.Trace
For Numba to perform type inference on these new types and methods, you must register the types and provide implementations of the methods before compiling the user kernel. Currently, NVIDIA is expanding the set of supported types and intrinsics and adding more examples.
By exposing these types and intrinsics to Numba, you can now write kernels that not only target the GPU but specifically target the GPU's ray-tracing pipeline. In combination with Numba CUDA, you can write ray-tracing kernels as powerful as native CUDA ray-tracing kernels for NVIDIA OptiX.
In the next section, I introduce a Hello World example with the PyOptiX Numba extension. Before that, let me quickly go over some ray-tracing algorithm basics.
Fundamentals of ray tracing
Imagine that you use a camera to capture an image. The light source in the scene emits light rays, which travel in a straight line. When a light ray hits an object, it is reflected from the surface and eventually reaches the camera sensor.
At a high level, a ray-tracing algorithm walks through all rays that reach the image plane to identify where in the scene each ray intersects, and with what. When the intersection point is found, you can apply various shading techniques to determine the color of that point. Some rays don't hit anything in the scene; those rays are considered to have "missed."
Steps for ray tracing a triangle with the Numba extension for PyOptiX
In the following example, I show how the Numba extension for PyOptiX can help you write custom kernels to define the ray behavior at ray generation, ray hit, and ray miss.
Scene setup
I modeled the view you see as an image plane, which usually sits slightly in front of the camera. The camera is modeled as a point and a set of mutually orthogonal vectors in the 3D space.
Figure 2. Scene setup for the triangle rendering example
Camera
The camera is modeled as a point in three dimensions. The three vectors, U, V, and W, are used to show the sideways, upwards, and frontal directions of the camera. This uniquely determines the position and orientation of the camera.
To simplify the computation for ray generation later, the U and V vectors are not unit vectors. Instead, their lengths proportionally match the image’s aspect ratio. Lastly, the length of the W vector is the distance between the camera and the image plane.
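As a rough host-side sketch of how such a basis can be built (the function name, inputs, and field-of-view convention are assumptions, not the article's setup code):
import numpy as np

def compute_camera_basis(eye, look_at, up, fov_y_degrees, aspect_ratio):
    # W points from the camera toward the image plane; its length is the
    # distance between the camera and the plane.
    W = look_at - eye
    w_len = np.linalg.norm(W)

    # U points sideways and V points upwards relative to the camera.
    U = np.cross(W, up)
    U = U / np.linalg.norm(U)
    V = np.cross(U, W)
    V = V / np.linalg.norm(V)

    # Scale V to span half the vertical field of view at the image plane,
    # and scale U by the aspect ratio so the plane matches the image shape.
    v_len = w_len * np.tan(0.5 * np.radians(fov_y_degrees))
    V = V * v_len
    U = U * v_len * aspect_ratio
    return U, V, W

# Example usage with placeholder values
U, V, W = compute_camera_basis(
    eye=np.array([0.0, 0.0, 2.0]),
    look_at=np.array([0.0, 0.0, 0.0]),
    up=np.array([0.0, 1.0, 0.0]),
    fov_y_degrees=45.0,
    aspect_ratio=16.0 / 9.0,
)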
Ray generation kernel
The ray generation kernel is the centerpiece of the algorithm. Ray origins and directions are generated here and then passed down to the trace call. Each ray's intensity is retrieved from the other kernels and written out as image data. In this section, I discuss the methods used to generate rays in this kernel.
With the camera and the image plane, you can generate the rays. Adopt a coordinate-system convention where the center of the image is the origin. The sign of a pixel's coordinate indicates its position relative to the origin, and its magnitude indicates the distance. With this property, multiply the camera's U and V vectors by the corresponding components of the pixel position and add the results together. The result is a vector that points from the image center to the pixel.
Finally, add this vector to the W or front vector, and this generates a ray that originates at the camera position and goes through the pixel on the image plane. Figure 3 shows the decomposition of a ray that originates from the camera and goes through point (x, y) in the image plane.
Figure 3. Decomposition of a ray that goes through pixel (x, y)
In code, the pixel index and the dimensions of the image plane can be retrieved using two OptiX intrinsic functions, optix.GetLaunchIndex and optix.GetLaunchDimensions. Next, the pixel index is normalized to [-1.0, 1.0]. The following code example shows this logic in the Numba CUDA kernel.
@cuda.jit(device=True, fast_math=True)
def computeRay(idx, dim):
    U = params.cam_u
    V = params.cam_v
    W = params.cam_w
    # Normalizing coordinates to [-1.0, 1.0]
    d = float32(2.0) * make_float2(
        float32(idx.x) / float32(dim.x), float32(idx.y) / float32(dim.y)
    ) - float32(1.0)
    origin = params.cam_eye
    direction = normalize(d.x * U + d.y * V + W)
    return origin, direction


def __raygen__rg():
    # Look up your location within the launch grid
    idx = optix.GetLaunchIndex()
    dim = optix.GetLaunchDimensions()

    # Map your launch idx to a screen location and create a ray from the camera
    # location through the screen
    ray_origin, ray_direction = computeRay(make_uint3(idx.x, idx.y, 0), dim)
This code example shows the helper function computeRay, which computes the origin and direction vector of the ray.
Next, pass the generated ray into the intrinsic function optix.Trace, which kicks off the ray-tracing algorithm. The underlying OptiX engine traverses the primitives, computes the intersection point in the scene, and finally returns the intensity of the ray. The following code example shows the call to optix.Trace.
# In __raygen__rg
payload_pack = optix.Trace(
    params.handle,
    ray_origin,
    ray_direction,
    float32(0.0),               # Min intersection distance
    float32(1e16),              # Max intersection distance
    float32(0.0),               # rayTime -- used for motion blur
    OptixVisibilityMask(255),   # Specify always visible
    uint32(OPTIX_RAY_FLAG_NONE),
    uint32(0),                  # SBT offset -- Refer to OptiX Manual for SBT
    uint32(1),                  # SBT stride -- Refer to OptiX Manual for SBT
    uint32(0),                  # missSBTIndex -- Refer to OptiX Manual for SBT
)
Ray hit kernel
In the ray hit kernel, you write code to determine the intensity of each channel of the light ray. If the triangle vertices are set up using the NVIDIA OptiX internal data structure, then you can call the NVIDIA OptiX intrinsic optix.GetTriangleBarycentrics to retrieve the barycentric coordinates of the hit point.
To make the color more interesting, use these coordinates as part of the color for that pixel; the blue channel of the color is set to 1.0. The intensity of the ray is then passed back to the ray generation kernel for further post-processing and written to the image.
NVIDIA OptiX shares data between the kernels through payload registers. Use the setPayload function to set the values of the payload registers to the ray intensities. By default, payload registers are integer types. Use the CUDA intrinsic float_as_int to interpret the float value as an integer, without changing the bits.
@cuda.jit(device=True, fast_math=True)
def setPayload(p):
    optix.SetPayload_0(float_as_int(p.x))
    optix.SetPayload_1(float_as_int(p.y))
    optix.SetPayload_2(float_as_int(p.z))


def __closesthit__ch():
    # When a built-in triangle intersection is used, a number of fundamental
    # attributes are provided by the NVIDIA OptiX API, including barycentric coordinates.
    barycentrics = optix.GetTriangleBarycentrics()
    setPayload(make_float3(barycentrics, float32(1.0)))
Ray miss kernel
The ray miss kernel sets the color of the rays that didn’t hit any objects in the scene. Here you set them to the background color.
bg_color is some data specified in the shader-binding table during the setup of the render pipeline. For now, just be aware that it’s a set of hard-coded floats representing the background color of the scene.
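A minimal sketch of what such a miss kernel can look like; here bg_color is hard-coded for illustration rather than read from the shader-binding table:
def __miss__ms():
    # In the full example, bg_color comes from the shader-binding table;
    # a fixed color is used here purely for illustration.
    bg_color = make_float3(float32(0.0), float32(0.2), float32(0.6))
    setPayload(bg_color)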
You have now defined the color for all rays. The color is retrieved in the ray generation kernel as a payload_pack data structure returned from the optix.Trace call. Remember that in the ray hit and ray miss kernels, you had to reinterpret the bits of the floating-point numbers as integers? Reverse this step with the int_as_float function.
You could write these values directly to the image and the result would still look fine. Instead, take the extra step of post-processing the raw pixel values, which becomes important for producing good images in more complicated scenes.
The values that you have retrieved are simply the raw intensities of the ray, which scale linearly with the energy the ray carries. While this fits the physical model of the world, the human eye does not respond to light stimuli linearly; the mapping from input to response approximately follows a power function.
To account for this, apply gamma correction to the intensities. In addition, most users viewing the result are looking at a monitor that uses the sRGB color space. Assume that the values from the ray-tracing world are in the CIE-XYZ color space and apply a color-space conversion. Finally, quantize the color values into 8-bit unsigned integers.
The following code example shows the helper functions for post-processing color intensities and writing them to the pixel array in the ray-generation kernel.
@cuda.jit(device=True, fast_math=True)
def toSRGB(c):
    # Use float32 for constants
    invGamma = float32(1.0) / float32(2.4)
    powed = make_float3(
        fast_powf(c.x, invGamma),
        fast_powf(c.y, invGamma),
        fast_powf(c.z, invGamma),
    )
    return make_float3(
        float32(12.92) * c.x
        if c.x < float32(0.0031308)
        else float32(1.055) * powed.x - float32(0.055),
        float32(12.92) * c.y
        if c.y < float32(0.0031308)
        else float32(1.055) * powed.y - float32(0.055),
        float32(12.92) * c.z
        if c.z < float32(0.0031308)
        else float32(1.055) * powed.z - float32(0.055),
    )


@cuda.jit(device=True, fast_math=True)
def make_color(c):
    srgb = toSRGB(clamp(c, float32(0.0), float32(1.0)))
    return make_uchar4(
        quantizeUnsigned8Bits(srgb.x),
        quantizeUnsigned8Bits(srgb.y),
        quantizeUnsigned8Bits(srgb.z),
        uint8(255),
    )
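The quantizeUnsigned8Bits helper is referenced above but not defined in this snippet; a minimal sketch of such a helper (an assumption rather than the original implementation) is:
@cuda.jit(device=True, fast_math=True)
def quantizeUnsigned8Bits(x):
    # Map a float already clamped to [0.0, 1.0] to an unsigned 8-bit value in [0, 255].
    return uint8(x * float32(255.0) + float32(0.5))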
# In __raygen__rg
result = make_float3(
    int_as_float(payload_pack.p0),
    int_as_float(payload_pack.p1),
    int_as_float(payload_pack.p2),
)

# Record results in your output raster
params.image[idx.y * params.image_width + idx.x] = make_color(result)
Figure 4 shows the final rendered result.
Figure 4. Final result
Summary
PyOptiX enables you to set up a ray-tracing rendering pipeline with Python. Numba converts Python functions into device code compatible with the rendering pipeline. NVIDIA combined these two libraries into the Numba extension for PyOptiX, enabling you to write accelerated ray-tracing applications in a full Python environment.
Combined with the rich and active ecosystem that Python already enjoys, you can now build hardware-accelerated ray-tracing applications entirely in Python. Download the demo to experiment with the Numba extension for PyOptiX yourself!
What’s next?
The PyOptiX Numba extension is still under development, and NVIDIA is working to add more examples and make the typings for NVIDIA OptiX primitives more flexible and Pythonic.
What will you create? A game? A film? Or the VR application that you dreamt about? Share it in the comments!
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
import tensorflow_recommenders as tfrs
from typing import Dict, Text

ratings = tfds.load("movielens/100k-ratings", split="train")
movies = tfds.load("movielens/100k-movies", split="train")

# Select the basic features.
ratings = ratings.map(lambda x: {
    "movie_title": x["movie_title"],
    "user_id": x["user_id"],
    "user_rating": x["user_rating"],
})
movies = movies.map(lambda x: x["movie_title"])

# Randomly shuffle data and split between train and test.
tf.random.set_seed(42)
shuffled = ratings.shuffle(100_000, seed=42, reshuffle_each_iteration=False)

train = shuffled.take(80_000)
test = shuffled.skip(80_000).take(20_000)

movie_titles = movies.batch(1_000)
user_ids = ratings.batch(1_000_000).map(lambda x: x["user_id"])

unique_movie_titles = np.unique(np.concatenate(list(movie_titles)))
unique_user_ids = np.unique(np.concatenate(list(user_ids)))
class MovielensModel(tfrs.models.Model):

    def __init__(self, rating_weight: float, retrieval_weight: float) -> None:
        # We take the loss weights in the constructor: this allows us to instantiate
        # several model objects with different loss weights.
        super().__init__()

        embedding_dimension = 32

        # User and movie models.
        self.movie_model: tf.keras.layers.Layer = tf.keras.Sequential([
            tf.keras.layers.StringLookup(
                vocabulary=unique_movie_titles, mask_token=None),
            tf.keras.layers.Embedding(len(unique_movie_titles) + 1, embedding_dimension)
        ])
        self.user_model: tf.keras.layers.Layer = tf.keras.Sequential([
            tf.keras.layers.StringLookup(
                vocabulary=unique_user_ids, mask_token=None),
            tf.keras.layers.Embedding(len(unique_user_ids) + 1, embedding_dimension)
        ])

        # A small model to take in user and movie embeddings and predict ratings.
        # We can make this as complicated as we want as long as we output a scalar
        # as our prediction.
        self.rating_model = tf.keras.Sequential([
            tf.keras.layers.Dense(256, activation="relu"),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dense(1),
        ])

        # The tasks.
        self.rating_task: tf.keras.layers.Layer = tfrs.tasks.Ranking(
            loss=tf.keras.losses.MeanSquaredError(),
            metrics=[tf.keras.metrics.RootMeanSquaredError()],
        )
        self.retrieval_task: tf.keras.layers.Layer = tfrs.tasks.Retrieval(
            metrics=tfrs.metrics.FactorizedTopK(
                candidates=movies.batch(128).map(self.movie_model)
            )
        )

        # The loss weights.
        self.rating_weight = rating_weight
        self.retrieval_weight = retrieval_weight

    def call(self, features: Dict[Text, tf.Tensor]) -> tf.Tensor:
        # We pick out the user features and pass them into the user model.
        user_embeddings = self.user_model(features["user_id"])
        # And pick out the movie features and pass them into the movie model.
        movie_embeddings = self.movie_model(features["movie_title"])

        return (
            user_embeddings,
            movie_embeddings,
            # We apply the multi-layered rating model to a concatenation of
            # user and movie embeddings.
            self.rating_model(
                tf.concat([user_embeddings, movie_embeddings], axis=1)
            ),
        )

    def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
        ratings = features.pop("user_rating")

        user_embeddings, movie_embeddings, rating_predictions = self(features)

        # We compute the loss for each task.
        rating_loss = self.rating_task(
            labels=ratings,
            predictions=rating_predictions,
        )
        retrieval_loss = self.retrieval_task(user_embeddings, movie_embeddings)

        # And combine them using the loss weights.
        return (self.rating_weight * rating_loss
                + self.retrieval_weight * retrieval_loss)


model = MovielensModel(rating_weight=1.0, retrieval_weight=0.0)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
Your TensorFlow version is newer than 2.4.0 and so graph support has been removed in eager mode and some static graphs may not be supported. See PR #1483 for discussion.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-3ea83d94ac4d> in <module>()
      7 import shap
      8 background=train_np[np.random.choice(train_np.shape[0],100,replace=False)]
----> 9 explainer=shap.DeepExplainer(model,background)
     10 #explainer=shap.DeepExplainer((model.layers[0].input,model.layers[-1].output),background)

2 frames
/usr/local/lib/python3.7/dist-packages/shap/explainers/tf_utils.py in _get_model_output(model)
     83     isinstance(model, tf.keras.Model):
     84         if len(model.layers[-1]._inbound_nodes) == 0:
---> 85             if len(model.outputs) > 1:
     86                 warnings.warn("Only one model output supported.")
     87             return model.outputs[0]

TypeError: object of type 'NoneType' has no len()
Monitor DPUs, validate RoCE deployments, gain network insights through flow-based telemetry analysis, and centrally view network events with NetQ 4.2.0.
NVIDIA NetQ is a highly scalable, modern networking operations tool providing actionable visibility for the NVIDIA Spectrum Ethernet platform. It combines advanced telemetry with a user interface, making it easier to troubleshoot and automate network workflows while reducing maintenance and downtime.
We have recently released NetQ 4.2.0. Highlights include a unified events dashboard, enhanced flow telemetry analysis, new RoCE validation, and new DPU monitoring.
Events dashboard
With NetQ 4.2, we have simplified the way network events are communicated through the interface. Events vary in severity: some are network alarms that may require further investigation, while others are informational notices that may not require intervention. Before this release, NetQ displayed alarms and informational events as two separate cards. The NetQ 4.2 release merges the two cards into a single card that, when expanded, displays a dashboard to help you quickly visualize all network events.
Figure 1. NetQ events dashboard
The dashboard presents a timeline of events alongside the switches that are causing the most events. You can filter events by type, including interface, network services, system, and threshold-crossing events.
Acknowledging events helps you focus on active events that need your attention. From the dashboard, you can also create rules to suppress events, so that known issues or false alarms are not displayed in the same way as errors.
Enhanced flow telemetry analysis
NetQ 4.1.0 introduced fabric-wide network latency and buffer occupancy analysis for Cumulus Linux 5.x data center fabrics. Now, NetQ 4.2 supports partial-path flow telemetry analysis in mixed fabrics—those that use Cumulus Linux 5.x switches in combination with other switches (including non-Cumulus Linux 5.x and third-party switches). Cumulus Linux 5.x devices in the path display flow statistics, such as latency and buffer occupancy. Unsupported devices are represented in the flow analysis as a black bar with a red X, and the device does not display flow statistics.
Figure 2. NetQ flow telemetry analysis results
In addition, NetQ 4.2 flow telemetry analysis shows contextual ‘What Just Happened’ (WJH) events and drops for the flow under analysis. Switches with WJH events are represented in the flow analysis graph as a red, striped bar. Hovering over the device with the red bar presents a WJH events summary.
Figure 3. NetQ flow telemetry analysis with WJH data
New RoCE validation
With RDMA over Converged Ethernet (RoCE), you can write to compute or storage elements using remote direct memory access (RDMA) over an Ethernet network instead of using host CPUs. NetQ 4.0.0 introduced RoCE configuration and counters, including the ability to set up various RoCE threshold-crossing alerts (TCAs).
With NetQ 4.2.0, RoCE validation checks:
Lossy- or lossless-mode configuration consistency across switches
Consistency of DSCP, service pool, port group, and traffic class settings
Consistency of ECN threshold settings
Consistency of PFC configuration for lossless mode
Consistency of Enhanced Transmission Selection settings
You can schedule RoCE validation to run periodically or on-demand.
New DPU monitoring
NVIDIA BlueField data processing units (DPUs) provide a secure and accelerated infrastructure for any workload by offloading, accelerating, and isolating a broad range of advanced networking, storage, and security services.
NetQ helps you monitor your DPU inventory across the network. You can monitor a DPU OS, ASIC, CPU model, disk, and memory information to help manage upgrades, compliance, and other planning tasks. With NetQ, you can view and monitor key DPU attributes, including installed packages and CPU, disk, and memory utilization.
Figure 4. NetQ-DPU utilization details
In this post, you have seen an overview of some of the new capabilities available with NetQ 4.2.0. For more information, see the NetQ 4.2.0 User’s Guide and explore NetQ with NVIDIA Air.
I only need the TensorFlow Lite example model to detect cars and people, but it detects many more types of objects. Is there any way to make it detect just these two?
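One common approach is to keep the model as-is and filter its detections in post-processing, dropping anything whose label is not "person" or "car". A minimal sketch, assuming you already have parallel lists of boxes, class labels, and scores from the detector:
ALLOWED_LABELS = {"person", "car"}

def filter_detections(boxes, labels, scores, score_threshold=0.5):
    # Keep only detections whose label is in the allowed set and whose
    # confidence exceeds the threshold.
    kept = []
    for box, label, score in zip(boxes, labels, scores):
        if label in ALLOWED_LABELS and score >= score_threshold:
            kept.append((box, label, score))
    return kept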