Categories
Misc

Attention pooling layer in Keras

I’m working with tf.keras on a Machine Learning project, and I’d like to implement an Attention pooling layer.

The equation which describes it is in Table I of this paper (it’s indicated at the last row of the table, at the column “Pooling function”).

The paper also says:

in the attention pooling function, the weights for each frame w_i are learned with a dedicated layer in the network.

After reading this Keras documentation page, I tried to implement the Attention pooling layer in tf.keras by subclassing the Keras Layer class, as follows:

from tensorflow.keras import backend as K
from tensorflow.python.keras.engine.base_layer import Layer
from tensorflow.python.keras.engine.input_spec import InputSpec


class AttentionPooling1D(Layer):
    def __init__(self, axis=0, **kwargs):
        super(AttentionPooling1D, self).__init__(**kwargs)
        self.axis = axis

    def build(self, input_shape):
        input_dim = input_shape[-1]
        self.w = self.add_weight(shape=(1, input_dim), name='w')

    def get_config(self):
        config = {'axis': self.axis}
        base_config = super(AttentionPooling1D, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

    def call(self, x, mask=None):
        product = x * self.w
        numerator = K.sum(product, axis=self.axis, keepdims=True)
        denominator = K.sum(x, axis=self.axis, keepdims=True)
        attention_output = numerator / denominator
        return attention_output

I don’t know whether it is correct, so I’m posting it here for feedback, especially on any errors or anything I’m missing.
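For comparison, here is a minimal sketch of one possible reading of attention pooling. It assumes the pooled output is a weighted average over frames, y = sum_i(w_i * x_i) / sum_i(w_i), with the weights w_i produced from the input by a dedicated Dense layer. The paper's Table I is not reproduced in this post, so the softmax normalization, the Dense scoring layer, and the choice of axis 1 as the time axis are all assumptions rather than the paper's exact method. One thing worth checking in the layer above: because self.w does not depend on the frame, sum(x * w) / sum(x) reduces to w itself, so the output would not depend on the input.

```python
import tensorflow as tf


class AttentionPooling1D(tf.keras.layers.Layer):
    """Weighted average over the time axis with learned per-frame weights."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.score = None  # created in build(), once the feature size is known

    def build(self, input_shape):
        # Dedicated layer that maps each frame to one score per feature
        # (an assumed design, not taken from the paper).
        self.score = tf.keras.layers.Dense(int(input_shape[-1]))
        self.score.build(input_shape)
        super().build(input_shape)

    def call(self, x):
        # x has shape (batch, time, features).
        # Softmax over time gives positive weights that sum to 1 per feature,
        # which is the same as dividing exp(score) by its sum over frames.
        w = tf.nn.softmax(self.score(x), axis=1)
        return tf.reduce_sum(w * x, axis=1)  # shape (batch, features)


layer = AttentionPooling1D()
pooled = layer(tf.ones((2, 5, 3)))
print(pooled.shape)
```

Because the weights sum to 1 over the time axis, pooling identical frames returns that frame unchanged, which is a quick sanity check for any implementation of a normalized weighted average.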

submitted by /u/RainbowRedditForum

Categories
Offsites

Separating Birdsong in the Wild for Classification

Birds are all around us, and just by listening, we can learn many things about our environment. Ecologists use birds to understand food systems and forest health — for example, if there are more woodpeckers in a forest, that means there’s a lot of dead wood. Because birds communicate and mark territory with songs and calls, it’s most efficient to identify them by ear. In fact, experts may identify up to 10x as many birds by ear as by sight.

In recent years, autonomous recording units (ARUs) have made it easy to capture thousands of hours of audio in forests that could be used to better understand ecosystems and identify critical habitat. However, manually reviewing the audio data is very time consuming, and experts in birdsong are rare. But an approach based on machine learning (ML) has the potential to greatly reduce the amount of expert review needed for understanding a habitat.

However, ML-based audio classification of bird species can be challenging for several reasons. For one, birds often sing over one another, especially during the “dawn chorus” when many birds are most active. Also, there aren’t clear recordings of individual birds to learn from — almost all of the available training data is recorded in noisy outdoor conditions, where other sounds from the wind, insects, and other environmental sources are often present. As a result, existing birdsong classification models struggle to identify quiet, distant and overlapping vocalizations. Additionally, some of the most common species often appear unlabeled in the background of training recordings for less common species, leading models to discount the common species. These difficult cases are very important for ecologists who want to identify endangered or invasive species using automated systems.

To address the general challenge of training ML models to automatically separate audio recordings without access to examples of isolated sounds, we recently proposed a new unsupervised method called mixture invariant training (MixIT) in our paper, “Unsupervised Sound Separation Using Mixture Invariant Training”. Moreover, in our new paper, “Improving Bird Classification with Unsupervised Sound Separation,” we use MixIT training to separate birdsong and improve species classification. We found that including the separated audio in the classification improves precision and classification quality on three independent soundscape datasets. We are also happy to announce the open-source release of the birdsong separation models on GitHub.

Birdsong Audio Separation
MixIT learns to separate single-channel recordings into multiple individual tracks, and can be trained entirely with noisy, real-world recordings. To train the separation model, we create a “mixture of mixtures” (MoM) by mixing together two real-world recordings. The separation model then learns to take the MoM apart into many channels to minimize a loss function that uses the two original real-world recordings as ground-truth references. The loss function uses these references to group the separated channels such that they can be mixed back together to recreate the two original real-world recordings. Since there’s no way to know how the different sounds in the MoM were grouped together in the original recordings, the separation model has no choice but to separate the individual sounds themselves, and thus learns to place each singing bird in a different output audio channel, also separate from wind and other background noise.
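The grouping step described above can be illustrated with a small NumPy sketch. It tries every assignment of separated channels to the two reference recordings and keeps the lowest reconstruction error. The function name `mixit_loss`, the brute-force search, and the use of mean squared error are assumptions made for illustration; the loss actually used in the paper may differ.

```python
import itertools

import numpy as np


def mixit_loss(separated, mix1, mix2):
    """Lowest reconstruction error over all channel-to-mixture assignments.

    separated: array of shape (num_sources, num_samples)
    mix1, mix2: the two reference recordings, each of shape (num_samples,)
    """
    best = np.inf
    num_sources = separated.shape[0]
    # Try every way of assigning each separated channel to mix1 (0) or mix2 (1).
    for assignment in itertools.product([0, 1], repeat=num_sources):
        mask = np.array(assignment, dtype=bool)
        est1 = separated[~mask].sum(axis=0)  # channels remixed into recording 1
        est2 = separated[mask].sum(axis=0)   # channels remixed into recording 2
        # Mean squared error stands in for the real loss (an assumption).
        err = np.mean((est1 - mix1) ** 2) + np.mean((est2 - mix2) ** 2)
        best = min(best, err)
    return best


rng = np.random.default_rng(0)
a, b, c = rng.standard_normal((3, 100))
mix1, mix2 = a + b, c                    # two "real-world" recordings
perfect = np.stack([a, b, c])            # each sound in its own channel
leaky = np.stack([a + c, b, np.zeros(100)])  # sounds from both recordings merged
print(mixit_loss(perfect, mix1, mix2), mixit_loss(leaky, mix1, mix2))
```

A channel that merges sounds from both recordings cannot be assigned to either reference without error, which is why the loss pushes the model toward one sound per channel.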

We trained a new MixIT separation model using birdsong recordings from Xeno-Canto and the Macaulay Library. We found that for separating birdsong, this new model outperformed a MixIT separation model trained on a large amount of general audio from the AudioSet dataset. We measure the quality of the separation by mixing two recordings together, applying separation, and then remixing the separated audio channels such that they reconstruct the original two recordings. We measure the signal-to-noise ratio (SNR) of the remixed audio relative to the original recordings. We found that the model trained specifically for birds achieved 6.1 decibels (dB) better SNR than the model trained on AudioSet (10.5 dB vs 4.4 dB). Subjectively, we also found many examples where the system worked incredibly well, separating very difficult to distinguish calls in real-world data.
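As a rough illustration of the metric, the SNR of a remixed estimate relative to a reference recording can be computed as below. The helper name `snr_db` and the toy signal are hypothetical, and the exact evaluation code used for the reported numbers is not shown in this post.

```python
import numpy as np


def snr_db(reference, estimate):
    """Signal-to-noise ratio in decibels of an estimate against a reference."""
    noise = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))


t = np.arange(8000) / 8000.0
reference = np.sin(2 * np.pi * 440.0 * t)
estimate = 0.9 * reference  # residual error has 10% of the reference amplitude
print(snr_db(reference, estimate))
```

Each 10 dB corresponds to a tenfold change in the ratio of signal power to error power, so a gap of about 6 dB is roughly a fourfold reduction in error power.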

The following videos demonstrate separation of birdsong from two different regions (Caples and the High Sierras). The videos show the mel-spectrogram of the mixed audio (a 2D image that shows the frequency content of the audio over time) and highlight the audio separated into different tracks.

High Sierras
  
Caples

Classifying Bird Species
To classify birds in real-world audio captured with ARUs, we first split the audio into five-second segments and then create a mel-spectrogram of each segment. We then train an EfficientNet classifier to identify bird species from the mel-spectrogram images, training on audio from Xeno-Canto and the Macaulay Library. We trained two separate classifiers, one for species in the Sierra Nevada mountains and one for upstate New York. Note that these classifiers are not trained on separated audio; that’s an area for future improvement.
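The segmentation step might look like the following sketch. The function name `split_into_segments`, the sample rate, and the choice to drop a trailing partial segment are assumptions; the mel-spectrogram and classifier steps are not shown.

```python
import numpy as np


def split_into_segments(audio, sample_rate, segment_seconds=5.0):
    """Split a 1-D waveform into fixed-length, non-overlapping segments."""
    seg_len = int(round(sample_rate * segment_seconds))
    num_segments = len(audio) // seg_len
    # Drop the trailing partial segment (padding it would be another option).
    return audio[: num_segments * seg_len].reshape(num_segments, seg_len)


sample_rate = 1000                     # hypothetical rate, kept small for the example
audio = np.zeros(12 * sample_rate)     # 12 seconds of audio
segments = split_into_segments(audio, sample_rate)
print(segments.shape)
```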

We also introduced some new techniques to improve classifier training. Taxonomic training asks the classifier to provide labels for each level of the species taxonomy (genus, family, and order), which allows the model to learn groupings of species before learning the sometimes-subtle differences between similar species. Taxonomic training also allows the model to benefit from expert information about the taxonomic relationships between different species. We also found that random low-pass filtering was helpful for simulating distant sounds during training: As an audio source gets further away, the high-frequency parts fade away before the low-frequency parts. This was particularly effective for identifying species from the High Sierras region, where birdsongs cover very long distances, unimpeded by trees.
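To make the random low-pass idea concrete, here is a hedged sketch that zeroes all frequencies above a randomly drawn cutoff. The brick-wall FFT filter, the function name `random_lowpass`, and the cutoff range are assumptions for illustration; the filter design used in training is not described in this post.

```python
import numpy as np


def random_lowpass(audio, sample_rate, min_cutoff_hz, max_cutoff_hz, rng):
    """Remove all content above a cutoff drawn uniformly from the given range."""
    cutoff = rng.uniform(min_cutoff_hz, max_cutoff_hz)
    spectrum = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sample_rate)
    spectrum[freqs > cutoff] = 0.0  # brick-wall filter (a simplification)
    return np.fft.irfft(spectrum, n=len(audio)), cutoff


sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
low = np.sin(2 * np.pi * 100.0 * t)    # low-frequency part of the "song"
high = np.sin(2 * np.pi * 3000.0 * t)  # high-frequency part that fades with distance
rng = np.random.default_rng(0)
filtered, cutoff = random_lowpass(low + high, sample_rate, 500.0, 1000.0, rng)
print(cutoff, np.max(np.abs(filtered - low)))
```

Drawing a new cutoff for each training example exposes the classifier to the same song at many simulated distances.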

Classifying Separated Audio
We found that separating audio with the new MixIT model before classification improved the classifier performance on three independent real-world datasets. The separation was particularly successful for identification of quiet and background birds, and in many cases helped with overlapping vocalizations as well.

Top: A mel-spectrogram of two birds, an American pipit (amepip) and gray-crowned rosy finch (gcrfin), from the Sierra Nevadas. The legend shows the log-probabilities for the two species given by the pre-trained classifiers. Higher values indicate more confidence, and values greater than -1.0 are usually correct classifications. Bottom: A mel-spectrogram for the automatically separated audio, with the classifier log probabilities from the separated channels. Note that the classifier only identifies the gcrfin once the audio is separated.
Top: A complex mixture with three vocalizations: A golden-crowned kinglet (gockin), mountain chickadee (mouchi), and Steller’s jay (stejay). Bottom: Separation into three channels, with classifier log probabilities for the three species. We see good visual separation of the Steller’s jay (shown by the distinct pink marks), even though the classifier isn’t sure what it is.

The separation model does have some potential limitations. Occasionally we observe over-separation, where a single song is broken into multiple channels, which can cause misclassifications. We also notice that when multiple birds are vocalizing, the most prominent song often gets a lower score after separation. This may be due to loss of environmental context or other artifacts introduced by separation that do not appear during classifier training. For now, we get the best results by running the classifier on the separated channels and the original audio, and taking the maximum score for each species. We expect that further work will allow us to reduce over-separation and find better ways to combine separation and classification. You can see and hear more examples of the full system at our GitHub repo.
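The score combination described above, taking the maximum per species over the original audio and every separated channel, can be sketched as follows. The function name `combine_scores` and the example log-probabilities are made up for illustration.

```python
import numpy as np


def combine_scores(original_scores, separated_scores):
    """Per-species maximum over the original audio and all separated channels.

    original_scores: shape (num_species,)
    separated_scores: shape (num_channels, num_species)
    """
    stacked = np.vstack([original_scores[None, :], separated_scores])
    return stacked.max(axis=0)


# Hypothetical log-probabilities for two species.
original = np.array([-0.5, -4.0])
separated = np.array([[-2.0, -0.8],
                      [-3.0, -5.0]])
combined = combine_scores(original, separated)
print(combined)
```

In this toy case the first species keeps its score from the original audio, while the second is only detected confidently in a separated channel.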

Future Directions
We are currently working with partners at the California Academy of Sciences to understand how habitat and species mix changes after prescribed fires and wildfires, applying these models to ARU audio collected over many years.

We also foresee many potential applications for the unsupervised separation models in ecology, beyond just birds. For example, the separated audio can be used to create better acoustic indices, which could measure ecosystem health by tracking the total activity of birds, insects, and amphibians without identifying particular species. Similar methods could also be adapted for use underwater to track coral reef health.

Acknowledgements
We would like to thank Mary Clapp, Jack Dumbacher, and Durrell Kapan from the California Academy of Sciences for providing extensive annotated soundscapes from the Sierra Nevadas. Stefan Kahl and Holger Klinck from the Cornell Lab of Ornithology provided soundscapes from Sapsucker Woods. Training data for both the separation and classification models came from Xeno-Canto and the Macaulay Library. Finally, we would like to thank Julie Cattiau, Lauren Harrell, Matt Harvey, and our co-author, John Hershey, from the Google Bioacoustics and Sound Separation teams.

Categories
Misc

Meta Works with NVIDIA to Build Massive AI Research Supercomputer

Meta Platforms gave a big thumbs up to NVIDIA, choosing our technologies for what it believes will be its most powerful research system to date. The AI Research SuperCluster (RSC), announced today, is already training new models to advance AI. Once fully deployed, Meta’s RSC is expected to be the largest customer installation of NVIDIA Read article >

The post Meta Works with NVIDIA to Build Massive AI Research Supercomputer appeared first on The Official NVIDIA Blog.

Categories
Misc

How the Intelligent Supply Chain Broke and AI Is Fixing It

Let’s face it, the global supply chain may not be the most scintillating subject matter. Yet in homes and businesses around the world, it’s quickly become the topic du jour: empty shelves; record price increases; clogged ports and sick truckers leading to disruptions near and far. The business of organizing resources to supply a product Read article >

The post How the Intelligent Supply Chain Broke and AI Is Fixing It appeared first on The Official NVIDIA Blog.

Categories
Misc

Brain Tumor Segmentation and Classification using ResUnet

submitted by /u/Sudo_Python

Categories
Misc

How do I solve this error?

I was doing an exercise from Google dev’s ML TensorFlow course, and I’m getting this error:

File “c:UsersshivaDocumentsAI_ML_TensorflowTensorflowEx2MNISTComputerVision.py“, line 24, in <module>

model.fit(x_train, y_train, epochs=5)

TypeError: Expected uint8, but got 1e-07 of type ‘float’.

———————————————————————————————————————————————

Here is the code:

# YOUR CODE SHOULD START HERE
# YOUR CODE SHOULD END HERE
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
# YOUR CODE SHOULD START HERE
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
# YOUR CODE SHOULD END HERE
model = tf.keras.models.Sequential([
# YOUR CODE SHOULD START HERE

# YOUR CODE SHOULD END HERE
])
model.compile(optimizer = tf.keras.optimizers.Adam(),
              loss = 'sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
# YOUR CODE SHOULD START HERE
# YOUR CODE SHOULD END HERE

———————————————————————————————————————————————

I don’t know what I should do. I checked my code but can’t find anything that might cause the error. I asked the same question on r/learnmachinelearning but got no response. Please help!
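One possible cause, offered as a reading of the listing rather than a confirmed diagnosis: the second model = tf.keras.models.Sequential([...]) contains no layers and overwrites the first model, so the uint8 images would pass straight through to the loss, which then tries to combine them with a float epsilon (1e-07). The sketch below keeps a single model definition and scales the pixels to floats. Random arrays stand in for MNIST so that it runs without a download; that substitution is an assumption made only for the example.

```python
import numpy as np
import tensorflow as tf

# Random stand-in data with MNIST's shapes and dtype (an assumption made so
# the example does not need to download the dataset).
rng = np.random.default_rng(0)
x_train = rng.integers(0, 256, size=(64, 28, 28), dtype=np.uint8)
y_train = rng.integers(0, 10, size=(64,), dtype=np.uint8)

# Scale pixels to floats in [0, 1] before training.
x_train = x_train.astype("float32") / 255.0

# Define the model once; do not overwrite it with an empty Sequential.
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax),
])
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
history = model.fit(x_train, y_train, epochs=1, verbose=0)
print(history.history["loss"])
```

With the real dataset, the same scaling would be applied to x_test before calling model.evaluate.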

submitted by /u/StarLan7

Categories
Misc

Any open source TensorFlow training orchestrator/dashboard?

Hello r/tensorflow. As a backend engineer, I am very unfamiliar with TensorFlow and ML in general, so please forgive me if this question seems unreasonable to you.

Because of my lab’s needs, I’ve been looking for a solution for TensorFlow orchestration. We have one server with a powerful GPU and several users who want to run their TensorFlow jobs on it. Instead of arranging schedules offline and having each user log in to the server individually, is there an open source project I can deploy on the server to act as an orchestrator?

For example, it would provide a simple web UI that lets users upload a job and all necessary files. The user then submits the job to a queue, and it runs when it reaches the front of the line. It would also report the progress and the result of the job.

I think there should be some kind of open-sourced project out there that fits this need, but I haven’t found it yet. So please help.

submitted by /u/xcsublime

Categories
Misc

How to disable exceptions encountered when calling Lambda layer?

I’m experimenting with some logic before creating a custom Keras layer, but my Lambda layer isn’t letting me check the output shape with model.summary(). It says:

ValueError: Exception encountered when calling layer "Lambda_1" (type Lambda).

The following Variables were created within a Lambda layer (Lambda_1)
but are not tracked by said layer:
  <tf.Variable 'Lambda_1/map/while/RGAT_1/edge_type_0/kernel:0' shape=(7, 10) dtype=float32>
  <tf.Variable 'Lambda_1/map/while/RGAT_1/edge_type_0/Edge_attention_parameters_0:0' shape=(5, 4) dtype=float32>
The layer cannot safely ensure proper Variable reuse across multiple
calls, and consquently this behavior is disallowed for safety. Lambda
layers are not well suited to stateful computation; instead, writing a
subclassed Layer is the recommend way to define layers with
Variables.

Is there a way to temporarily disable this behavior? 🤔
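The error message itself points to a subclassed Layer as the recommended route. A minimal sketch of that pattern follows, with a Dense layer standing in for the RGAT layer from the traceback. The class name TrackedWrapper and the Dense stand-in are hypothetical; the point is only that a sub-layer created in __init__ has its variables tracked, unlike one created inside a Lambda.

```python
import tensorflow as tf


class TrackedWrapper(tf.keras.layers.Layer):
    """Owns an inner layer so that Keras tracks and reuses its variables."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        # Creating the sub-layer here, rather than inside the function passed
        # to a Lambda layer, lets Keras track its variables across calls.
        self.inner = tf.keras.layers.Dense(units)  # stand-in for the real layer

    def call(self, x):
        return self.inner(x)


wrapper = TrackedWrapper(10)
out = wrapper(tf.ones((4, 7)))
print(out.shape, len(wrapper.trainable_weights))
```

Once the logic lives in a subclassed layer, it can be placed in a functional model and inspected with model.summary() like any built-in layer.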

submitted by /u/jorvan758

Categories
Misc

Could you help to combine input layers with a specific NamedTuple class?

Hello, I’ve been searching/reading for a fair amount of hours, but I’m pretty much stuck with this problem.

This is my code:

from typing import NamedTuple, Tuple

import tensorflow as tf
from keras import Model, layers


class MessagePassingInput(NamedTuple):
    node_embeddings: tf.Tensor
    adjacency_lists: Tuple[tf.Tensor, ...]


inputLayer_X = layers.Input(shape=tf.TensorShape(dims=(None, 7)), name="Input_X")
inputLayer_A1 = layers.Input(shape=tf.TensorShape(dims=(None, 2)), name="Input_A1", dtype=tf.int32)
inputLayer_A2 = layers.Input(shape=tf.TensorShape(dims=(None, 2)), name="Input_A2", dtype=tf.int32)
inputLayer_A3 = layers.Input(shape=tf.TensorShape(dims=(None, 2)), name="Input_A3", dtype=tf.int32)

I would like every entry in those inputs to end up in the next layer, more or less like this: newLayer = [MessagePassingInput(inputLayer_X[i], [inputLayer_A1[i], inputLayer_A2[i], inputLayer_A3[i]]) for i in range(len(inputLayer_X))]. However, I just haven’t been able to find out how. I have tried tf.map_fn and layers.Lambda, but wasn’t able to feed in all those input layers and apply the function in order.

If you could help me, I would be very grateful 🙏

submitted by /u/jorvan758

Categories
Misc

The Official Feedback and Discussion Thread

Here you can discuss anything that doesn’t require its own post

submitted by /u/TheNASAguy