Attention pooling layer in Keras

I’m working with tf.keras on a Machine Learning project, and I’d like to implement an Attention pooling layer.

The equation that describes it is in Table I of this paper (last row of the table, in the “Pooling function” column).

The paper also says:

in the attention pooling function, the weights for each frame w_i are learned with a dedicated layer in the network.
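
If I read Table I correctly (this is my own reading of it, not a quote from the paper), the attention pooling is a weighted average of the frame-level values x_i with learned weights w_i, roughly:

    y = \frac{\sum_i w_i x_i}{\sum_i w_i}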

I tried to implement the attention pooling layer in tf.keras by subclassing the Keras Layer class (after reading this Keras documentation page):

from tensorflow.keras import backend as K
from tensorflow.python.keras.engine.base_layer import Layer
from tensorflow.python.keras.engine.input_spec import InputSpec


class AttentionPooling1D(Layer):
    def __init__(self, axis=0, **kwargs):
        super(AttentionPooling1D, self).__init__(**kwargs)
        self.axis = axis

    def build(self, input_shape):
        # one learnable weight per feature channel
        input_dim = input_shape[-1]
        self.w = self.add_weight(shape=(1, input_dim), name='w')

    def get_config(self):
        config = {'axis': self.axis}
        base_config = super(AttentionPooling1D, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

    def call(self, x, mask=None):
        # weight each frame, sum over the pooling axis, and normalize
        product = x * self.w
        numerator = K.sum(product, axis=self.axis, keepdims=True)
        denominator = K.sum(x, axis=self.axis, keepdims=True)
        attention_output = numerator / denominator
        return attention_output
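
For reference, this is roughly how I intend to call the layer on inputs shaped (batch, frames, features); the shapes and axis=1 below are just an example I made up for testing, not values from the paper:

import tensorflow as tf

# hypothetical input: 8 clips, 100 frames, 64 features per frame
x = tf.random.normal((8, 100, 64))

# pool over the frame (time) axis
pool = AttentionPooling1D(axis=1)
y = pool(x)
print(y.shape)  # (8, 1, 64), since keepdims=True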

I don’t know whether it is correct, so I’m posting it here for feedback, especially on any errors and/or anything I’m missing.

submitted by /u/RainbowRedditForum
