I’m trying to implement the model from this Transformer article. In the source code, a number (n_heads) is passed in that sets how many attention heads to create; when the model is built, that many attention heads are created and saved to a list. Later, when the model is called, the attention heads are iterated over as follows: attn = [self.attn_heads[i](inputs) for i in range(self.n_heads)]. When running this code in graph mode, the following error is thrown: OperatorNotAllowedInGraphError: iterating over “tf.Tensor” is not allowed: AutoGraph did convert this function.
How should I go about creating an arbitrary number of copies of the same layer so that they can be iterated over in graph mode?
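For reference, here is a minimal sketch of the pattern described above, assuming tf.keras. AttentionHead, MultiHead, d_head and run are hypothetical names for illustration, not the article's code, and the "head" is just a Dense projection standing in for real attention. The sketch keeps n_heads as a plain Python int and the heads in a plain Python list, so the loop in call runs over Python objects and no tf.Tensor is iterated. Whether the article's code differs from this in a way that triggers the error is not something the sketch can confirm.

```python
import tensorflow as tf


class AttentionHead(tf.keras.layers.Layer):
    """Hypothetical stand-in for one attention head (a Dense projection only)."""

    def __init__(self, d_head, **kwargs):
        super().__init__(**kwargs)
        self.proj = tf.keras.layers.Dense(d_head)

    def call(self, inputs):
        return self.proj(inputs)


class MultiHead(tf.keras.layers.Layer):
    """Builds n_heads copies of the same layer and applies each to the input."""

    def __init__(self, n_heads, d_head, **kwargs):
        super().__init__(**kwargs)
        # Assumption: n_heads is known at construction time, so it can be
        # stored as a plain Python int rather than a tf.Tensor.
        self.n_heads = int(n_heads)
        # A Python list of layers assigned as an attribute.
        self.attn_heads = [AttentionHead(d_head) for _ in range(self.n_heads)]

    def call(self, inputs):
        # Iterates over the Python list of layers, not over a tensor.
        attn = [head(inputs) for head in self.attn_heads]
        return tf.concat(attn, axis=-1)


layer = MultiHead(n_heads=4, d_head=8)


@tf.function  # forces graph mode for the call below
def run(x):
    return layer(x)


out = run(tf.zeros([2, 5, 16]))
print(out.shape)  # (2, 5, 32): four heads of width 8, concatenated
```

If the error still appears with a structure like this, it may be worth checking whether self.n_heads or self.attn_heads holds a tf.Tensor by the time the model is called, for example because n_heads was derived from a tensor's shape.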